As a not-for-profit organization, Partners HealthCare is committed to supporting patient care, research, teaching, and service to the community by leading innovation across our system. Founded by Brigham and Women's Hospital and Massachusetts General Hospital, Partners HealthCare supports a complete continuum of care including community and specialty hospitals, a managed care organization, a physician network, community health centers, home care and other health-related entities. Several of our hospitals are teaching affiliates of Harvard Medical School, and our system is a national leader in biomedical research.
We're focused on a people-first culture for our system's patients and our professional family. That's why we provide our employees with more ways to achieve their potential. Partners HealthCare is committed to aligning our employees' personal aspirations with projects that match their capabilities and creating a culture that empowers our managers to become trusted mentors. We support each member of our team to own their personal development, and we recognize success at every step.
Our employees use the Partners HealthCare values to govern decisions, actions and behaviors. These values guide how we get our work done: Patients, Affordability, Accountability & Service Commitment, Decisiveness, Innovation & Thoughtful Risk; and how we treat each other: Diversity & Inclusion, Integrity & Respect, Learning, Continuous Improvement & Personal Growth, Teamwork & Collaboration.
* We are looking for a self-motivated Big Data Architect to join our data engineering team.
* We are looking for an accomplished Big Data Architect with strong experience in the Hadoop Data lake architecture and implementation.
* This role will involve a close collaboration with our team of passionate and innovative big data specialists, application developers and product managers.
* This is a unique opportunity to be a Lead Architect in our Corporate Data Engineering Team, tackling our toughest and most exciting data lake challenges across multiple divisions in the firm.
* Design, develop, construct, test, and maintain architectures such as data lakes and large-scale data processing systems
* Select tools in the big data ecosystem and perform proof-of-concept (POC) analysis
* Gather and process raw data at scale to meet functional and non-functional business requirements (including writing scripts, REST API calls, SQL queries, etc.)
* Develop data set processes for data modeling, mining and production
* Integrate new data management technologies (e.g., Collibra, Informatica DQ) and software engineering tools into existing structures
* The candidate will be responsible for leading the architecture of a new data lake, expanding and optimizing our data platform and data pipeline architecture, and optimizing data flow and collection for cross-functional teams.
* The ideal candidate is an experienced data pipeline builder who enjoys optimizing data systems and building them from the ground up.
* The individual will lead solution implementation, collaborating with our Software Developers, Database Architects, Data Analysts, and Data Scientists on big data initiatives, and will ensure a consistent, optimal data delivery architecture across ongoing projects.
* They must be self-directed and comfortable leading the architecture needs of big data ecosystem and analytics solutions with multiple teams, systems, and products.
* Build the Hadoop infrastructure required for optimal extraction, transformation, and loading of data from traditional/legacy data sources.
* Work with stakeholders including the Management team, Product owners, and Architecture teams to assist with data-related technical issues and support their data infrastructure needs.
* Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
* 5-7 Years of experience architecting and building Data Lake, Enterprise Analytics Solutions, and optimizing 'big data' data pipelines, architectures and data sets.
* 5+ years of experience leading Scrum teams, architects, and technology professionals
* Advanced hands-on SQL knowledge and experience working with relational databases for data querying and retrieval.
* Experience with Design and Architecture of big data frameworks/tools: Hadoop, Kafka, Spark, etc.
* Experience with Design and Architecture of relational SQL and NoSQL databases, including MS SQL Server, Hive, HBase.
* Experience with Design and Architecture of data security.
* Experience with building processes supporting data transformation, data structures, metadata, dependency and workload management.
* Experience leading and working with cross-functional teams in a dynamic environment.
* Experience building big data pipelines with Java and/or Python a plus.
* 5-7 years of experience with Hadoop-based technologies (e.g., HDFS, Spark); Spark experience is a must
* Strong SQL skills on multiple platforms (MPP systems preferred)
* Leading development of Data Lake Architectures from scratch
* Data Modeling tools (e.g. Erwin, Visio)
* 5+ years of programming experience in Python and/or Java
* Experience with continuous integration and deployment
Skills Required
* Expertise in the Hadoop Data Lake and relational Data Warehouse platforms
* Demonstrated experience in Hadoop big data technologies (Cloudera, Hortonworks), Data Lake development
* Experience with real time data processing and analytics products
* Experience with at least one cloud-based data technology (AWS, Azure, or GCP)
* Cloudera or Hortonworks certification preferred
* Cloud certification would be preferred
* Experience managing engineering professionals
* Experience with large data warehousing environments on at least two database platforms (Oracle, SQL Server, DB2, etc.)
* Programming experience in Python, Java, and SQL; .NET and C# a plus
* ETL and data processing expertise in Hadoop (MapReduce, Spark, Sqoop) as well as SSIS, HealthCatalyst, and Informatica
* Familiarity with data governance and data quality principles; experience with data quality tools a plus
* Ability to independently troubleshoot and performance-tune large-scale data lake and enterprise systems
* Knowledge of data architecture principles, data warehousing, agile development, DevOps methodologies
* Understanding of change management techniques, and the ability to apply them
* Excellent verbal and written communication skills, problem solving and negotiation skills
* Act as an effective, collaborative team member
* Office setting, with some local travel between Partners HealthCare System sites
* May require occasional travel for training
* Strong Unix/Linux skills
* Experience in petabyte scale data environments and integration of data from multiple diverse sources
* Cloud computing (Azure, AWS); machine learning, text analysis, NLP, and web development experience is a plus
* Healthcare experience, most notably in Clinical data, Epic, Payer data and reference data is a plus but not mandatory