Site Reliability Engineer - Big Data Platforms
Req #: 190025947
Location: Plano, TX, US
Job Category: Technology
SRE L2 Engineering - Associate
JPMC is looking to develop a core set of data management capabilities to drive consistency across each line of business. This data platform will be deployed on premises and, longer term, in the public cloud. The initial focus is on sourcing, storing, enriching and making information available to support internal management reporting, external regulatory reporting, as well as machine learning and other data analysis applications.
We are seeking an experienced software engineering lead for our global Site Reliability Engineering (SRE) team supporting our Big Data platform. This individual will be expected to lead a team of software engineers who will grow into subject matter experts, work with functional application development teams, and partner with infrastructure engineers and production support analysts to determine requirements for designing and developing automation, SDLC and development environment testing & integration tools. The toolsets developed must pass the rigour of JPMC's cyber security standards.
The SRE team runs, maintains and improves the Big Data Platform against established Service Level Objectives by applying software engineering practices. It is responsible for the availability, performance, change management, monitoring, and capacity management of its services, with special emphasis placed on automating the processes and workload that support them. The SRE team is also responsible for the operational support of the Big Data infrastructure, with emphasis placed on the ability to submit outage/issue/incident data into a design and SDLC feedback loop to ensure maximum automation and outage avoidance.
Key responsibilities in this role would include:
* Engage with development teams throughout the incident life cycle, and ensure lessons learned are translated into automated responses or process adjustments that help develop software for reliability and scale with minimal refactoring or changes
* Code, test and deliver software to automate manual operational work
* Troubleshoot incidents, participate in blameless post-incident evaluations and ensure permanent closure of incidents
* Identify application patterns and analytics in support of better service level objectives
* Analyze self-healing and resiliency patterns and contribute to software which can use these outcomes
* Implement best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
Key Qualifications include:
* Bachelor's Degree in Computer Science, Engineering or Business
* Strong knowledge and experience in DevOps and Agile teams
* Strong knowledge and experience across multiple platforms, including Cloud architecture
* Knowledge of or experience in Hadoop environment administration, release deployments to HBase, supervising Hadoop jobs and performing cluster coordination services is preferred
* Knowledge of Unix/Linux administration, Unix scripts and platform level orchestration scripting. Should be knowledgeable about automating the build and deployment process.
* Knowledge in Python
* Knowledge of DB technologies (Oracle, MS SQL DB, Sybase, etc)
* Familiarity with the Control-M and AutoSys job schedulers
* Knowledge and experience in Web based applications / architecture (Certificates, IIS, Web Services)
* Knowledge of Git, Bitbucket, Jenkins, Sonar, Splunk, Maven, AIM and Continuous Delivery tools
* Knowledge of Load balancing, IP, DNS
* Knowledge of cloud (private cloud, public cloud, etc.); working experience with cloud environments such as AWS is a plus
* Ability to work directly with AD, Business and Operators
* Excellent written and oral communication skills, appropriately scaled for technical or business audiences
* Excellent interpersonal skills, team player
* Strong analysis, research, investigation and evaluation skills, with a structured approach to problem solving
* Ability to work and effectively prioritize in a highly dynamic work environment that includes a global focus
About JPMorgan Chase
JPMorgan Chase is a financial services provider that offers investment banking, asset management, treasury, and other services.