JPMC are looking to develop a core set of data management capabilities to drive consistency across each line of business. This data platform will be deployed on premises and, longer term, in the public cloud. The initial focus is on sourcing, storing, enriching and making available information to support internal management reporting, external regulatory reporting, and machine learning and other data analysis applications.
We are seeking an experienced software engineering lead for our global Site Reliability Engineering (SRE) team, which supports our Big Data platform. This individual will be expected to lead a team of software engineers who will grow into subject matter experts, work with functional application development teams, and partner with infrastructure engineers and production support analysts to determine requirements for designing and developing automation, SDLC, and development environment testing and integration tools. The toolsets developed must meet the rigour of JPMC's cyber security standards.
The SRE team runs, maintains and improves the Big Data platform against established Service Level Objectives by applying software engineering practices. It is responsible for the availability, performance, change management, monitoring, and capacity management of its services, with special emphasis on automating the processes and workloads that support them. The SRE team is also responsible for the operational support of the Big Data infrastructure, with emphasis on feeding outage, issue and incident data into a design and SDLC feedback loop to maximise automation and outage avoidance.
Key responsibilities in this role would include:
* Engage with development teams throughout the life cycle of an incident, ensuring lessons learned are translated into automated or process-adjusted responses that help develop software for reliability and scale with minimal refactoring or changes
* Code, test and deliver software to automate manual operational work
* Troubleshoot incidents, participate in blameless post-incident evaluations and ensure permanent closure of incidents
* Identify application patterns and analytics in support of better service level objectives
* Analyze self-healing and resiliency patterns and contribute to software which can use these outcomes
* Implement best-in-class monitoring frameworks to accomplish end-to-end flow monitoring and noiseless alerting
Key Qualifications include:
About JPMorgan Chase & Co.
JPMorgan Chase is a financial services provider that offers investment banking, asset management, treasury, and other services.