Director - Site Reliability Engineering
Req #: 190034942
Location: Jersey City, NJ, US
Job Category: Technology
Our Global Technology Infrastructure group is a team of innovators who love technology as much as you do. Together, you'll use a disciplined, innovative and business-focused approach to develop a wide variety of high-quality products and solutions. You'll work in a stable, resilient and secure operating environment where you, and the products you deliver, will thrive.
As Director - Site Reliability Engineering, you will be charged with leading a team that develops and builds solutions to support our internal DevOps teams. Key responsibilities of the SRE team are:
* Develop monitoring and telemetry solutions to aid DevOps teams
* Provide oversight for the creation and maintenance of Service Level Objectives (SLOs), root cause analysis, stakeholder management and communication
* Identify key business flows
* Agree upon availability calculations for the business flows
* Ensure ability to measure/calculate availability via automation
* Facilitate the definition of SLOs for flows
* Assist teams in the efficient, sustainable achievement of SLOs
This role requires a wide variety of strengths and capabilities, including:
* BS/BA degree or higher, or equivalent experience
* Ability to provide technical expertise throughout the software lifecycle, including design, implementation and delivery
* Understanding of cloud, virtualization, APIs, and modern software languages
* Software development experience in one or more general-purpose programming languages or frameworks, such as Python, Java, C, C++, Go, or AngularJS
* Experience developing frameworks that help increase developer and release velocity and improve code health and technical standards
* Understanding of, or experience with, agile and lean philosophies
* Ability to collaborate with different roles to achieve common goals
* Experience with one or more cloud platforms such as Cloud Foundry, Mesosphere, Kubernetes, AWS, GCP, or Azure
* Hands-on experience with cloud deployment, monitoring, and ops analysis tools such as Kubernetes, Prometheus, Elasticsearch, Grafana, Kibana, Splunk, and Dynatrace
* Experience building event monitoring solutions using tools such as Fluentd and Kafka
* Understanding of network and cloud technologies, e.g. security, load balancing, and network routing protocols
* Experience working with Architecture teams to design reusable patterns that can be deployed across applications
* Ability to provide governance around adoption and to influence software engineering teams on roadmaps and designs
When you work at JPMorgan Chase & Co., you're not just working at a global financial institution. You're an integral part of one of the world's biggest tech companies. In 14 technology hubs worldwide, our team of 40,000+ technologists design, build and deploy everything from enterprise technology initiatives to big data and mobile solutions, as well as innovations in electronic payments, cybersecurity, machine learning, and cloud development. Our $9.5B+ annual investment in technology enables us to hire people to create innovative solutions that will not only transform the financial services industry, but also change the world.
At JPMorgan Chase & Co. we value the unique skills of every employee, and we're building a technology organization that thrives on diversity. We encourage professional growth and career development, and offer competitive benefits and compensation. If you're looking to build your career as part of a global technology team tackling big challenges that impact the lives of people and companies all around the world, we want to meet you.
About JPMorgan Chase
JPMorgan Chase is a financial services provider that offers investment banking, asset management, treasury, and other services.