* BS or MS degree in computer science or a related field
* Experience with DevOps processes and culture
* Experience in establishing error budgets
* Experience with DevOps tools (e.g. Jenkins, GoCD, CircleCI, GitHub, Ansible, Puppet, Chef, Terraform)
* Minimum 5 years of experience in software engineering (Java, Ruby, Node.js, and others)
* Minimum 1 year in a reliability and performance engineering role
* Comfortable writing code to the same code quality and engineering standards expected of core application developers (test-driven development, integration, security, and acceptance testing)
* Understanding of web and database technologies, and of the concepts and design elements of on-premises, cloud-based, and hybrid architectures
* Strong understanding of presentation tiers (Apache, Tomcat, Nginx, IIS)
* Experience with containerized environments (Docker, DC/OS, Kubernetes, Docker Swarm)
* Experience with at least one of the major cloud IaaS providers (AWS, Azure, Google Cloud)
* Strong hands-on skills with server operating systems and environments (Linux, Windows)
* Understanding of standard networking protocols and components such as HTTP, DNS, TCP/IP, ICMP and Load Balancing
* Knowledge of (shell) scripting languages (Python, PowerShell, Perl, etc.)
* Working experience with monitoring and data analysis tools (Splunk, Nagios, Prometheus, AppDynamics, etc.)
Who You'll Work With
The Technical Lead - Application Site Reliability Engineer (SRE) has a strong software development background and a real passion for application reliability, scalability and supportability. You should also have sound knowledge of developing, testing, monitoring and deploying applications in containers using a variety of container orchestration platforms (e.g. Kubernetes, Marathon). You should have contributed to open source projects, participated in technical communities and interest groups, or presented your working experience of DevOps culture and related technical concepts in other public forums.
You will work closely with product teams to guide and enable them in setting up robust CI/CD pipelines, and to ensure they incorporate reliability, testability and supportability at every stage of the product development lifecycle. As an App SRE, you should work passionately towards advancing DevOps principles such as automation, TDD and secure coding practices. You will also develop the tooling needed to address the supportability needs of our product teams. You should be passionate about good engineering practices such as test-driven development, automated testing and deployment, and continuous integration. At the same time, a successful candidate should be a good mentor: you should inspire your peers and more junior team members to learn and expand their skill sets, guiding them on SRE and DevOps culture and principles. As a Technical Lead in the Application Development team, you will embrace a non-hierarchical culture of trust and transparency and should be comfortable challenging the status quo.
What You'll Do
You will work with various product teams using a variety of technologies to ensure high application reliability, scalability and availability.
In this role you will manage multiple complex projects and deadlines simultaneously. You will focus heavily on an automation and orchestration strategy for existing and new McKinsey applications/products. You will also pair with platform Site Reliability Engineering to build the necessary infrastructure-as-code components.
You will help troubleshoot software application issues and identify the modifications needed in supported applications to ensure high reliability and availability. Additionally, you will assist Application Product teams with integrating standard DevOps Foundation Products and help advance the definition of the Site Reliability Engineering role in general.
McKinsey & Company is an equal opportunity employer.
McKinsey & Company is a management consulting firm serving commercial, government, and not-for-profit organizations.