Reliability Engineering Lead
Chicago, IL

About

Job Description

Are you energized by helping organizations protect their data and build client trust? Do you want to work in one of the world's largest holistic internal cybersecurity organizations? If you're interested in proactively preventing, detecting, and responding to cyber attacks across a complex global footprint, then Deloitte Global could be the perfect place for you. We're looking for an analytical thinker passionate about cybersecurity to join and support our team.

We are transforming our IT Operations on a global basis and looking for someone with Reliability Engineering leadership skills, and a software engineering background. Unlike anywhere else in the industry, we are creating roles and teams that combine deep software knowledge with operations to drive unmatched service reliability.

Our mission is to deliver services that matter and to achieve and sustain operational excellence. You will be at the heart of fulfilling our mission by bringing your software development experience to the table to own and advance our vision of engineering reliability end to end. You will design and implement continuous improvement of the management, design, and function of our operational environments to achieve the speed and reliability that enable business agility and happy users.

You will be part of our technology organization and have a great opportunity to work across various parts of Deloitte, including our development teams and other stakeholders, to drive reliability upstream in the application lifecycle and across our operational environments.

Technical expertise is critical in order to imagine and drive technical improvements across our database, networking, and infrastructure teams, and to partner with our application teams, implementing more robust and performant applications for our internal solutions and business solutions (Tax, Audit, Consulting, Finance and Advisory Services).

You should be someone who is excited by the challenge of bringing new thinking to operations, is passionate about imagining and implementing improvements, relentlessly pursues excellence, is a deep and broad technical expert, and can build trusting relationships across teams.

It's a new and exciting role to drive our organization further in world-class operations.

Work you'll do:

As part of the Global Cybersecurity team, this professional will hold broad responsibilities, working with customers to deliver technical assessments against a broad range of services. Illustrative duties include:

* Assume responsibility for delivery of the Server Protection Operations Model, ensuring it operates effectively, efficiently, and reliably, enabling effective operational issue resolution and escalation.


* Lead multiple teams of Reliability / DevOps engineers who automate & build release pipelines, infrastructure, cloud platforms and Operational tasks.


* Manage end to end availability, security, and performance of mission-critical services


* Provide leadership, architecture, development, and project management expertise so that our systems fail rarely and are fast to fix when they do fail


* Drive reliable systems engineering design and recovery by minimizing manual involvement and leading continuous improvements that create an operating environment with dynamic monitoring, alerting, and automated self-healing and recovery


* Identify and/or analyze problems relating to mission-critical services and manage the building of automation to prevent problem recurrence, with the goal of automating response to all non-exceptional service conditions.


* Engage in service capacity planning and demand forecasting, software performance analysis and system tuning.


* Oversee incident management and drive root cause analysis initiatives to identify continuous improvements


* Drive Operational Testing and Performance Engineering to certify solutions and provide critical thinking & recommendations to meet availability and performance targets


* Improve our monitoring, troubleshooting, and resolution capabilities


* Create clear presentations and communication to stakeholders that highlight the impact of the issues and solutions to service disruptions. Communicate the state of reliability to prioritize technical debt & improvements on technology team roadmaps. Equally capable at presenting analyses and recommendations to leadership or discussing the technical merits of solutions with engineers and architects.


* Lead, build, and grow a diverse team across geographies toward a common goal; partner with our application development and project management teams on coordinating investigations into customer facing service issues.


* Own the day-to-day health, uptime, monitoring, and reliability of services and server infrastructure


* Lead, own, model and drive DevOps culture and behaviors


* Practice and enforce Agile and Scrum methodologies


* Ensure user-visible uptime and quality, providing operational and development expertise so that our systems fail rarely and are fast to fix when they do fail


* Participate in architecture and design reviews to provide recommended improvements to the development teams to improve the reliability and performance of applications


* Minimize manual involvement by imagining & implementing continuous improvements, including the development of new tools, that create an operating environment with dynamic monitoring, alerting, & automated self-healing & recovery


* Identify and/or analyze problems relating to mission-critical services and implement automation to prevent problem recurrence, with the goal of automating response to all non-exceptional service conditions.


* Engage in application performance analysis and system tuning, and capacity planning


* Perform root cause analysis to identify & implement continuous improvements


* Capable of presenting analyses and recommendations to leadership or discussing the technical merits of solutions with engineers and architects.


* Practice Agile and Scrum methodologies



This Deloitte Global role requires limited to no travel.

What you'll be part of-our Deloitte Global culture:

At Deloitte, we expect results. Incredible, tangible results. And Deloitte Global professionals play a unique role in delivering those results. We reach across disciplines and borders to serve our global organization. We are the engine of Deloitte. We develop and lead global strategies and provide programs and services that unite our network.

In Deloitte Global, everyone has an opportunity to lead. We see the importance of your perspective and your ability to create value. We want you to fit in, with an inclusive culture, focus on work-life fit and well-being, and a supportive, connected environment; but we also want you to stand out, with opportunities to have a strategic impact, innovate, and take the risks necessary to make your mark.

Deloitte Global supports our talented professionals in answering the question: What impact will you make?

Who you'll work with:

The Deloitte Global Cybersecurity function is responsible for enhancing data protection, standardizing and securing critical infrastructure, and gaining cyber visibility through security operations centers. The Cybersecurity organization delivers a comprehensive set of security services to Deloitte's global network of firms.

This role is based in the Americas. Relocation assistance may be considered on a case-by-case basis.

How you'll grow:

Deloitte Global inspires leaders at every level. We believe in investing in you, helping you embrace leadership opportunities at every step of your career, and helping you identify and hone your unique strengths. We encourage you to grow by providing formal and informal development programs, coaching and mentoring, and on-the-job challenges. We want you to ask questions, take chances, and explore the possible.

Benefits you'll receive:

Deloitte's Total Rewards program reflects our continued commitment to lead from the front in everything we do - that's why we take pride in offering a comprehensive variety of programs and resources to support your health and well-being needs. We provide the benefits, competitive compensation, and recognition to help sustain your efforts in making an impact that matters.

To be considered for this role, there are certain qualifications you'll have to have. And others that would be really, really nice.

Required:

* Strong Software Engineering Experience


* Some knowledge of Azure Services, especially ARM templates


* Strong experience with Azure DevOps, TFS 2010+, VSTS, or similar ALM tool


* Strong experience with PowerShell


* Experience developing in a software development language (preferably C#/C++)


* Experience and knowledge of database technologies, particularly MS SQL


* Knowledge of virtualization and its benefits for improving reliability


* Strong experience with instrumentation, monitoring, alerting, and responding relative to performance and availability of applications


* Capable of technical deep dives into infrastructure, databases, and applications, specifically in designing, coding, operating, and supporting high-performance, highly available services and infrastructure


* Experience in designing for failure, including disaster recovery and business continuity planning


* Experience operating and supporting mission-critical applications (e.g. incident and outage management)


* Passionate about making things better and driving action with a sense of urgency


* Experience solving problems on globally distributed systems and critical product service environments


* Knows what is possible using the latest networking, infrastructure, database, and application technologies to drive automation and reliability improvements


* Brings new thinking to challenge existing technology and processes


* Excellent at building relationships across teams


* Firm sense of accountability and ownership


* Desire to understand our businesses and users


* Understanding of the concepts and principles behind DevOps, Continuous Delivery, Agile, Lean, etc.


* Use of DevOps tools to deliver and operate end-user services a plus (e.g., Chef, New Relic, Puppet, etc.)



Education:

* BS or higher degree in Computer Science/Engineering or related field



All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or protected veteran status, or any other legally protected basis, in accordance with applicable law.

Disclaimer: Nothing in this job description/posting shall constitute an offer or promise of employment. If you are not reviewing this job posting on our Careers' site (jobs2.deloitte.com) or one of our approved job boards we cannot guarantee the validity of this posting. For a list of our current postings, please visit us at jobs2.deloitte.com

Requisition code: DE19USAGTS006FF2166