Site Reliability Engineering Manager
Newark, CA

Job Description

RMS is seeking a Site Reliability Engineering Manager to lead a US-based SRE team. You will play a critical and visible role in delivering and supporting our next-generation platform.

We are building a new platform that:

* is a highly scalable, cloud-based SaaS offering that helps our clients understand and manage risk
* is based on Linux, Java, and open source technologies, and leverages the latest advances in database tools, vector processing, hardware-based acceleration techniques, and geographic visualization tools
* utilizes a unique Big Data approach that scales to massive sizes over time, along with large-scale distributed data processing technology

About you:

You are driven by professional curiosity and a desire to develop a deep understanding of the services and the technologies they depend upon.

You are passionate about automation and can demonstrate practical knowledge of various aspects of distributed service design, such as messaging protocols, caching strategies, persistence technologies, and queuing.

You can understand and explain how product architecture decisions affect a product's ability to run as a distributed system.

You are deeply technical and have the ability to dig in and get your hands dirty when needed.

About the Role:

* You will manage a kickass team of SREs, providing vision and leadership
* Foster the adoption of software and systems engineering approaches within the team and mentor junior SREs in their growth into mature SREs
* Manage the work and priorities of the team to reduce toil and establish a healthy balance between toil and development work
* Partner with our extremely talented development teams to help them build reliable and scalable services, and resolve any production issues as quickly as possible.
* Champion service reliability, observability, and supportability.
* Manage incident response/resolution, service restoration, and incident prevention.
* Identify gaps in processes, skills, tooling, and technology choices, and work with upper management to drive improvements within the organization.
* Lead by example, care for your team, and establish credibility with the quality of your and your team's technical execution.
* Stay abreast of the latest SRE methodologies, and skillfully adopt the appropriate ones
* Be a change agent, with the ability to skillfully and strategically implement the SRE vision
* Recruit, mentor, retain, and grow top-notch talent

Requirements

* 6+ years of related experience in a hands-on technical role such as SRE, Systems Administrator and/or Development Engineering
* 2+ years of experience leading and managing a team of engineers
* Knowledge of cloud computing patterns
* Experience supporting and deploying platforms/cloud applications on AWS
* Experience with Container and Container Management technologies, such as Docker and Kubernetes
* Good understanding of microservices concepts/architecture and design patterns
* Experience with Big Data and analytics technologies
* Experience coding in Java, Python, and shell
* DevOps skills with Jenkins, Terraform, Ansible
* Experience and knowledge of systems monitoring and logging
* Experience working with developers to instrument applications
* Knowledge of Infrastructure Security and compliance
* Familiarity with ITIL-based incident, problem, and change management

About RMS:

There's a 5% chance that a hurricane will cause $60 billion of insured losses next year and a 1% chance an earthquake will cause $50 billion of insured loss in the next 12 months. At RMS, we build the simulation models that allow insurers and investors to understand portfolio risks due to catastrophes: natural catastrophes (hurricane, earthquake, flood), terrorism, pandemic, and changes in life expectancy.

We are one of the most exciting companies you've probably 'never' heard of unless you're one of our hundreds of clients in the (re)insurance, banking or hedge fund sector. We lead an industry we helped pioneer and ultimately our work makes a true impact on the world at large. How we understand and manage risk affects everybody and our passion is nothing less than creating a more resilient world through a better understanding of catastrophic events.

We are evolving our vision by delivering future solutions in the cloud, our cutting-edge risk management platform for the global risk market. RMS will create a holistic and integrated view across the enterprise with one platform for all models, all points of view, and all data. All will be run as equal partners on RMS.

RMS has 1,200 employees in 11 countries, including offices in Newark (CA-USA), Noida (India), London (UK), Hoboken (NJ-USA), and Zurich (Switzerland).

To find out more, visit www.rms.com or follow us on Facebook, LinkedIn or @rmsjobs on Twitter.

RMS is proud to be an equal opportunity workplace and is an affirmative action employer. We are committed to equal employment opportunity without regard to race, color, creed, gender, religion, marital status, registered domestic partner status, age, national origin or ancestry, physical or mental disability, genetic characteristics, sexual orientation, or any other classification protected by applicable local, state, or federal law.

RMS is enrolled in E-Verify® and will be participating in E-Verify in addition to our Form I-9 process. www.dhs.gov/E-Verify.

To all recruitment agencies: RMS does not accept unsolicited agency resumes and will not be responsible for the payment of placement fees related to unsolicited resumes submitted to open positions, job aliases, or to our employees.
