Site Reliability Engineer Cockroach Labs
New York, NY

Scalable, Available & Transactional for Open Source

Job Description

Databases are the beating heart of every business in the world.

Cockroach Labs is the team behind CockroachDB, an open source, distributed SQL database. We aim to build infrastructure that keeps pace with the world, so developers can focus on what matters most: building the best products. Join us on our mission to Make Data Easy. Are you ready to aim high and build to last?

About the Role

CockroachDB provides the backbone for storing data at a global scale. The Site Reliability Engineer is responsible for managing the infrastructure for our cloud service offering. You will oversee our production system, ensuring that our services span several cloud providers as part of our hosted offering. You will also spend roughly half of your time doing greenfield development work, with an emphasis on tool development and driving automation.

You Will

* Manage the infrastructure for cloud services, including running internal production systems and hosting CockroachDB for our external customers.
* Design, write and deliver software and systems to increase product reliability and organizational efficiency.
* Develop custom tools as necessary.
* Keep a complex system running and solve problems relating to mission-critical services.
* Design, implement, operate, and troubleshoot the automation and monitoring of production clusters to maximize performance and availability.
* Drive the company through disaster recovery tests, where we manually turn down pieces of CockroachDB to test its overall resilience to failures.
* Participate in a weekly on-call rotation for our production systems and hosted services.

The Expectations

In your first 30 days, you will take over the operation of our existing internal and customer-facing production systems. Working with product and engineering, you will assess our production operations and build out runbooks for the operation of different systems. We believe that it's essential for you to take this first month to become familiar with our technology and our company.

After 3 months, you'll be fully integrated into the team. You will take full ownership of reliability, automation, and other issues related to CockroachDB's stability. You will identify new opportunities for automating processes, streamlining delivery, deploying new core functionality, and building great tools. You will help make CockroachDB more friendly by bringing your expertise to our database.

You Have

* Expertise in analyzing, monitoring, and troubleshooting large-scale distributed systems.
* Experience in software development using one or more of the following: Go, C, C++, Python, Java.
* Proficiency working with algorithms, data structures, and production troubleshooting.
* Expertise in working with major cloud providers such as AWS, Azure, and GCP, and with cloud APIs.
* Experience debugging and optimizing code and automating routine tasks.
* Working knowledge of web and network protocols and standards (HTTP, TLS, DNS, etc.)
* Previous on-call experience, with a sense of urgency.

The Team

You will have the opportunity to report to a member of our engineering leadership team based on the project you work on at Cockroach Labs.

Peter Mattis - Co-founder & Chief Technology Officer

Peter works on a bit of everything, from low-level optimization of code to refining the overall design. He was thrust into file systems early in his career at Inktomi and then learned the true meaning of scalability while working on Gmail and Colossus at Google. Before stepping into the office in the morning, he will have nursed his CrossFit addiction and dealt with the chaos of a three-kid morning routine. You can set your watch by his daily departure at 4:30 pm to have dinner with his family.

Reporting to Kendra Curtis - Engineering Manager

Kendra has 20+ years of experience at all levels of the software stack, from her early days writing firmware for wireless networking products at Wi-LAN to managing teams of developers building web applications at Google. Kendra worked on Google's early data centers and was a member of the management team responsible for the integration of DoubleClick (DCLK) into Google. She was a co-founder and CEO of Scout It Out, a listing service for rehearsal spaces. Kendra joined Cockroach Labs because she loves building great teams and values work-life balance. Outside of work she enjoys skiing, acting, and walking in the park with her dog, Lady.

Our Benefits

* 100% health insurance coverage (for you and your dependents!)
* Paid parental leave (with baby bucks)
* Flex Fridays
* Flexible time off & flexible hours
* Education reimbursement
* Relocation support

Cockroach Labs is proud to be an Equal Opportunity Employer building a diverse and inclusive workforce. If you need additional accommodations to feel comfortable during your interview process, please email us at accessibility@cockroachlabs.com.

About Cockroach Labs

Size: 86 employees
Headquarters: Cockroach Labs, 110 5th Ave, 5th Floor
