Software Engineer - Data Reliability Datadog
New York, NY

Datadog is a company developing a monitoring and analytics platform for developers and IT operations teams.


Job Description

The company:

We're on a mission to build the best platform in the world for engineers to understand and scale their systems, applications, and teams. We operate at high scale (trillions of data points per day), providing always-on alerting, metrics visualization, logs, and application tracing for tens of thousands of companies. Our engineering culture values pragmatism, honesty, and simplicity to solve hard problems the right way.

The team:

How do you provide data to a real-time service that monitors hundreds of thousands of servers 24 hours a day?

How do you ensure data correctness in the face of infrastructure failures and network partitions, in a high-volume, low-latency environment?

What should the infrastructure look like when data may double in size or in throughput on short notice? If you think you have the answers, join us as a Data Reliability Engineer (DRE) and help us ensure that we're providing reliable data, at scale, and quickly.

You will:

* Keep our datastores reliable, available, and fast.
* Respond to, investigate, and fix issues, whether they're deep in the database code or in the client application.
* Build tooling to minimize customer-facing downtime and to scale up resources on short notice.
* Protect and ensure the consistency of customer data.
* Work with developers to design data models and choose the right datastores to support orders of magnitude more customer data and traffic.


Requirements:

* You have a BS/MS/PhD in a scientific field or equivalent experience
* You have a track record as an engineer in the operations of a large site
* You value correctness and efficiency; you leave no stone unturned when diagnosing production issues
* You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems
* You have production experience with distributed datastores, e.g. ZooKeeper, Cassandra, Postgres, Kafka, Elasticsearch, Redis

Bonus Points

* You have created tooling for, or submitted contributions to, an open-source datastore
* You are fully fluent in Python, Ruby, or Go

About Datadog

5000 employees

620 8th Avenue
