Sr Site Reliability Engineer
New York, NY

Job Description

Developer Productivity Engineering is a distributed team that owns the internal tools used to deploy the services that make up Disney Streaming Services' products. Built for AWS with a variety of open source software, our tools are used by dozens of engineering teams across the company. We strive to act as a productivity multiplier by offering our customers rich primitives for delivering their services, allowing them to focus more on product.

Site Reliability Engineers fulfill a cross-functional role by driving the delivery of services through to production. Within Developer Productivity, you will help design and operate services to support exponential growth in Disney+ and ESPN+. You'll also collaborate with other engineers to pave the way for the future of infrastructure in AWS, moving beyond traditional practices. You should have a passion for systems engineering, monitoring and observability, and automation.

This position can be performed remotely or from our locations in NYC and SF.

Responsibilities

* Maintain and improve the reliability and operability of services
* Design systems to enable rapid development, high availability, and clear observability
* Write tools and leverage open source software to automate tasks, with an emphasis on safety and repeatability
* Troubleshoot and resolve performance and reliability issues across the stack, including cloud resources
* Collaborate with engineers to ensure services are designed to be cloud-native, scalable, and easily operated

Requirements

* BS or MS degree in Computer Science, or equivalent experience
* 3+ years of experience writing software on, or operating, *nix platforms
* You're a self-learner, independent, and have excellent problem-solving skills
* You care deeply about code craftsmanship and operational excellence
* You have strong written and verbal communication skills

Nice to have, but not required

* Experience with software containers (e.g. Docker, rkt, runC) and schedulers (e.g. ECS, Kubernetes, Nomad)
* You've directly impacted the reliability and availability of large-scale distributed systems
* Deep understanding of networking, especially routing and the IP stack
* You've deployed and operated geographically distributed, redundant services
* Engagement with open source communities

Technologies we love

* Languages: Go, Ruby, Bash
* Tools: Ansible, Docker, Git, Graphite, GraphQL, Jenkins, Logstash, Packer, Sensu
* Data stores: DynamoDB, Elasticsearch, PostgreSQL, Redis

Job Type

Full Time

Segment

Direct-to-Consumer and International

Category

Technology

Business

Disney Streaming Services

Postal Code

10011
