Sr. Site Reliability Engineer
Boston, MA

Job Description

Title: Sr. Site Reliability Engineer

Location: Boston, MA

Duration: 6 months+

Industry: Educational (Big 3)

Essential Accountabilities:

* Hands-on design, analysis and troubleshooting of highly distributed large-scale production systems
* Ownership of reliability, uptime, capacity, and performance analysis thereof
* Ensuring the repeatability, traceability, and transparency of our infrastructure automation
* Identifying highest-impact opportunities to optimize existing systems
* System design consulting for teams seeking to leverage or improve their production infrastructure
* Anticipating, building and planning capacity for upcoming product/feature launches

Required Skills:

* Mastery of AWS services (IAM, EC2, S3, EBS/EFS, ELB/ALB, AutoScaling, RDS and replication techniques, VPC, Subnets, Elastic IP, Route53, CloudWatch, CloudFront, Lambda, CloudFormation, ECS, SNS, ElastiCache)
* Expertise in container/container-fleet-orchestration technologies (like Docker, Kubernetes, AWS ECS)
* Expertise in designing and managing escalation response plans, from monitoring through react, respond, remediate and retrospect, in culturally aligned ways (proactive, customer-focused, collaborative, data-driven and AUTOMATED)
* Mastery of infrastructure build and configuration automation technologies (like Terraform, Ansible, Puppet, CodeDeploy, Chef)
* Strong skills in reading, understanding and writing code in at least two of: JavaScript, Python, PHP, Go, or Ruby
* Strong network engineering skills
* Cloud- and container-native Linux administration/build/management skills (AWS AMIs, Packer, etc.)
* Significant experience troubleshooting concurrent and distributed system interactions
* Expertise with continuous-deployment software development lifecycles in the Cloud (e.g. CI/CD)
* Cloud database operations and deployment experience (RDS MySQL/Postgres/Aurora), caching operations & deployments (Memcache, Redis)
* Expertise with Lean/Agile deployment processes (ZDT: Blue/Green, Canary, DNS strategies)
* Familiarity with site and infrastructure monitoring systems (CloudWatch, Datadog, New Relic, Sumo Logic, ThousandEyes)
* Strong problem-solving, root cause analysis and systems engineering skills; good presentation and communication skills
* Expertise with SDLC branching, SCM, and code deployment systems (Git/Gitflow, Jenkins, CircleCI, etc.)
* BS Degree in Computer Science (or related technical field and/or equivalent industry experience).
