Pinterest is looking for an experienced site reliability engineer to build and run our large-scale distributed systems. As an SRE on the Data & Storage team, you will design, build, and monitor the applications and infrastructure that handle billions of monthly page views and petabytes of data as Pinterest continues to grow.
What You'll Do:
* Design, build, and operate across a large-scale data and storage technology stack
* Develop software solutions to enable operability of large-scale distributed systems handling petabytes of data
* Manage capacity and performance to help scale our infrastructure on both public and private clouds around the world
What We're Looking For:
* Strong knowledge of Linux/Unix/BSD internals and experience working with open source software (e.g. MySQL, Hadoop, Envoy, HAProxy, Nginx)
* Experience with technologies such as Elasticsearch, ZooKeeper, HBase, Hadoop, Memcache, and Kafka, with a focus on reliability, automation, operability, and performance
* 2+ years of experience with programming languages (Python, Golang, Ruby, etc.)
* Experience with infrastructure as code is a plus (e.g. Terraform, Puppet, Chef, Ansible, Salt, Fabric, Docker, etc.)
* Bonus points for experience deploying web apps to cloud infrastructure (AWS, etc.) and working with distributed, service-oriented architecture
Pinterest is a visual bookmarking tool for saving and discovering creative ideas.