Software Engineer - Hadoop
San Francisco, CA
Who we are:
Twitter is seeking engineers to help build and grow our data storage and processing systems. Our Hadoop infrastructure is one of the industry's largest-scale deployments, with over 50K machines that store and process hundreds of petabytes of data. Our team has a track record of active contributions to the Apache Hadoop code base. All Twitter features and products use our Hadoop infrastructure in one way or another.
What you'll do:
As a Hadoop Engineer, you will build systems that serve tens of thousands of jobs and queries per day. The services you build will integrate directly with Twitter's products, opening the door to new and cutting-edge features. You will empower dozens of engineering teams, hundreds of co-workers, and millions of users to gain new insights and dream of new possibilities.
The Hadoop team is hiring in the following areas:
* Distributed storage and compute infrastructure
* Event log ingestion, consolidation, and replication
* Automated, self-service cross-cluster and cloud data replication
* Hybrid on-premises and in-cloud Hadoop infrastructure
* Near real-time Hadoop usage analysis and chargeback tooling
Who you are:
You are passionate about cutting-edge open-source technologies. You enjoy working with top-notch engineers to build new features and take on complex problems. You have experience in distributed systems, database internals, networking fundamentals, or performance analysis. You like to present our technical challenges and your accomplishments at conferences. You want to work on a team with an enormous impact on Twitter's business and the world at large.
* Work with and contribute to the Apache Hadoop and related open source communities to build new features and fix issues in support of Twitter's usage and growth.
* Design and build a hybrid cloud solution combining on-premises and in-cloud clusters.
* Diagnose and troubleshoot complex distributed systems problems and develop solutions with a significant impact at our massive scale.
* Build tools to ingest and process more than 1 trillion messages per day.
* Design and develop next-gen storage and compute platforms used by dozens of engineering teams in our product, revenue and data science organizations.
* Communicate with a wide set of teams, including the Hardware, Network, Linux kernel, JVM, and SiteOps teams, as well as cloud vendors.
* Build advanced tooling for testing, monitoring, administration, and operations of multiple clusters across data centers, both on-premises and in the cloud.
Qualifications:
* Hands-on experience in distributed systems.
* Strong software development skills in at least one of: Java, C/C++, or Scala.
* BS, MS or PhD degree in Computer Science or Engineering, or equivalent experience.
* Experience building and supporting large-scale systems in a production environment.
* Experience running large infrastructure services in the cloud.
* Experience with OSS technologies such as HDFS, YARN, Tez, Flume, HBase, Spark, and ZooKeeper.
* Solid knowledge of networking and Linux systems management.
* Experience with operating system internals, file systems, disk/storage technologies and storage protocols.
We are committed to an inclusive and diverse Twitter. Twitter is an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status or any other legally protected status.
San Francisco applicants: Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.
Twitter is a social networking platform.