As part of Infrastructure, the Cloud Infrastructure team at Airbnb is responsible for providing the core services and systems that allow product engineers to efficiently build and operate scalable and reliable applications. The team builds and maintains a range of services on top of AWS hardware and software for the whole company, such as databases, caches, streaming and queueing systems, and monitoring tools. The team also provides architecture design reviews and in-depth performance profiling, and builds accessible tools that help engineers identify bottlenecks and interactions within various systems. Recently, the team has been experimenting with Kubernetes to achieve better scalability and resilience.
Two perennial challenges for Airbnb infrastructure and the team are:
* Scalability: As a high-growth company, we have seen huge traffic growth in recent years. For example, our Kafka traffic grew 6x in the past year. To support such growth, the Cloud Infrastructure team needs to continuously improve existing systems and explore next-generation infrastructure. Taking databases as an example, we went through several phases to handle data growth: upgrading software and hardware, splitting databases, migrating to AWS Aurora, and developing an Airbnb-internal data access layer between applications and storage engines.
* Stability: System stability and availability are among the most important metrics for the company. To achieve high stability, the team pursues several efforts in parallel: (1) each service or system improves its own stability; (2) we introspect end-to-end workflows to identify and fix fragile components; (3) we build data backup and recovery mechanisms.
What are examples of work that Cloud Infrastructure engineers have done at Airbnb?
Here are some examples of work from the Cloud Infrastructure team. Note that some of this work builds on popular open source projects or AWS services.
* Build and maintain scalable and reliable database and cache services to serve the whole company.
* Build multiple queueing systems (such as Kafka, Kinesis, AWS SQS, and Resque) to serve various business needs. We scaled our Kafka clusters to handle 6x traffic growth in a year.
* Introduce Kubernetes to the company and grow Kubernetes clusters by 10x in the last year. We are preparing the clusters to onboard hundreds of services.
* Develop disaster recovery mechanisms to protect Airbnb data in case of disasters and emergencies.
* Build Java and Ruby profiling tools to allow engineers to efficiently identify system bottlenecks. Also build monitoring and introspection tools to allow engineers to quickly introspect a system.
* Build a service mesh architecture to allow traffic routing and load balancing among hundreds of services.
* Define the SOA architecture for all Airbnb products and help them move toward that goal.
Cloud Infrastructure at Airbnb online:
* Airbnb Engineering Blog
  * Building Services at Airbnb: Part 1 and Part 2
  * Capturing Data Evolution in a Service-Oriented Architecture
  * Nebula as a Storage Platform to Build Airbnb's Search Backends
* Tech talk on migrating to Aurora at AWS re:Invent 2017
* GitHub
  * SpinalTap: a general-purpose reliable Change Data Capture (CDC) service
  * Sparsam: fast Thrift bindings for Ruby
  * Synapse and Nerve: a service discovery framework
What experience is relevant to us?
* BS/MS/PhD in Computer Science or a related field
* Exceptional proficiency in Java. Experience with Ruby/Ruby on Rails is a plus.
* Proactive, with good communication skills and the ability to learn fast; highly independent and autonomous.
* Ability to work in areas outside of your usual comfort zone and motivation for personal growth.
* Experience with Kubernetes and Docker
* Design or operation of robust distributed systems
* Experience with AWS cloud infrastructure
What benefits do we have?
* Stock
* Competitive salaries
* Quarterly employee travel coupon
* Paid time off
* Medical, dental, & vision insurance
* Life insurance and disability benefits
* Fitness discounts
* 401K
* Flexible Spending Accounts
* Apple equipment
* Commuter subsidies
* Community involvement (4 paid hours per month for community service)
* Company-sponsored tech talks and happy hours
* Much more...
Airbnb is a company that provides an online marketplace and hospitality services.