Airtable is a mission-critical system for a diverse set of teams and industries. We build our platform to scale, stay resilient, and deliver a delightful user experience throughout. Our infrastructure requires thorough thinking, deep research into how things work, and rigorous coding. Our mission is ambitious, but we like to keep our infrastructure simple.
As one of the first dedicated site reliability engineers at Airtable, you will play a critical role in scaling and refining our operational practices. Site reliability engineering begins with building solid automation across the software delivery process, including configuration, provisioning, testing, deployment, and beyond. SREs will also work with software engineers to help them understand how their code behaves in production, and build nontrivial internal tooling to enable this. We also strive for a strong security posture, and SREs will help define and implement operational practices that protect our users. Lastly, of course, our operations team is the last line of defense when incidents happen, and SREs will be part of the team that responds to them.
What you'll do
* Automate everything: deploys, rollbacks, database provisioning, failovers, and everything in between.
* Design and implement monitoring tooling across the stack, and optimize systems for uptime, performance, and reliability based on the data gathered by this tooling.
* Design and write tests that investigate how our infrastructure handles failure and scaling.
* Research hot-off-the-press CVEs and implement best practices.
* Build occasional product features as appropriate.
* Manage our Elasticsearch cluster.
Who you are
* You're painfully thorough, whether it's scripting bulletproof deployment automation, writing a recovery playbook that an engineer can follow without fail at 3 a.m., or digging into logs and monitoring data to find the root of a problem.
* You're OK carrying a pager and take it seriously, but you take pride when the pager hasn't rung in the past week.
* You've worked with Linux, containers/namespaces, and system automation tools for Unix and cloud platforms.
* You have 5+ years of relevant technical experience, including significant experience with site reliability/DevOps or server infrastructure engineering.
What we offer
* Health care: we have you 100% covered (and your dependents 50% covered) with competitive medical, dental, and vision insurance. You'll also be eligible for a complimentary membership to One Medical Group.
* Learning & Development: we offer a $2,000 per year stipend for your personal career development.
* Gym Membership: we're proud to provide employees in our San Francisco and New York offices with complimentary gym memberships to Equinox, or up to $100/month reimbursement towards any other gym.
* Catered lunches: we have high-quality catered lunches every day and well-stocked kitchens. We'll also reimburse you for any reasonable food expenses incurred while working.
* Generous PTO, sick leave, and parental leave.
Airtable's mission is to democratize software creation, similar to the way the Macintosh democratized personal computing. Software is arguably the most important creative medium of the last century, yet most people cannot build their own software. Airtable gives people and companies a "lego kit" they can use to create custom applications on their own, regardless of technical experience.
We've raised $170M in venture funding, including most recently a $100M Series C from Benchmark, Thrive, and Coatue.
Airtable provides a collaboration platform for organizing and managing data.