AppDynamics is an application performance monitoring solution that uses machine learning and artificial intelligence (AI) to provide real-time visibility and insight into IT environments. With our unique AIOps solution, you can take the right action at exactly the right time with automated anomaly detection, rapid root-cause analysis, and a unified view of your entire application ecosystem, including private and public clouds. Using AppDynamics, you'll finally align IT, DevOps, and the business around the information that helps you protect your bottom line and deliver flawless customer experiences at scale.
First and foremost, you have strong troubleshooting and problem resolution skills. You work well under pressure and have strong written and verbal communication skills. You pride yourself on being a self-starter who leads by example and has experience working in a rapidly changing environment. You also have:
* Minimum of a Bachelor's degree in CSE, EE, CSM, or related technical discipline; MS degree desired
* A combined 8+ years of Site Reliability, DevOps, and/or Software Development experience, ideally in a growth-stage environment
* Experience operating within, and supporting, complex SaaS production or revenue-critical 24/7 web services environments
* Experience developing and operationalizing system installations and upgrades (required)
* Strong experience with Unix/Linux system administration, especially Red Hat Linux (CentOS)
* Experience in developing and administering Web Servers, App Servers, and DBs running J2EE applications
* Experience running and administering services in AWS or other cloud platforms (Azure, GCP)
* Significant experience with one or more scripting/coding languages or infrastructure automation tools, ideally Ansible, Chef, Puppet, Terraform, or Python
* Experience with big data platform engineering; NoSQL or HBase strongly preferred
* Experience with scaling and operationalizing distributed data stores, file systems, and services (Kafka, Elasticsearch, etc.); familiarity with Lambda architecture a big plus
* Experience with virtualization and containerization platforms (Docker, OpenStack) and container orchestration tools (Kubernetes)
* Availability for occasional on-call after-hours support
About the Role
We are looking for a Senior DevOps Engineer to join our SaaS Operations Engineering team. This is a unique opportunity to work in a rapidly scaling environment, learn new technologies, and tackle mission-critical technical challenges at massive scale (our platform is processing roughly 200M metrics per minute and growing). As a Senior Engineer, you will be responsible for helping lead several new initiatives to enhance and scale our data ingestion pipeline and SaaS / on-premise applications. You will collaborate with software development on new applications that expand our core offering, providing deep expertise to help steer scalability and stability improvements early in the lifecycle of development.
You will be responsible for developing and supporting full-stack platform infrastructure initiatives in a complex distributed environment. The result will be recommendations and hands-on development of enhancements that will produce greater stability, scalability, and throughput for our platform. You will provide input to Development and Operations team members when new architectures, designs, and/or operational models are being formulated. You will also be responsible for providing technical expertise to other teams as you build expertise in the deployment and operational best practices specific to our data platform.
Day-to-day responsibilities include:
* Iterative analysis, modeling, testing, and profiling of various components of the SaaS application
* Documenting findings and recommendations for improvement
* Helping lead full-stack platform infrastructure projects
* Maintaining and enhancing deployment scripts, tools, and methodologies; playing a lead role in advancing our 'Infrastructure as Code' architecture
* Leading the evaluation and development of our data ingestion pipeline to be deployed 'as a service' or on-premise on commodity hardware
* Ensuring efficient and scalable artifact deployments to production servers using automation scripts and other deployment tools
* Making recommendations to, and interfacing with, engineering to ensure 100% application uptime
* Monitoring the SaaS environment and working with QA, Developers, and Ops to identify and solve problems
* Ensuring that failover mechanisms are in place and working correctly
* Responding to and resolving technical emergencies
We know that the award-winning culture at AppDynamics is something to brag about, but here are more reasons to be excited to get out of bed and come in each morning:
* Medical, dental, vision coverage
* 401k match (4.5%)
* Wellness perks (hobbies, education, store discounts, personal finance)
* 4 weeks PTO, 5 days VTO, 14 holidays (including 1 birthday PTO and 1 floating holiday)
* Mandatory company shutdown between Christmas and New Year's
* Weekly catered breakfast and lunch, plus snacks, fruit, and drinks
* Brand-new, state-of-the-art office in downtown San Francisco, centrally located near BART, Caltrain, Muni, the ferry, and a bike share station
* Free shuttle service and pre-tax commuter benefits
Just a note
Note to Recruiters and Placement Agencies: AppDynamics does not accept unsolicited agency resumes. Please do not forward unsolicited agency resumes to our website or to any AppDynamics employee. AppDynamics will not pay fees to any third-party agency or firm and will not be responsible for any agency fees associated with unsolicited resumes. Unsolicited resumes received will be considered property of AppDynamics.
AppDynamics is an equal opportunity employer and considers all qualified applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, age, protected veteran status, or any other unlawful factor. AppDynamics complies with all applicable laws, including those regarding consideration of qualified applicants with criminal histories (such as the San Francisco Fair Chance Ordinance). If your disability makes it difficult for you to use this site, please contact firstname.lastname@example.org. AppDynamics participates in E-Verify.
AppDynamics develops application performance management solutions that deliver problem resolution for highly distributed applications.