Amazon is building some of the largest distributed systems in the world, and we need smart people to support and engineer the next generation of compute and storage platforms. Amazon's Data Center Operations Strategic Engineering (DCOSE) group provides data center support worldwide with a focus on continuous improvement. We have high standards for our infrastructure as well as our employees. Our systems are highly reliable and highly available, and they turn scale into an advantage for our business and an asset for our customers. Our employees are super smart, driven to serve customers, and fun to work with.
Data Center Operations DevOps Engineering (DCODE) is a team of crafty engineers within DCOSE that focuses on continuous improvement through infrastructure automation and tool development. You should be an engineer who is focused on developing internal systems written in Python, Java, C, or a similar language. You also have a comfortable understanding of Linux, networking, and cloud services. As always at Amazon, you should have a drive to create the best customer experience possible, with the customers being both AWS customers and the other engineers at AWS who will benefit from the automation you create. You are data driven and dive deep, analyzing data for trends and systemic issues, then follow our Software Development Life Cycle (SDLC) to develop solutions or effect changes that eliminate problems from our environment. You enjoy developing front-end and back-end applications and dashboards that enable the business to make informed decisions. Your creativity and understanding of the business needs will drive agile development to keep up with ever-changing customer demand.
You will also support the underlying infrastructure that hosts our applications through Availability, Performance, and Capacity Management. You will work directly with the various service owners and hardware design teams to collaborate on hardware issues within the fleet. You think proactively and work to prevent support issues before they occur. You work with other Amazon leaders to share ideas and improve support within the company. You take a role in the strategic direction of the team. You play a significant role in hiring, mentoring, and training employees. You demonstrate excellent judgment when making decisions.
* Develop new or enhance existing applications, system management tools, and processes that reduce manual effort and increase overall efficiency.
* Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic.
* Monitor the health of the fleet, automating system health, maintenance tasks, and reporting systems as needed.
* Perform various system maintenance tasks, including deploying tools and improving their availability and performance.
* Assist in developing methods for incident reduction.
* Monitor various data sources for unidentified fleet issues.
* Manage directly assigned tasks and on-call duties gracefully.
* Collaborate with outside teams to resolve customer issues.
Amazon is a company operating a marketplace for consumers, sellers, and content creators.