Our organization enables one of Microsoft's most critical business objectives: earning the trust of our customers. We ensure the authenticity and integrity of Microsoft products and services. We are the product release service provider for all of Microsoft's businesses: Azure, Office 365, Xbox, Windows, and the Windows App Store, to name a few. Whatever business Microsoft is in, trust is a critical part of it.
Our Infrastructure team is responsible for designing and running infrastructure in on-premises and Azure environments, and provides all services necessary to deploy and hand off application-ready physical and virtual servers. Within Infrastructure, the Compute Service Engineering team designs, deploys, monitors, and continuously improves the reliability of our Compute platform. We simplify processes and solutions, create automation, and increase the reliability and availability of the Compute platform.
We are looking for a Senior Service Development and Operations (DevOps) Engineer to join our Compute Service Engineering team. We offer opportunities to design, implement, and continuously improve the service using new software and hardware technologies, and then to automate its operation and reliability. The role involves working with cutting-edge technology in virtualization, server management, security, and storage.
* 3+ years of software development and release experience with PowerShell, C#, and/or other scripting languages
* 1+ years of service design and/or implementation, improvement, and support experience
* Knowledge of and demonstrated experience with Windows internals, including managing and administering Windows services
* Comprehensive knowledge of computing infrastructure technologies such as Hyper-V and Virtual Machine Manager
* Working knowledge of Agile and Scrum methodologies
* Solid experience with change, incident, and problem management, as well as postmortem and root cause analysis (RCA) processes
* Ability to collaborate with people representing diverse points of view and to interact with senior leaders to drive business impact
* Knowledge of networking, Active Directory authentication, name resolution, and DHCP
* Experience in performance tuning using monitoring and troubleshooting tools
* Knowledge of storage architecture and offerings
* Familiarity with architecture and security considerations in deploying highly secure and available private cloud solutions
* Experience with Hardware Security Modules (HSMs)
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
* Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
* Design and implement the service: Design the Compute service using existing, new, and emerging first- and third-party software and hardware technologies. Collaborate with the Network, Storage, Security, and Application Service Engineering teams to define the v-next Compute service offering based on new and emerging software and hardware technologies, service history, input from customers, and industry best practices.
* Reliability-engineer the Compute service platform: Improve the health and welfare of the Compute infrastructure service by defining improvement activities and metrics. Execute service health-and-welfare and improvement projects such as best-practice configuration, OS/firmware/driver upgrades (to stay current), and performance improvement and risk mitigation.
* Monitor and optimize: Create monitoring reports focused on improving service quality and effective resource utilization, and on surfacing events and configuration deviations. Initiate projects to execute improvements and define KPIs for ongoing improvement.
* Automation: Collaborate with others to design and develop automation for self-healing of issues, self-service for change requests, and software lifecycle management. Automate service improvement activities, the deployment of critical security and reliability updates, and configuration changes to our Compute environment.
* Livesite support: Participate in an on-call rotation, engage in and help resolve Livesite incidents that cause critical business service disruption, complete postmortems, and own Livesite reviews. Follow up on and complete repair items to prevent repeat incidents.
Microsoft is a technology company that develops and supports software, services, and devices.