Principal Site Reliability Engineer Manager Microsoft
Redmond, WA

Microsoft is a technology company that develops and supports software, services, and devices.

Job Description

Microsoft has been a leading company in computing for decades. We are a global company, relied on by companies, governments, utilities, stores, schools, universities and co-operatives to deliver the things they need to work, every day.

In order to make this work, we need to make it reliable. In order to make it reliable, we need you -- someone who already is, or is interested in becoming, a Site Reliability Engineering Manager (also known as SRM).

Site Reliability Engineering is a hybrid role, comparatively rare in industry but crucially important to how things work behind the scenes today.

SREs are people who take engineering-based approaches to solving operations problems; we like infrastructure, we like seeing how the big complicated thing works, and most importantly, we gain great satisfaction from making it better. We have backgrounds in lots of things -- yes of course, Computer Science, System Administration, Networking, Mathematics, and Engineering generally, but you can also find folks who've worked in Physics, Chemistry, Computational Biology, Statistics, and even English.

Site Reliability Engineers build, monitor, and maintain the systems and infrastructure that ensure our customers can quickly access their data and run workloads whenever they need to. We identify service problems and areas for improvement, and we help implement solutions. Our work is key to the success of many of the Microsoft services you'll have heard of, and a number you haven't. There are very few bits of Microsoft which aren't touched by SREs in some way or other.

Site Reliability Engineering Managers (often called Site Reliability Managers, or SRMs) manage teams of people to achieve the goals above. Unlike many similar roles in industry, this is a role where being technical is expected of you. Of course, you're not deep in the systems or the code every moment of the day, but in a real sense, you are both the backstop and heart of the team -- capable of doing whatever is required to move things forward at any moment, whether that is reassuring a team member of their potential, bringing a crucial piece of distributed systems experience to a design decision, or writing the roadmap for the team's next year using only the finest electronic crystal ball. It is a demanding role, but in partial compensation, the satisfaction of working with great people at the cutting edge of computer science on problems no-one else has seen before is tremendous.

If you love helping your team be successful, if you enjoy communicating well, and if engineering problems that few other organizations in the world have seen would appeal to you, we'd like to talk.


* 7+ years of software development: automation-related experience valued in particular.
* 7+ years of experience using scripting languages such as Bash, Python, and PowerShell, or compiled languages such as C, C#, and Go. These are most relevant, but others are acceptable.
* 10+ years of experience leading people in a manager or lead capacity.

We would like to talk to you if you:

* Are interested in distributed systems and working with high scale services.
* Like to work in a fast-moving environment that isn't afraid to change things to make them better.
* Enjoy new technological challenges and enjoy fixing the problems that result.
* Believe that a team working well together is truly smarter than the single smartest person on that team.
* Aspire to grow as a person, as a teammate, and as an engineer.

The PIE organization was formed in late summer of 2017 and is a swiftly growing organization. PIE's Vision is to make it easy for everyone to create, consume, and manage planetary-scale, reliable cloud production services and infrastructure to achieve more. As a team, we bring together significant and complementary capabilities with tooling, infrastructure, monitoring and insights in new ways to increase our perspective. Our diversity of knowledge and experience comes together for the benefit of our users, our colleagues, our business, and ourselves.


The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.

Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.

* Lead a team of engineers, generally by example; look after them and deliver credibility and an identity for your team.
* Take part in the design, implementation, commissioning, and tracking of projects that deliver incremental improvement or, where appropriate, radical change. Maintain technical state on associated products and services.
* Own the product in production; care about the performance, availability, and efficiency of products & services within the team's purview. Track those and be accountable for them.
* Focus on automation (and autonomous action) as a first-order response to system failures and misfeatures. Regard software and services engineering as being powerful tools to bring to bear on larger problems.
* Manage incident response & on-call responsibilities with partner teams and across a variety of services.

About Microsoft

10001 employees

1 Microsoft Way
