Ultimate Software is seeking a Senior Site Reliability Engineer (SRE) with a robust and diverse background in software engineering, software design, and systems architecture, with a focus on automation, reliability, and system integration. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. An SRE ensures that Ultimate Software's services, both our internal and external systems, are reliable, with uptime appropriate to users' needs, while keeping an ever-watchful eye on capacity and performance.
At Ultimate Software, our Site Reliability Engineers (SREs) come from both development and operations backgrounds, with a common passion for running products at scale in production. Our SREs are always seeking to understand how our systems work end to end, without boundaries.
Our team is responsible for:
* Performance, stability, and reliability considerations
* Capacity planning
* Working closely with the product development teams to build and design features
* Debugging issues in production
* Building out CI / CD pipelines
* Building out logging, monitoring and alerting infrastructure
Here at Ultimate Software, we truly put our people first. We strongly believe in teamwork, and we encourage and trust our people to reach higher, learn more, and live up to their potential. Ultimate is ranked #1 on Fortune's Best Places to Work in Technology for 2019 and #8 on the 100 Best Companies to Work For list in 2019. Ultimate is also ranked #1 on Fortune's 75 Best Workplaces for Women and #5 on its Best Workplaces for Diversity list.
Primary/Essential Duties and Key Responsibilities:
* Engage in and improve the whole lifecycle of services from conception to production, including system design, build, and deployment
* Define and implement standards and best practices related to system architecture, deployment, metrics, and operational tasks
* Support services through activities such as monitoring availability, system health, and incident response
* Improve system performance, application delivery, and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis
* Communicate across all areas of the organization
* Experience with algorithms, data structures, complexity analysis, and software design
* Experience with highly resilient systems as well as anti-fragility design patterns
* Experience with distributed systems
* Experience with service-oriented architectures
* Experience in one or more of the following: Python, Go, Perl, C, C++, Java or Ruby
* Experience with Unix/Linux operating system internals and administration (e.g., filesystems, inodes, system calls) and networking (e.g., TCP/IP, routing, network topologies)
* Experience with Amazon Web Services and Google Cloud Platform products
* Ability to multitask and adapt quickly to changing priorities
* Ability and willingness to work evenings and nights on occasion (participation in an on-call rotation)
* Experience with Configuration Management (Puppet/Chef/Ansible)
* Experience with Linux command-line shell and shell scripting
* Ability to lead and work on projects
* Ability to communicate effectively (listening, presenting and questioning)
* Positive team participation skills
* Strong organizational, writing, and communication skills
* BS degree in Computer Science or a related technical field involving coding (e.g., physics or mathematics), or equivalent practical experience, preferred
* Experience with OpenStack
* Experience administering Elasticsearch, MySQL, MongoDB, RabbitMQ, and Redis in a production environment a plus
* Exposure to writing SQL scripts preferred
* Experience with Kubernetes, BOSH, Docker, and Mesosphere a plus
* Technical writing
* Development background
* Limited travel upon request (less than 5%)
This job description has been written to provide an accurate reflection of the current job and to include the general nature of work performed. It is not designed to contain a comprehensive detailed inventory of all duties, responsibilities, and qualifications required of the employees assigned to the job. Management reserves the right to revise the job or require that other or different tasks be performed when circumstances change.
Ultimate Software will reasonably accommodate employees with disabilities as defined by the Rehabilitation Act of 1973, the Americans with Disabilities Act (ADA) and other appropriate statutes.