Systems Engineer
Bellevue, WA

Job Description

Agilysys delivers highly available cloud services for the hospitality industry. We practice Agile methodologies, and our cross-functional teams build strong, collaborative relationships as partners in the delivery of quality solutions. As a member of the Agilysys Platform SRE team, you are responsible for operating SaaS environments and pre-production services and for managing hosted services and VMs. You work closely with development teams and provide guidance on integrating service operations workflows with the product development life cycle. You have experience with systems administration at the command line in a SaaS production environment. You will continually improve the way we deliver software as a service by automating infrastructure and operations workflows, assessing and improving service performance, and cultivating collaboration across the development and operations life cycle.

Principal Responsibilities:

* Configuration Management: Maintain a well-defined pre-production systems configuration, and adhere to a disciplined process for introducing change to pre-production and production systems.
* Infrastructure Automation: Write and improve automation scripts, playbooks, and roles, occasionally pushing the boundaries of what automation can do. Develop scripts with logical flows using concepts like conditionals, loops, and arrays.
* Instrumentation, Logging, and Monitoring: Evaluate and implement logging and monitoring solutions, including time-series and trending capabilities. Actively use the resulting information to maintain and continually improve service operations.
* Service Reliability and Availability:
  * Data Protection: Evaluate and implement backup solutions, considering retention needs and the rate of incremental and full backups.
  * Disaster Recovery: Maintain failover environments and processes to meet recovery point and recovery time objectives.
  * Audit & Compliance: Maintain processes to meet industry-standard audit and compliance requirements. Experience working in a PCI-certified data center is a plus.
  * High Availability: Experience with load-balancing and proxying technologies. Experience planning, developing, and building solutions with high-availability needs. Familiar with managing clustered infrastructure services, using a variety of clustering concepts.


* Secure Operations:
  * Perform all work with security best practices and operational concerns in mind.
  * Evaluate security patches and updates and determine implementation priorities.
  * Familiar with operating File Integrity Management solutions, host and network Intrusion Detection Systems, malware scanning and monitoring.
  * Certificate Management: Maintain certificate lifecycles for secure identity management and communications.


* Product Delivery Lifecycle:
  * Understand version control best practices and workflow using Git.
  * Collaborate with multiple Product Engineering teams. Support developer environments. Assist Product Engineering with Linux system troubleshooting.
  * Familiar with release management processes to deliver production software.


* Database Administration:
  * Manage database clusters, data nodes, and configuration nodes.
  * Understand NoSQL databases and how they differ from relational databases.


* Service Operations:
  * Request Management: Monitor the operations ticket queue and handle day-to-day service operations, such as log file analysis, alert response, and user access management. Manage requests promptly to keep teams productive. Manage creation and configuration of new virtual machines.
  * Incident Management: Respond to break/fix issues with high urgency to restore service operation. Disciplined at problem solving in a high-availability production environment.
  * Containers: Work with and troubleshoot containers running in a production environment. Ideally, you have evaluated, configured, or managed a container clustering environment.
  * On-Call: Participate in on-call rotation duties to maintain round-the-clock operational coverage.



Education and Experience:

* Bachelor's degree with a major course of study in Information Technology, or equivalent experience.
* 5+ years overall development/technical operations experience.
* 3+ years of experience in a staged continuous integration system.
* 3+ years of recent experience practicing DevOps in a production cloud/automated environment.

Technical Skills:

* Operating Systems: Red Hat-based distributions, with a preference for CentOS 6/7. Knowledge of disk management and partitioning using LVM. Linux system troubleshooting. Familiar enough with Windows Server to secure the OS, gather logs, and manage it with infrastructure automation.
* Infrastructure Automation: Familiar with Infrastructure as a Service: using and updating templates, automating server spin-up/spin-down, monitoring, and using tools such as Terraform, Ansible (preferred), Chef, Puppet, SaltStack, or other automation tools.
* Scripting knowledge (Bash, Python, PowerShell).
* Systems Configuration Management: familiarity with systemd and Linux system tuning is a plus.
* Container Technologies: Docker, Docker Compose, Dockerfiles, Docker Swarm/Kubernetes preferred.
* High-Availability Configuration: HAProxy/Nginx and general load-balancing. Familiar with different types of clustering solutions, as used by Elasticsearch, RabbitMQ, MongoDB, HAProxy.
* Infrastructure as a Service: Azure preferred, or AWS, Google, Rackspace.
* Virtualization: familiar with at least one of the following: Hyper-V, VirtualBox, VMware ESXi 5.5, vCenter, vCloud. Able to perform basic VM tasks: create, clone, copy, template. Snapshot experience is a plus.
* Application Technology Deployment: Java, Tomcat, Elasticsearch, RabbitMQ.
* Logging: familiar with Linux logging, ELK, and Beats.
* Monitoring: familiar with tools such as Icinga, Nagios, and New Relic.
* Networking: Knowledge of IPv4, iptables, firewalld.
* Database Administration: MongoDB 3.x, Postgres, MySQL.

About You:

* Enjoy working in a fast-paced environment with changing priorities.
* Cultivate collaborative relationships with team members across the DevOps lifecycle.
* Believe in automating wherever and whenever possible.
* Bring a sense of humor and a friendly, collaborative approach to solving problems.
* Seek out opportunities for continual improvement; take ownership and collaborate with your team to implement.
* Communicate openly and effectively with team members in DevOps/Platform SRE and Product Engineering.
* Discuss your work with team members, ask questions, openly give and receive advice.
* Be disciplined and imaginative in your approach to design and engineering.
* Escalate issues as needed to the senior members of the DevOps team.
* Enjoy working closely with Product Engineering and other operations teams.

About Us:

Agilysys (Nasdaq: AGYS) is a leading technology company that provides innovative point-of-sale, property management, inventory and procurement, workforce management, analytics, document management and mobile and wireless solutions and services to the hospitality industry. The company's solutions and services allow property managers to better connect, interact and transact with their customers by streamlining operations, improving efficiency, increasing guest recruitment and wallet share, and enhancing the guest experience. Agilysys serves four major market sectors: Gaming, both corporate and tribal; Hotels, Resorts and Cruise; Foodservice Management; and Restaurants, Universities, Stadia and Healthcare. Agilysys operates throughout North America, Europe and Asia, with corporate services located in Alpharetta, GA. For more information, visit www.agilysys.com.
