Job Description
What is the opportunity?
This is an exciting opportunity to join a high-performing team that plays a critical role in ensuring the reliability, scalability, and performance of pre‑production environments for ATM systems. As a Senior Service Reliability Developer, you will be at the forefront of driving innovation and operational excellence in a mission‑critical domain. The role is ideal for someone who is passionate about building and maintaining highly reliable, scalable, and secure systems and about automating their operation.
What will you do?
As a Senior Service Reliability Developer in the ATM team, you will be responsible for ensuring the reliability, performance, and scalability of our production environment. Your day‑to‑day responsibilities will include:
- Architect and develop solutions to enhance reliability and performance across pre‑production and production environments, ensuring seamless ATM operations.
- Design and implement advanced monitoring tools to proactively detect and resolve system issues, maintaining high availability and performance.
- Develop and deploy automation frameworks for deployment, monitoring, and incident response to minimize manual intervention and improve efficiency.
- Build and optimise CI/CD pipelines to enable faster, more reliable software deliveries.
- Lead incident response efforts, conduct root cause analysis (RCA), and implement long‑term fixes to prevent recurrence.
- Participate in on‑call rotations to address high‑severity production incidents, including off‑hours troubleshooting, cross‑functional coordination, and urgent remediation to ensure 24/7 system reliability.
- Partner with operations, QE, and engineering teams to align on best practices, share knowledge, and ensure smooth system integration.
- Champion DevOps and SRE principles by fostering collaboration, continuous improvement, and Infrastructure as Code (IaC).
- Mentor junior team members, share expertise, and contribute to team growth and technical excellence.
To excel as a Senior Service Reliability Developer in the ATM team, you must have the following skills and expertise:
- Proficiency in building and managing CI/CD pipelines, infrastructure as code (IaC) tools (e.g., Terraform, Ansible), and automation frameworks to streamline deployments and operations.
- Experience with monitoring tools (e.g., Prometheus, Grafana, Splunk) and the ability to diagnose and resolve complex system issues in real time.
- Strong coding skills in languages such as Python, Go, or Java, along with scripting expertise in Bash, PowerShell, or similar.
- Hands‑on experience with cloud platforms (e.g., AWS, Azure, GCP), containerization tools like Docker and Kubernetes, and a solid understanding of networking concepts (e.g., DNS, load balancing, firewalls, and VPNs) to ensure secure and efficient system communication.
- A proactive approach to identifying and solving technical challenges, coupled with the ability to work effectively in cross‑functional teams.
- Core Technologies:
- Advanced MDT task sequence creation and troubleshooting
- Site administration, software deployment, operating system deployment
- DISM, WIM file management, sysprep automation
- Advanced scripting for automation and customization
- Patch management and driver integration
- MSI creation, App‑V, application compatibility
- Containerization basics
- Monitoring solutions (Prometheus, Grafana, or similar)
- Familiarity with advanced networking protocols, SDN (Software‑Defined Networking), and network performance optimisation techniques.
- Understanding of security principles, including vulnerability management, secure coding practices, and compliance standards (e.g., PCI DSS).
- Hands‑on experience with observability platforms (e.g., Datadog, New Relic, or Elastic Stack) to gain deeper insights into system performance and reliability.
- Advanced scripting skills (e.g., Python, Bash, PowerShell) and experience in creating automation scripts for tasks such as system monitoring, deployment, and incident response.
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to…