Site Reliability Engineer
Job in New York, New York County, New York, 10261, USA
Listed on 2026-03-01
Listing for: Mphasis
Full Time position
Job specializations:
- IT/Tech
- Cloud Computing, SRE/Site Reliability, Systems Engineer, Network Engineer
Job Description & How to Apply Below
We are seeking a highly skilled Site Reliability Engineer (SRE) to join our Infrastructure Management team. The ideal candidate will be responsible for automating processes, enhancing system reliability, and reducing operational toil through innovative solutions. This role requires a strong foundation in scripting and automation tools, with a focus on creating self‑healing systems that ensure optimal performance and availability.
Responsibilities
- Design, implement, and maintain automation frameworks to improve system reliability and performance.
- Develop and manage scripts using Bash, Shell, and Python to automate routine tasks and processes.
- Utilize Ansible for configuration management and deployment automation.
- Implement auto‑healing mechanisms to proactively address system failures and reduce downtime.
- Collaborate with development and operations teams to identify and eliminate toil in existing processes.
- Monitor system performance and reliability metrics, providing insights and recommendations for improvements.
- Participate in on‑call rotations to support production systems and respond to incidents as needed.
- Document processes, procedures, and best practices to ensure knowledge sharing within the team.
- Stay current with industry trends and emerging technologies to continuously enhance our infrastructure capabilities.
Qualifications
- Proven expertise in Site Reliability Engineering (SRE) principles and practices.
- Strong scripting skills in Bash, Shell, and Python.
- Experience with automation tools, particularly Ansible.
- Solid understanding of system architecture, networking, and cloud technologies.
- Ability to troubleshoot complex systems and provide effective solutions under pressure.
- Excellent communication and collaboration skills, with a focus on teamwork.
- Familiarity with containerization technologies such as Docker and orchestration tools like Kubernetes.
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Knowledge of CI/CD pipelines and DevOps practices.
- Understanding of security best practices in infrastructure management.
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Relevant certifications in cloud technologies, automation, or SRE are a plus.
- Demonstrated ability to work in a fast‑paced, dynamic environment.