Site Reliability Engineer
San Jose, Santa Clara County, California, 95199, USA
Listed on 2026-02-24
IT/Tech
Systems Engineer, Cloud Computing
What you can expect
As a Site Reliability Engineer, you will work on our hybrid systems across the globe. You will install, configure, and monitor new systems within a network of global data centers, and patch and maintain thousands of physical and cloud systems worldwide. To streamline operations, you will develop automation that reduces repetitive tasks, and you will analyze and address performance bottlenecks.
Furthermore, you will update and troubleshoot user access permissions, resolve network connectivity issues, and maintain system firewalls.
Zoom's SRE team is committed to delivering customer happiness, improving business efficiency, and promoting agility through innovation, data-driven insights, and automation. Our impact is reflected in smooth user experiences, optimized processes, and support for Zoom's expansion in the realm of communication and collaboration.
Responsibilities
Install, configure, monitor, and maintain systems within a network of global data centers. Develop automation scripts and tools using Python and Shell to streamline operations and reduce manual intervention. Monitor and analyze system performance metrics to identify and address potential issues proactively. Design, implement, and maintain CI/CD pipelines to enable rapid and reliable software deployments across multiple environments. Monitor, troubleshoot, and optimize production systems to ensure uptime for critical Zoom infrastructure.
Collaborate with other teams to troubleshoot system performance issues and promote SRE best practices. Participate in the on-call rotation to provide around-the-clock support for production incidents and system emergencies.
- Have a Bachelor’s or Master’s degree in Computer Science or a related major
- Demonstrate 2-5 years of hands-on experience in Site Reliability Engineering, DevOps, or Production Operations roles
- Demonstrate proficiency in scripting languages including Python and Shell
- Have experience in Linux systems administration with a focus on Ubuntu
- Be able to participate in on-call shifts and incident management, and to work after hours and on weekends for infrastructure changes and deployments
- Apply analytical and troubleshooting skills, with the ability to diagnose complex system issues
- Have experience with CI/CD pipelines (e.g. Jenkins, GitLab CI) and version control systems (e.g. Git)
- Have experience with build automation, configuration management tools (e.g. Ansible), and IaC provisioning tools (e.g. Packer/Terraform)
- Have experience with bare metal infrastructure and datacenter operations, including proficiency in operating system deployment tools (e.g. Foreman, Cobbler, MAAS)
- Have experience using Kubernetes or hold a Linux certification
Salary Range or On Target Earnings:
Minimum: $87,600.00
Maximum: $186,000.00
In addition to the base salary and/or OTE listed, Zoom has a Total Direct Compensation philosophy that takes into consideration base salary, bonus, and equity value.
Note:
Starting pay will be based on a number of factors and commensurate with qualifications & experience.
We also have a location-based compensation structure; there may be a different range for candidates in this and other locations.
Closing and Working Information
Anticipated Position Close Date: 03/27/26
Ways of Working - Our structured hybrid approach is centered around our offices and remote work environments. The work style of each role (Hybrid, Remote, or In-Person) is indicated in the job description/posting.
Benefits - As part of our award-winning workplace culture and commitment to delivering happiness, our benefits program offers a variety of perks, benefits, and options to help employees maintain their physical, mental, emotional, and financial health; support work-life balance; and contribute to their community in meaningful ways.
Note:
More information is available after joining Zoom.
Zoomies help people stay connected so they can get more done together. We set out to build the best collaboration platform for the enterprise, and today help people communicate…