System Reliability Engineer, Infrastructure R&D
Listed on 2026-03-01
IT/Tech
Systems Engineer, Cloud Computing, Data Engineer
Veeam, the #1 global market leader in data resilience, believes businesses should control all their data whenever and wherever they need it. Veeam provides data resilience through data backup, data recovery, data portability, data security, and data intelligence. Based in Seattle, Veeam protects over 550,000 customers worldwide who trust Veeam to keep their businesses running. Join us as we move forward together, growing, learning, and making a real impact for some of the world’s biggest brands.
The future of data resilience is here - go fearlessly forward with us.
R&D Infrastructure is a dedicated, highly isolated environment designed to support the R&D department’s unique needs. Our team manages a wide range of production and lab equipment, ensuring reliable operation, scalability, and fault tolerance. We provide full-cycle support for Azure DevOps Server, including project creation, workflow tuning, custom permissions, backup, and migration.
With a large and diverse fleet of build servers—including rare hardware—we select, deploy, and balance equipment for specialized tasks, maintaining close collaboration with R&D teams. We also manage shared storage solutions for build artifacts, enabling efficient replication and seamless load balancing across locations.
What You’ll Do
- Deploy and manage physical and virtual infrastructure for R&D teams, from bare-metal server setup to high-density, heterogeneous virtualized clusters
- Be available for periodic on-site visits to data centers to support physical hardware deployment, maintenance, and issue resolution
- Administer and support Azure DevOps Server (On-Premises and Cloud) for source code version control
- Assist R&D teams with troubleshooting and optimizing build processes
- Diagnose and resolve performance issues in high-utilization virtualization clusters and storage systems
- Design optimized, purpose-specific server and storage hardware configurations in collaboration with procurement teams
- Investigate and resolve issues reported by R&D teams and automated monitoring tools through thorough root cause analysis
- Contribute to the design and implementation of disaster recovery strategies
- Maintain and enhance internal documentation
- Identify and implement opportunities for process automation and efficiency improvements
- Self-sufficient, proactive, and results oriented
- Strong verbal and written communication skills, with the ability to explain complex topics to audiences with varying levels of technical expertise
- 5+ years of experience administering and troubleshooting Active Directory, Hyper-V, SQL Server, and VMware vSphere products
- 3+ years of experience designing, implementing, and troubleshooting sophisticated, highly utilized virtualization clusters built on shared storage and complex network topology
- 3+ years of experience administering Azure DevOps Server (Microsoft Team Foundation Server), including data migration between different platform versions
- Experience administering Microsoft Azure
- Experience writing advanced PowerShell scripts, including those that use third-party modules
- Experience configuring monitoring systems from scratch, with a focus on optimizing triggers and alerts
- Deep knowledge of the OSI model and network traffic virtualization
- Familiarity with *nix systems such as Linux, macOS, and AIX
- Familiarity with Git and TeamCity
- Experience designing and implementing Disaster Recovery Plans
- Familiarity with off-site and GFS backup strategies using Veeam products such as Backup & Replication and Veeam Agents
- Familiarity with the technical nuances of software development (from source code to RTM product)
- Familiarity with hardware capacity planning and procurement processes in large organizations
- Unlimited paid time off, plus 3 global VeeaMe Days for self-care
- Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
- Medical, dental, and vision coverage from day one
- Mental health support, therapy sessions, and digital wellness tools via SupportLinc EAP
- 401(k) retirement plan with matching contributions up to annual limits
- Fertility, adoption, and surrogacy support through Maven, plus…