Senior Manager, SRE & Networking
Listed on 2026-02-28
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability, IT Project Manager
At F5, we strive to bring a better digital world to life. Our teams empower organizations across the globe to create, secure, and run applications that enhance how we experience our evolving digital world. We are passionate about cybersecurity, from protecting consumers from fraud to enabling companies to focus on innovation.
Everything we do centers around people. That means we obsess over how to make the lives of our customers, and their customers, better. And it means we prioritize a diverse F5 community where each individual can thrive.
About the Role
We are seeking a highly experienced Senior Manager to lead our Platform SRE, Virtualization, Networking, and AI Infrastructure organizations. This leader will oversee teams operating mission‑critical infrastructure across:
- Kubernetes platforms: OpenShift, Titan‑k8s, Robin, Vanilla Kubernetes
- Virtualization & hypervisors: Proxmox, VMware, XCP‑ng, KVM
- Private cloud platforms: OpenStack
- Networking: Data center & cloud networking, L4/L7 services, Kubernetes CNI/service mesh
- AI/GPU compute: BNK‑AI‑LAB & TMOS AI Lab clusters
This role is responsible for multi‑team leadership, strategic platform roadmap, operational excellence, and end‑to‑end reliability across hybrid compute environments (VMs, containers, and AI workloads).
You will partner closely with Engineering, Cloud, Security, and Architecture leaders to deliver reliable, scalable, and developer‑friendly platforms.
What You’ll Lead
- Multi‑team ownership: SRE, Networking, Virtualization, AI/GPU Infrastructure
- Lead hybrid data centers — spanning routing, switching, firewalls, SDN/overlay, Kubernetes CNI, and service‑mesh/L4‑L7 traffic — to drive network reliability, performance, security, and automation.
- Reliability strategy: SLO/SLI programs, incident management, automation, scaling
- Kubernetes platform operations across multiple distros
- Virtualization & private cloud: Proxmox, VMware, XCP‑ng, KVM, OpenStack
- Provide executive oversight for OpenStack compute, storage, and networking services.
- Ensure scalable VM lifecycle management, resource optimization, and operational maturity.
- Networking: datacenter/cloud networking, CNI, service mesh, L4/L7 traffic
- Own end‑to‑end reliability and performance of AI compute platforms, including model training/inference pipelines, GPU scheduling and autoscaling, and high‑performance compute environments
- Partner with ML, Data, and Product to build next‑gen AI compute platforms.
- Drive adoption of automation‑first operations, GitOps, and infrastructure‑as‑code.
- Own the multi‑year platform roadmap across hybrid compute, Kubernetes, virtualization, AI, and networking while driving cross‑org alignment and leading large‑scale modernization across CI/CD, observability, and infrastructure.
- Drive organizational strategy, prioritization, staffing plans, hiring, and budgeting.
- Build a high‑performance, inclusive culture focused on ownership, excellence, and continuous improvement.
- 10+ years infrastructure/SRE/platform engineering experience
- 5+ years managing engineering teams (including managers or tech leads)
- Deep experience with Kubernetes, virtualization, and cloud/networking
- Strong leadership, communication, and cross‑functional alignment
- Proven track record of improving platform uptime, performance, and reliability
- 10+ years of experience in SRE, Platform Engineering, Virtualization, Networking, or Infrastructure.
- 5+ years managing engineering teams.
- Proven leadership in:
  - Kubernetes platforms (OpenShift, Titan‑k8s, Robin, Vanilla K8s)
  - Virtualization (Proxmox, VMware, XCP‑ng, KVM)
  - OpenStack (Nova, Neutron, Cinder, Keystone)
  - Data center/cloud networking and distributed systems
- Strong executive communication skills and cross‑org influencing ability.
- Demonstrated experience improving operational maturity and reliability for large‑scale systems.
- Strong background in automation, CI/CD, observability, and infrastructure architecture.
- Experience running large‑scale multi‑cluster Kubernetes environments.
- Experience with service mesh, ingress controllers, and network policy frameworks.
- Familiarity with GPU scheduling, Ray, Kubeflow,…