SRE Manager, Azure Cloud Job, Chicago area, Illinois, USA, IT/Tech

## SRE Manager, Azure Cloud

Locations: Portland, ME; Chicago, IL; Boston, MA; Dallas, TX
Time type: Full time
Posted on: Posted Today
Job requisition: R20925
**About the Team & Role**

We are looking for a highly motivated and high-potential Site Reliability Engineering (SRE) Manager to lead a team of engineers, lead impactful initiatives, and further elevate your career in reliability engineering.

This is a transformative moment to be part of the SRE team. Our products support a wide range of customer businesses and generate complex, high-volume telemetry and operational data across systems and platforms. As WEX scales, reliability, performance, and operational excellence are more essential than ever.

As the **SRE Manager**, you will lead a team of engineers who treat operations as a software problem. You aren't just managing infrastructure; you are the architect of our reliability strategy. Your mission is to balance the velocity of feature delivery with the stability of our Microsoft Azure ecosystem.

You’ll also act as a key partner to engineering and product teams—guiding them on building with reliability in mind, embedding SRE best practices, and influencing platform architecture and operational maturity. We operate with agile methodologies and a product-minded engineering culture, and we leverage modern technologies—including AI—to continuously evolve our reliability capabilities.

You’ll drive solutions to complex challenges with high business impact and collaborate with a team of leaders who will support and challenge you to grow further as a technical and strategic leader.

If you’re passionate about reliability, eager to lead, and ready to make a big impact, this is a great opportunity for you!
**How you’ll make an impact**

**Team Leadership & SRE Advocacy**
* **Mentorship:** Lead weekly 1:1s focused on transitioning engineers from traditional ops mindsets to SRE/DevOps practices.
* **Blameless Culture:** Drive a "blameless post-mortem" culture where incidents are viewed as opportunities to harden the system rather than find fault.
* **Toil Management:** Actively identify and track "toil" (manual, repetitive work), ensuring the team maximizes their time on engineering projects that eliminate it.

**Reliability & Operational Strategy**
* **SLOs & SLIs:** Define and monitor Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure the true health of Azure services from the user's perspective.
* **Error Budget Oversight:** Manage error budgets in collaboration with Product Engineering to balance the risk of new deployments against system stability.
* **Incident Response & Resilience:** Oversee the 24/7 on-call rotation with a focus on **observability** (Log Analytics, Azure Monitor, KQL) to reduce Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR).

**Engineering & Automation**
* **IaC & GitOps:** Lead the standardization of Infrastructure as Code (Terraform/Bicep) and CI/CD pipelines (GitHub Actions) to ensure all Azure resources are version-controlled and reproducible.
* **Self-Healing Systems:** Architect automated remediation workflows to handle common failure modes, reducing the need for human intervention during minor incidents.
* **FinOps & Governance:** Collaborate with FinOps to automate cost-optimization and enforce Azure Policies that prevent "drift" from security and compliance baselines.

**Experience you’ll bring**
* **Experience:** 5+ years in SRE, DevOps, or Cloud Engineering, with 2+ years leading technical teams in high-availability environments.
* **SRE Mindset:** Deep understanding of **SRE** principles (Error Budgets, Eliminating Toil, Observability).
* **Technical Depth:**
  + Expertise in **Terraform** and **GitOps** workflows.
  + Proficiency in **Python** or **Go** (for automation) and scripting (Bash/PowerShell).
  + Strong grasp of Azure-native monitoring (KQL, Prometheus/Grafana integration).
* **Soft Skills:** Ability to negotiate Error Budgets with product owners and translate technical debt into business risk. Experience mentoring and guiding engineers in areas such as…

