
Lead - Capacity & Automation (SRE)

Job in Manchester, Greater Manchester, M9, England, UK
Listing for: BT Group
Full Time position
Listed on 2026-01-10
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Position: Lead - Capacity & Automation (SRE)

Overview

  • Own the Private Cloud "EC.3" Capacity Management Platform - Act as the single accountable owner for capacity planning, forecasting, modelling, and optimisation across the VMware-based Enterprise Cloud v3 environment.
  • Define and Deliver the Capacity Roadmap - Translate business demand and programme milestones into a prioritised backlog of features and automation, using Agile delivery practices.
  • Implement SRE Guardrails - Establish SLIs, SLOs, and error budgets for infrastructure-related reliability; ensure proactive risk management.
  • Develop Forecasting Models - Build accurate short-, medium-, and long-term capacity forecasts using telemetry and scenario analysis to prevent saturation and ensure headroom.
  • Automate Capacity Workflows - Create scripts, policies, and integrations for rightsizing, placement, and quota enforcement using PowerCLI, APIs, and IaC.
  • Maintain Real-Time Telemetry & Dashboards - Provide a single source of truth for utilisation, trends, and optimisation opportunities through VMware Aria Operations (vROps) and reporting tools.
  • Optimise Cost and Efficiency - Align with FinOps principles to deliver showback/chargeback reporting, identify waste, and implement cost-saving measures without compromising reliability.
  • Integrate with ITSM & Governance - Ensure ServiceNow CMDB accuracy, automate request fulfilment, and maintain compliance with capacity policies and audit requirements.
  • Collaborate Across Teams - Work closely with Architecture, Programme Delivery, Finance, and Operations to align capacity decisions with strategic objectives and risk appetite.
  • Continuously Improve - Evolve the capacity management capability through iterative enhancements, stakeholder feedback, and adoption of emerging best practices.

Leadership Accountabilities
  • Vision & Strategy - Define and communicate the long-term vision for capacity management on EC.3, ensuring alignment with business objectives and technology strategy.
  • Ownership & Accountability - Act as the single point of accountability for capacity planning, forecasting, and optimisation across the VMware platform.
  • Influence & Stakeholder Engagement - Build strong relationships with senior stakeholders, programme leads, and cross-functional teams to drive decisions and secure buy-in.
  • Agile Leadership - Champion Agile ways of working, ensuring backlog prioritisation, iterative delivery, and continuous improvement of the capacity capability.
  • Reliability Governance - Embed SRE principles into leadership decisions, balancing innovation with risk management through SLIs, SLOs, and error budgets.
  • Financial Stewardship - Lead cost optimisation initiatives aligned with FinOps principles, ensuring efficient use of resources and transparent reporting.
  • Team Enablement - Mentor and guide engineers and analysts, fostering a culture of automation, data-driven decision-making, and operational excellence.
  • Change Leadership - Drive adoption of new processes, tools, and automation across teams, ensuring smooth transitions and minimal disruption.
  • Executive Communication - Provide clear, concise updates on capacity health, risks, and roadmap progress to senior leadership and governance boards.
  • Continuous Improvement - Lead retrospectives and postmortems to identify systemic improvements and embed lessons learned into future planning.
Key Decisions
  • Capacity Headroom Policy - Define minimum thresholds for CPU, memory, and storage across clusters to ensure reliability and performance.
  • Forecasting Approach - Select and implement the models and tools used for short-, medium-, and long-term capacity planning.
  • Automation Priorities - Decide which manual processes to automate first (e.g., rightsizing, placement, quota enforcement) to reduce toil and improve efficiency.
  • SLO & Error Budget Targets - Set reliability objectives for capacity-related metrics and determine acceptable risk levels for change management.
  • Optimisation Strategy - Choose cost-saving measures (e.g., rightsizing, decommissioning, reserved capacity) while balancing performance and resilience.
  • Tooling & Integration Choices - Determine which platforms (e.g., VMware Aria Operations, ServiceNow, Power BI) and scripts will form the core of the capacity management capability.
  • Governance & Compliance Controls - Establish policies for capacity requests, approvals, and audit readiness.
  • Reporting & Communication Cadence - Decide how often and in what format capacity health, risks, and forecasts are shared with stakeholders.
  • Change Freeze & Risk Mitigation - Make calls on when to pause non-essential changes based on capacity risk or error budget breaches.
  • Continuous Improvement Roadmap - Prioritise enhancements to forecasting accuracy, automation coverage, and stakeholder experience.
Experience you’d be expected to have
  • Proven track record in capacity management for large-scale VMware environments (vSphere, vCenter, vSAN, NSX-T).
  • Hands-on experience with VMware Aria Operations (vROps) or similar tools for capacity analytics, forecasting, and…