Availability & Capacity Management Specialist
Listed on 2026-01-23
IT/Tech
Systems Engineer, IT Support, Cloud Computing, IT Project Manager
ROLE PURPOSE
The Availability & Capacity Management Specialist is responsible for the end-to-end operational execution and continual improvement of Availability and Capacity Management practices across Nexio’s and its clients’ ICT services and infrastructure. The role ensures that IT services and supporting infrastructure are available, performant, and adequately resourced to meet current and forecasted business requirements, in line with agreed Service Level Agreements (SLAs) and ITIL best practices.
The role operates within the Service Assurance function and works closely with Service Management, Technical Operations, Architecture, and Suppliers to proactively manage risk, forecast demand, and optimise service performance and resilience.
GENERAL ACCOUNTABILITIES
- Maintain end-to-end accountability for Availability and Capacity Management processes within the Service Operations environment.
- Ensure IT services meet agreed availability and capacity targets as defined in SLAs and OLAs.
- Maintain effective visibility and reporting to Service Assurance and senior stakeholders.
- Understand customer business priorities, service dependencies, and criticality.
- Apply ITIL-aligned best practices, policies, and procedures consistently.
- Influence stakeholders and suppliers where direct authority does not exist.
- Act as a subject matter specialist for Availability and Capacity Management.
AVAILABILITY MANAGEMENT RESPONSIBILITIES
- Ensure existing services deliver agreed availability levels as defined in SLAs.
- Validate availability requirements during the design and introduction of new or changed services.
- Assist in investigating and diagnosing incidents and problems impacting service availability.
- Contribute to infrastructure and solution design by specifying availability and recovery requirements.
- Define monitoring requirements for automated event and availability management systems.
- Specify reliability, maintainability, and serviceability requirements for supplier-provided components.
- Monitor and report actual service availability against SLA targets.
- Proactively identify opportunities to improve service availability and infrastructure resilience.
- Develop, maintain, and execute an Availability Plan aligned to business needs.
- Perform regular reviews and audits of the Availability Management process.
- Define recovery and resilience design criteria for infrastructure and services.
- Support cost justification for availability-related investments in collaboration with Financial Management.
- Maintain and execute availability and resilience testing schedules.
- Support risk assessment and mitigation activities with Security and IT Service Continuity Management.
- Attend CAB meetings to assess and advise on the availability impact of RFCs.
- Act as the escalation point for availability-related issues.
CAPACITY MANAGEMENT RESPONSIBILITIES
- Ensure sufficient IT capacity is available to meet current and future service requirements.
- Advise senior IT stakeholders on capacity optimisation and demand matching.
- Identify capacity requirements through engagement with Service Level Management and business stakeholders.
- Maintain understanding of current infrastructure utilisation and maximum capacity thresholds.
- Perform capacity sizing for new or changed services using modelling and forecasting techniques.
- Forecast future capacity requirements based on business plans, growth trends, and usage data.
- Develop and maintain a Capacity Plan aligned with the organisation’s business planning cycle.
- Ensure appropriate monitoring of system performance and resource utilisation.
- Analyse capacity and performance data and report against SLA and performance targets.
- Raise incidents and problems when capacity thresholds are breached and support root cause analysis.
- Identify and initiate tuning and optimisation activities to improve capacity and performance.
KEY PERFORMANCE INDICATORS (KPIs)
- SLA compliance for availability and performance
- Accuracy and effectiveness of availability and capacity forecasting
- Reduction in availability and capacity-related incidents
- Quality and timeliness of reporting and analysis
- Effectiveness of risk identification and mitigation
- Stakeholder satisfaction…