Infrastructure Reliability Engineering, Senior Manager
Job in London, Greater London, W1B, England, UK
Listed on 2026-02-08
Listing for:
London Metal Exchange Limited
Full Time position
Job specializations:
- IT/Tech: Systems Engineer, Cybersecurity
- Engineering: Systems Engineer, Cybersecurity
Job Description & How to Apply Below
**Overall Purpose of Role**
The role is accountable for the Infrastructure Reliability Engineering (IRE) function, embedding reliability engineering as a core discipline across the technology lifecycle, from design through live operation. It provides senior leadership across Infrastructure Reliability Engineering and is accountable for the resilience, availability, and operational readiness of the LME Group technology estate. The role leads the design and delivery of complex infrastructure transformation, platform modernisation, and re-architecture initiatives, ensuring secure, compliant, and highly reliable services that support trading-critical operations and regulatory obligations.
**Responsibilities:**
- Establish, mature, and continuously evolve the Infrastructure Reliability Engineering function, defining the IRE operating model, engagement patterns, and service boundaries across infrastructure, architecture, operations, security, and application teams.
- Set, maintain, and enforce consistent reliability engineering standards, patterns, and tooling across the infrastructure estate, balancing resilience, regulatory assurance, and operational efficiency.
- Act as the senior Infrastructure Reliability Engineering SME across major programmes end-to-end (discovery, dependency mapping, design, planning, build, cutover, fall-back), with direct accountability for service stability and risk reduction for trading-critical platforms.
- Act as the accountable owner for Infrastructure Operational Readiness, ensuring platforms and services do not transition into live operation without meeting mandated readiness, observability, recoverability, and supportability criteria.
- Define and embed a consistent reliability measurement framework across infrastructure platforms, including service level indicators, objectives, and leading indicators of operational risk, enabling data-driven prioritisation and informed investment decisions.
- Build, lead, and develop a high-performing Infrastructure Reliability Engineering team, defining clear role expectations, capability standards, and development pathways.
- Foster a culture of engineering excellence, shared ownership, and continuous improvement, ensuring operational knowledge and resilience capability are institutionalised and not dependent on individuals.
- Act as a senior authority on infrastructure resilience and operational risk, influencing strategic decisions, architectural direction, and investment priorities to ensure reliability is designed in, not retrofitted.
- Own measurable infrastructure reliability outcomes, including availability, resilience, recovery performance, and operational risk reduction, with regular executive-level reporting against agreed targets.
- Define and drive the LME Infrastructure Reliability posture, including fault tolerance, redundancy, capacity planning, disaster recovery, and failover strategies across on-prem and hybrid environments.
- Ensure infrastructure platforms meet security and compliance requirements (e.g. CIS, ISO 27001, NIST), covering identity and access management, encryption, auditability, and regulatory evidence.
**Academic and Professional Qualifications Required:**
- Demonstrable track record of continuous professional development in infrastructure, solutions engineering, or technology transformation.
**Required Knowledge and Level of Experience:**
- 10+ years of experience leading large-scale Infrastructure or Reliability Engineering functions, with demonstrable accountability for the availability, resilience, and operational performance of mission-critical systems.
- Proven experience establishing, scaling, or materially maturing an Infrastructure Reliability, Platform Reliability, or equivalent function within a complex, regulated, or high-availability environment.
- Significant experience operating in regulated or high-assurance environments (e.g. financial services, exchanges, clearing, or equivalent).
- Experience influencing senior leadership and steering complex transformation initiatives across multiple technology domains.
- Significant experience leading or assuring large-scale, enterprise Linux estates (e.g. RHEL-based), including responsibility…
Position Requirements
10+ years of work experience