Enterprise Production Stability & Resiliency Lead; Mainframe Focus
Listed on 2026-03-08
IT/Tech
Systems Engineer, Cloud Computing, IT Project Manager, IT Support
Job Title
- Enterprise Production Stability & Resiliency Lead (Mainframe Focus)
Project
Location: Remote (EST zone)
Duration: 6+ months contract
Visa: USC/GC/GCEAD/H1B
Position Summary
We are seeking an experienced Enterprise Production Stability & Resiliency Lead to support cross‑technology production environments across cloud, mainframe, and distributed systems. This role is not hands‑on development and does not own application code. Instead, it functions as a center‑of‑excellence partner to technology teams, ensuring enterprise production stability, proactive resiliency practices, and continuous improvement. The ideal candidate brings deep enterprise production support experience, strong mainframe expertise (z/OS, z/OS Connect, MQ, etc.), and the ability to lead root cause investigations and systemic improvements across multiple technology domains. This role will support stability initiatives across multiple portfolios and technologies, including legacy and modern platforms, with particular emphasis on mainframe environments.
- Production Incident & Stability Management:
Serve as a senior stability partner when production issues occur. Quickly assess enterprise‑level production incidents across technologies. Facilitate and/or lead Root Cause Analysis (RCA) sessions. Ensure systemic fixes are implemented to prevent recurrence. Identify broader resiliency gaps beyond the immediate incident.
- Enterprise Resiliency & Continuous Improvement:
Review production incidents to identify patterns and systemic risk. Drive preventive improvements across multiple technology teams. Establish and enforce stability and resiliency standards. Support Medicaid and other enterprise programs with stability initiatives.
- Mainframe Stability Leadership:
Provide subject matter expertise for mainframe environments including: z/OS, z/OS Connect, IBM MQ Enterprise integrations. Bridge knowledge gaps where internal mainframe expertise is limited. Partner with development and infrastructure teams to strengthen mainframe resiliency practices.
- Monitoring & Operational Governance:
Ensure all applications have effective monitoring coverage, documented runbooks/RTO playbooks, up‑to‑date application handbooks, certificate lifecycle management and renewals. Identify and close operational gaps before incidents occur. Validate that operational excellence standards are met across portfolios.
- Cross‑Team Collaboration:
Act as a stability center‑of‑excellence partner (not code owner). Collaborate with cloud, mainframe, infrastructure, and application teams. Influence teams toward proactive production readiness practices. Provide leadership during high‑severity production events.
- 10+ years of enterprise IT experience with strong production support background.
- Extensive experience supporting enterprise‑scale production environments.
- Strong mainframe experience including: z/OS, z/OS Connect, IBM MQ.
- Proven experience leading Root Cause Analysis (RCA) efforts.
- Experience implementing systemic production stability improvements.
- Strong understanding of enterprise monitoring frameworks.
- Experience supporting highly regulated environments (healthcare preferred).
- Ability to operate across multiple technology stacks (cloud + legacy).
Experience working within the Aetna or similar healthcare enterprise landscape.
Experience supporting Medicaid programs.
Familiarity with enterprise certificate management processes.
Experience establishing operational governance standards.
Background in cloud production support (AWS/Azure preferred).