Senior Mainframe Site Reliability Engineer
Listed on 2026-02-28
IT/Tech
Systems Engineer, Cloud Computing, IT Support, SRE/Site Reliability
Nexora Mind LLC is a dedicated IT Staffing and Consulting company that connects businesses across the USA with top-tier technology talent. We specialize in supplying pre-vetted IT professionals for contract, contract-to-hire, and full-time positions, ensuring fast and efficient placement. With a strong recruiting network and expert market insights, Nexora Mind LLC delivers tailored staffing and consulting solutions. Our approach emphasizes building meaningful, long-term partnerships that support organizations in achieving their technology and business objectives.
Overview
Job Title: Senior Mainframe Site Reliability Engineer (SRE)
The client is okay with 3-4 weeks of remote work to let the candidate relocate before they join in Memphis.
Role Summary
The Mainframe SRE is responsible for ensuring the reliability, availability, performance, and scalability of enterprise mainframe platforms. This role blends traditional mainframe engineering with modern SRE principles, focusing on automation, observability, incident management, and continuous improvement. The lead will guide a team of engineers while partnering closely with application, infrastructure, and operations teams.
Key Responsibilities
- Lead the Mainframe SRE team, providing technical direction, mentoring, and performance guidance
- Own the reliability, availability, and resilience of mainframe environments (z/OS and related subsystems)
- Define and implement SRE practices such as SLIs, SLOs, SLAs, error budgets, and reliability metrics
- Drive automation to reduce manual operations, improve recovery time, and enhance system stability
- Oversee monitoring, alerting, and observability for mainframe systems using modern and legacy tools
- Lead incident management, root cause analysis (RCA), and post-incident reviews
- Partner with application development teams to improve reliability, performance, and deployment practices
- Plan and execute capacity management, performance tuning, and workload optimization
- Ensure compliance with security, regulatory, and audit requirements
- Lead disaster recovery (DR) planning, testing, and high-availability strategies
- Champion continuous improvement, DevOps, and SRE culture within mainframe operations
- 10+ years of experience in mainframe systems engineering or operations
- Strong hands-on expertise with IBM z/OS
- Experience with core mainframe components such as:
  - CICS, IMS, DB2
  - JES2/JES3
  - MQ, SMF, SDSF
- Solid understanding of mainframe performance tuning and capacity planning
- Experience leading production support and managing major incidents
- Strong scripting and automation skills (REXX, JCL, CLIST, Python, or equivalent)
- Familiarity with monitoring and scheduling tools (e.g., OMEGAMON, CA/BMC tools, Control-M)
- Experience applying SRE principles in a mainframe or hybrid (mainframe + distributed) environment
- Exposure to DevOps, CI/CD, and automation frameworks
- Knowledge of Linux on Z and cloud integration patterns
- Experience with resilience engineering, chaos testing, or fault injection concepts
- Prior people-lead or technical-lead experience