
Reliability Engineering Lead

Job in 500001, Hyderabad, Telangana, India
Listing for: Zyoin Group
Full Time position
Listed on 2026-03-03
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, SRE/Site Reliability, IT Support
Job Description & How to Apply Below

Location: Hyderabad
Work Mode: Hybrid (3 Days Office)
Experience: 12–18 Years
Notice Period: Immediate – 30 Days

Role Overview
We are looking for a seasoned Reliability Engineering Lead to drive reliability strategy, incident excellence, automation maturity, and observability across enterprise digital platforms. This role blends deep technical expertise with governance leadership and is ideal for someone who can translate reliability engineering into measurable business outcomes such as revenue impact, operational efficiency, and user safety.
You will act as the process owner for reliability frameworks, ensuring systems remain resilient, compliant, scalable, and optimized while enabling engineering velocity.

Key Responsibilities
1. Service Reliability & SLO Framework
Define and implement SLIs/SLOs aligned with business impact and operational requirements.
Drive SLO-based decision making for releases, prioritization, and incident response.
Establish error budget frameworks balancing feature velocity and system reliability.
Build reliability governance aligned with regulatory frameworks (GxP, SOX, etc.).
Translate technical metrics into business-level insights and executive reporting.
2. Incident Management & Learning Culture
Lead structured incident command processes for critical outages.
Facilitate blameless postmortems to improve systems and foster psychological safety.
Build and maintain incident learning repositories for organizational knowledge sharing.
Implement proactive monitoring systems to detect issues before user impact.
3. Automation & Toil Reduction
Keep operational toil below 50% of workload through automation initiatives.
Identify and eliminate repetitive tasks using cost-benefit prioritization.
Deliver engineering improvements each quarter that enhance performance and reliability.
Develop self-service documentation, runbooks, and automation tooling.
4. Platform Engineering & AI Reliability
Design reliability frameworks for AI/ML workloads and data pipelines.
Partner with platform teams to embed reliability into internal developer platforms (IDPs).
Support enterprise-scale agentic systems with reliability and compliance alignment.
Improve CI/CD reliability and infrastructure-as-code practices.
5. Observability & Performance Engineering
Implement full-stack observability across metrics, logs, traces, and business KPIs.
Conduct performance engineering, capacity planning, and bottleneck analysis.
Deploy intelligent monitoring systems with predictive alerting and root cause insights.
Enable cross-system monitoring across cloud, on-prem, and legacy environments.
6. Security & Compliance Alignment
Integrate reliability practices with DevSecOps and compliance frameworks.
Automate compliance checks, audit trails, and reporting.
Perform reliability impact assessments for regulated systems.
Design and validate disaster recovery strategies aligned with business and regulatory requirements.

Mandatory Qualifications
12–18 years of experience in SRE, platform engineering, or reliability engineering.
Proven experience designing enterprise-scale reliability frameworks.
Strong expertise in SLO/SLI design, observability platforms, incident management, and automation strategies.
Hands-on knowledge of distributed systems, cloud platforms, and infrastructure reliability.
Experience working within regulated environments or compliance-driven systems.
Strong stakeholder communication and leadership capabilities.

Why This Role
Strategic leadership opportunity with organization-wide impact.
Ownership of reliability strategy for mission-critical platforms.
High visibility with senior leadership and cross-functional teams.
Ability to influence platform architecture, delivery velocity, and engineering culture.