Site Reliability Engineer

Job in Bengaluru, 560001, Bangalore, Karnataka, India
Listing for: TecQubes Technologies
Full Time position
Listed on 2026-02-04
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, SRE/Site Reliability
Job Description & How to Apply Below
Location: Bengaluru

Company Description

TecQubes Technologies is a global company dedicated to streamlining business operations and delivering fast, output-driven results. With a dynamic research team, we offer advanced solutions designed to move businesses toward lasting success. We bring innovative technology products and services to our clients. Leveraging strong research and market insights, we consistently deliver the best results.

Role Description

We are seeking a highly experienced Site Reliability Engineer (SRE) with 10+ years of experience in designing, implementing, and maintaining highly available, scalable, and resilient systems. The ideal candidate will have deep expertise in AWS, Kubernetes, Elasticsearch, Grafana, and modern SRE practices, with a strong focus on automation, observability, and operational excellence.

Qualifications

10+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
Strong hands-on experience with AWS services (EC2, EKS, S3, RDS, IAM, VPC, CloudWatch, Auto Scaling).
Advanced expertise in Kubernetes (EKS preferred), Helm, and container orchestration.
Deep knowledge of Elasticsearch (cluster management, indexing, search optimization, performance tuning).
Strong experience with Grafana and observability stacks (Prometheus, Loki, ELK).
Proficiency in Linux system administration and networking fundamentals.
Experience with Infrastructure as Code tools (Terraform, CloudFormation).
Strong scripting skills in Python, Bash, or Go.

Key Responsibilities

Design, build, and operate highly reliable, scalable, and fault-tolerant systems in AWS cloud environments.
Implement and manage Kubernetes (EKS) clusters, including deployment strategies, scaling, upgrades, and security hardening.
Own and improve SLIs, SLOs, and SLAs, driving reliability through data-driven decisions.
Architect and maintain observability platforms using Grafana, Prometheus, and Elasticsearch.
Manage and optimize Elasticsearch clusters, including indexing strategies, performance tuning, scaling, and backup/restore.
Develop and maintain monitoring, alerting, and logging solutions to ensure proactive incident detection and response.
Lead incident management, root cause analysis (RCA), postmortems, and continuous improvement initiatives.
Automate infrastructure and operations using Infrastructure as Code (IaC) and scripting.
Collaborate with development teams to improve system reliability, deployment pipelines, and release processes.
Implement CI/CD best practices and reduce deployment risk through canary, blue-green, and rolling deployments.
Ensure security, compliance, and cost optimization across cloud infrastructure.
Mentor junior SREs and drive adoption of SRE best practices across teams.