SRE Engineer (Azure)
Job in London, Greater London, EC1A, England, UK
Listed on 2026-01-14
Listing for:
QONSULT SYSTEMS PTE. LTD.
Full-time position
Job specializations:
- IT/Tech: Systems Engineer, Cloud Computing, SRE/Site Reliability
Location: Greater London
SRE Engineer (Azure)
We are looking for an Azure Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our cloud platforms. The SRE will architect, implement, and operate highly available systems with a strong emphasis on automation, observability, and security best practices.
The candidate will work closely with engineering and project teams to ensure our Azure services meet organizational objectives for performance, resilience, and cost-efficiency.
Responsibilities
- Demonstrate expertise in cloud reliability engineering, high-availability patterns, observability frameworks, and automation with a security-first mindset.
- Design, implement, and maintain SLOs, SLIs, monitoring dashboards, and automated alerting mechanisms across Azure services.
- Ensure reliability of mission-critical systems by implementing autoscaling, redundancy, failover, and resilient architectures.
- Develop automation using Terraform/Bicep, PowerShell, and Python to reduce operational toil and improve system reliability.
- Collaborate with engineering teams to support secure, reliable CI/CD pipelines and deployment processes.
- Conduct root cause analysis (RCA), implement corrective actions, and lead continuous improvement of reliability processes.
- Continuously monitor Azure resources and optimize performance, cost, and operational health based on best practices.
- Ensure all deployed workloads comply with cloud security baselines, network boundary controls, and governance frameworks (e.g., IM8, CIS, NIST).
- Improve infrastructure readiness through chaos engineering, failover tests, and resilience validation.
- Prepare operational runbooks, architecture documents, and technical guides for cloud reliability operations.
- Support Agile workflows and collaborate across teams to integrate operational excellence into the development lifecycle.
Requirements
- Bachelor’s degree in Computer/Information Science or equivalent.
- 4+ years of experience in a cloud reliability/SRE role with an emphasis on Azure.
- Strong understanding of Azure Monitor, Log Analytics, App Insights, AKS, VNets, Load Balancers, and HA designs.
- Hands‑on experience with IaC tools such as Terraform, Bicep, or ARM templates.
- Strong scripting capabilities (PowerShell/Python).
- Experience with CI/CD pipelines (GitHub Actions, Azure DevOps).
- Solid understanding of cloud security controls, compliance frameworks, and incident management.
- Exceptional troubleshooting and problem‑solving skills.
- Incident & Problem Management
- Configuration & Change Management
- Observability and Reliability Engineering
- Strong communication & stakeholder engagement
- Ability to work effectively across technical teams