Location: Hyderabad
Responsibilities
Client Ownership & Service Delivery
Act as the primary L3 technical lead for the client and ensure end-to-end delivery of Azure, GCP, and Windows managed services.
Own SLA adherence, service health, operational KPIs, and incident/problem outcomes across both clouds.
Lead client governance: daily/weekly status calls, monthly service reviews and reporting, and continual service improvement plans.
Understand client expectations, translate them into actionable technical deliverables (roadmap/backlog), and ensure timely execution with measurable outcomes.
Ensure effective stakeholder management: clear communication, expectation alignment, risk visibility, and proactive advisories to improve customer confidence.
Technical Leadership (Azure + GCP + Windows)
Provide L3 escalation support for complex Azure, GCP, and Windows issues across compute and platform services.
Drive design, implementation, and optimization of:
Azure (IaaS/PaaS)
Azure VMs, Scale Sets, Storage (Blob/Files/Disks), VNets, Load Balancer/App Gateway, NSG/ASG, Private Endpoints, DNS.
Backup/DR (Azure Backup, ASR), Monitoring (Azure Monitor, Log Analytics, Alerts), Automation (Runbooks/Functions), Governance (Policy/Blueprints where applicable).
GCP (IaaS/PaaS)
Compute Engine, Instance Groups, Cloud Storage, VPC, Cloud VPN/Interconnect, Load Balancing, Cloud DNS, Firewall rules, Private Service Connect (where applicable).
IAM, Organization policies, Resource hierarchy (Org/Folders/Projects), service accounts, key management (Cloud KMS), Secret Manager.
Observability & operations:
Cloud Monitoring, Cloud Logging, alert policies, log-based metrics, uptime checks; integration with a SIEM (e.g., Splunk) when required.
Windows / Microsoft Platform
Windows Server administration: AD DS, DNS, GPO, CA/certificates (as applicable), server builds, migration to newer OS versions, IIS, patching, performance tuning, hardening, troubleshooting.
Security baselines and vulnerability remediation support (OS and platform-level).
Perform deep root cause analysis (RCA) for major incidents and ensure preventive controls are implemented (monitoring, automation, configuration hardening).
Lead platform improvements: standardization, automation, hardening, monitoring improvements, reliability enhancements, and cross-cloud best practices.
Team Management & Leadership
Manage engineers (L1/L2/L3 as applicable) under the client account across Azure + GCP + Windows scope: task allocation, mentoring, quality reviews, and capability building.
Ensure shift/roster coverage, on-call effectiveness, timely ticket handling, and consistent quality output.
Drive operational discipline: runbooks, SOPs, KB articles, handovers, cross-training, and continuous skill uplift (cloud + Windows + security).
Coach the team on incident handling, change hygiene, documentation standards, and customer communication etiquette.
Incident, Problem, Change & Risk Management
Lead Major Incident Management (bridge calls, triage, comms, workaround, resolution, post-incident review) across Azure/GCP/Windows.
Own Problem Management: trend analysis, recurring incidents, permanent fixes, Known Errors, and preventive action tracking.
Ensure effective Change Management: risk assessment, CAB support, rollout/backout plans, pre/post validation, and change documentation for both clouds and OS changes.
Identify risks proactively and implement mitigation actions:
Availability & resilience (multi-zone/region strategy where applicable)
Security & compliance (policy enforcement, IAM hygiene, key/secret controls)
Performance & capacity (scale patterns, bottleneck fixes)
Operational risk (single points of failure, process gaps, skill gaps)
Consultation & Advisory
Provide consultative support to client stakeholders:
Architecture guidance, best practices, migration/modernization suggestions across Azure and GCP
Security controls, governance, compliance alignment (RBAC/IAM, policies, logging, encryption)
Backup/DR strategy, resilience recommendations, RTO/RPO alignment
Participate in technical discussions, solution workshops, and roadmap planning.
Propose and drive improvements: automation opportunities, tooling enhancements, operational maturity uplift, and reliability engineering practices.
Documentation & Process Excellence
Maintain and continuously improve:
Architecture diagrams (Azure + GCP + Windows)
SOPs/runbooks, escalation matrix, support model, service catalog, operational reports (weekly/monthly)
Configuration standards, hardening checklists, monitoring & alerting catalog, backup/DR runbooks
Drive continuous improvement initiatives:
Automation (scripts, runbooks, functions, pipelines)
Monitoring enhancements (signal-to-noise tuning, SLO-based alerting)
Standard configurations and policy enforcement (Azure Policy / GCP Org Policies)
Audit readiness support (access reviews, logging evidence, compliance reporting)