Senior Manager SRE Cloud Operations Job Annapolis area, Maryland USA, IT/Tech

Oracle Cloud Infrastructure (OCI) is seeking an accomplished Senior Manager of Software Development with a strong background in both software engineering and cloud operations. In this role, you will lead a high-performing team of Site Reliability Engineers (SREs) and DevOps engineers responsible for designing, building, and operating highly available, scalable, and resilient automation and tooling for cloud service operations. You will be accountable not just for automation solutions, but also for the 12x7 operational health, performance, and efficiency of your service operations.

You will enable world-class customer experiences by setting operational standards, ensuring rapid detection and resolution of incidents, and continually driving operational excellence, automation, and efficiency. You will partner closely with Service (Product) and Support teams to deliver new solutions at scale, ensuring that robust monitoring, alerting, and operational runbooks are in place.

Minimum Qualifications

Bachelor's or master's degree in computer science, engineering, or a relevant field, or equivalent experience.
3+ years of technical or people management experience in cloud or SRE organizations.
10+ years of experience in software engineering, site reliability engineering, or IT operations for large-scale, distributed, multi-tenant services.
Demonstrated ownership of 24x7 operational services, including monitoring, incident response, and continuous improvement.
Knowledge of at least one major programming language (Java, C, C++, Python) and of operational scripting.
Solid grasp of distributed systems, networking, operating systems, and security fundamentals.
Experience with automation, deployment pipelines, service telemetry, and operational dashboards.
Strong communication and stakeholder management skills.

Preferred Qualifications

7+ years of experience operating and supporting cloud infrastructure or large SaaS environments.
Deep hands-on experience with operational tools, runbook development, and incident management frameworks.
Experience with cost management and operational efficiency at scale.
Familiarity with container orchestration, configuration management, and infrastructure-as-code.
Experience building and scaling geographically distributed teams, and managing complex on-call schedules.
