Senior Engineer SRE Incident Response; NOC
Listed on 2025-12-28
IT/Tech
Cloud Computing, Systems Engineer
Senior Engineer SRE Incident Response (NOC) at GEICO
Base Pay Range: $/yr - $/yr
Position Summary
GEICO is seeking an experienced Software Engineer who is passionate about building high-performance, maintainable, and resilient platforms and applications. This role is integral to our ongoing transformation from a traditional IT model to an engineering-driven organization that emphasizes reliability, scalability, and automation.
Position Description
Site Reliability Engineering (SRE) blends software engineering and systems administration to design, develop, and manage large-scale, highly available, fault-tolerant systems. SRE ensures that GEICO’s services, both internal and customer-facing, meet reliability, uptime, and performance standards while enabling rapid iteration and continuous improvement.
As an SRE at GEICO, you will tackle the unique challenges of operating at scale, leveraging expertise in coding and large‑scale system design. You will also participate in on‑call rotations, providing incident response, troubleshooting, and post‑mortem analysis to improve system reliability and minimize operational impact.
At GEICO, we foster a culture of collaboration, continuous learning, and technical excellence. We value diversity, problem‑solving, and risk‑taking in a blame‑free environment, empowering engineers to innovate while receiving mentorship and support.
Position Responsibilities
- Lead technical initiatives across multiple teams, providing strategic and technical guidance.
- Use programming languages such as Go, Python, and Java, and work with SQL/NoSQL databases.
- Work with container orchestration tools such as Docker, Kubernetes, and OpenStack.
- Architect and develop cloud‑native applications using Azure services.
- Collaborate with product managers, engineering teams, and stakeholders to solve complex challenges.
- Ensure the quality, performance, and usability of engineering solutions.
- Serve as a mentor and thought leader, coaching engineers and influencing executives.
- Continuously improve processes, adopt best practices, and drive operational efficiency.
- Participate in on‑call rotations, responding to incidents, diagnosing production issues, and conducting post‑incident reviews to improve system reliability.
Qualifications
- Expertise in at least two modern programming languages (Go, Python, Java, C, C++) and object‑oriented design.
- Strong ownership and accountability with excellent communication and collaboration skills.
- Hands‑on experience in incident response, troubleshooting, and root cause analysis.
- Experience managing distributed systems in public, private, or hybrid cloud environments.
- Experience with monitoring, logging, and observability tools (Prometheus, Grafana, OpenTelemetry, Loki).
- Passion for automation and reducing manual operations using tools like Terraform and Ansible.
- Familiarity with configuration management and orchestration tools (Helm, Puppet, Spinnaker).
- Experience with CI/CD pipelines, Infrastructure as Code (IaC), and cloud‑based deployments.
- Ability to operate in a fast‑paced, high‑scale environment with a problem‑solving mindset.
- 3+ years of professional experience in software development, platform architecture, and infrastructure management.
- 3+ years of experience as either an SRE or DevOps team member.
- 3+ years of experience with AWS, GCP, Azure, or hybrid cloud environments.
- 3+ years of experience with open‑source frameworks.
- 3+ years of experience with system architecture and design.
- 3+ years of experience participating in an on‑call rotation.
- Bachelor’s degree in Computer Science, Information Systems, or equivalent work experience.
$ - $
The above annual salary range is a general guideline. Multiple factors are considered in arriving at the final hourly rate/annual salary offered to the selected candidate. These factors include, but are not limited to, the scope and responsibilities of the role; the selected candidate’s work experience, education, and training; the work location; and market and business considerations.
GEICO Pledge
Great Company: At GEICO, we help our customers through life’s twists and turns. Our mission is to protect…