Lead Site Reliability Engineer/Domains
Listed on 2026-01-12
IT/Tech
Systems Engineer, Cloud Computing, SRE/Site Reliability, IT Support
Lead Site Reliability Engineer - Applications/Domains
Who we are
Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world’s most admired brands, Toyota is growing and leading the future of mobility through innovative, high‑quality solutions designed to enhance lives and delight those we serve. We’re looking for talented team members who want to Dream. Do. Grow. with us.
Toyota Financial Services (TFS) is a separate business entity dedicated to delivering customer experience solutions for Toyota and Lexus in North America. TFS is part of this world‑changing company, and your role will help create a best‑in‑class customer experience in an innovative, collaborative environment. To save you time when applying, please note that Toyota does not offer sponsorship of job applicants for employment‑based visas or any other work authorization for this position at this time.
Who We’re Looking For
We are building a new Site Reliability Engineering (SRE) team for Domain Applications, and we are seeking a Lead SRE to ensure the reliability, performance, and availability of the applications within each domain. As a Lead SRE for applications, you will work with development engineers, product owners, SRE Infrastructure, production engineers, and Technology Operations Center personnel, with a primary focus on improving observability, automation, overall system health, reliability, and uptime.
What You’ll Be Doing
- Design, code, and maintain automation to streamline operations, reduce manual tasks, and improve system efficiency to enable a robust application environment.
- Work with observability engineers to enable actionable insights into application and infrastructure health and performance; foster a collaborative team culture and support professional development.
- Ensure scalable, repeatable code deployments through CI/CD pipelines using GitHub and Harness, and repeatable infrastructure deployments through infrastructure as code (IaC) using Terraform.
- Build automation and operational runbooks primarily using Python scripting.
- Manage container orchestration platforms and related cloud‑native services.
- Drive reliability improvements through Service Level Objectives (SLOs), error budgets and Service Level Agreements (SLAs) aligned with business goals.
- Design and implement observability improvements using Dynatrace and CloudWatch.
- Lead major incident response, coordinate with stakeholders through resolution, and drive problem management to prevent recurrence.
- Conduct blameless post‑incident reviews and drive continuous improvement.
- Collaborate cross‑functionally to embed SRE principles into application design and operation so that applications meet their reliability goals.
- Participate in architectural reviews, providing input on reliability and scalability.
What You Bring
- Experience with DevOps tools such as GitHub, Harness, and Dynatrace.
- Experience building self‑healing systems and automated remediation workflows.
- 5+ years of experience in Site Reliability Engineering, DevOps, or a related field.
- Proven track record of achieving high system reliability and performance.
- Strong experience with Terraform for AWS IaC.
- Proficient in scripting and automation with Python and familiar with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack).
- Deep knowledge of container orchestration (Kubernetes/EKS).
- Deep understanding of cloud platforms (AWS, GCP, Azure) and container orchestration technologies.
- Effective communication skills, with the ability to convey complex technical concepts to diverse audiences.
Added Bonus If You Have
- AWS certifications (DevOps Engineer, Solutions Architect, etc.).
- Familiarity with GitOps, secrets management, and infrastructure monitoring best practices.
What We’ll Bring
- A work environment built on teamwork, flexibility, and respect
- Professional growth and development programs to help advance your career, as well as tuition reimbursement
- Team Member Vehicle Purchase Discount
- Toyota Team Member Lease Vehicle Program (if applicable)
- Comprehensive health care and wellness plans for your entire family
- Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota…