Senior Site Reliability Engineer Job, Plano area, Texas, USA, IT/Tech

This range is provided by Optomi. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

$60.00/hr - $72.00/hr

Site Reliability Engineer – Applications & Domains (Sr. / Lead)

Optomi is seeking a Site Reliability Engineer (SRE) to join a newly formed SRE organization supporting application domains. This role will focus on improving application reliability, performance, availability, and observability across critical platforms. The ideal candidate brings strong automation skills, cloud-native experience, and the ability to partner closely with development, infrastructure, and operations teams to embed SRE best practices across the application lifecycle.

This opportunity requires four days onsite per week in Plano, TX.

What the Right Candidate Will Enjoy!

Building a new SRE function from the ground up within a large enterprise environment!
Driving application reliability and performance across mission-critical domains!
Partnering closely with application development, platform, and operations teams!
Designing and implementing automation to reduce toil and improve system health!
Working with modern CI/CD pipelines and cloud-native infrastructure!
Leading major incident response and driving long-term reliability improvements!
Influencing architecture decisions with a focus on scalability and resilience!

Experience of the Right Candidate:

6-7 years of experience for Senior SRE (JL17) or 8-10 years for Lead SRE (JL18).
Strong experience, regardless of job title, in Site Reliability Engineering, DevOps, or Production Engineering.
Hands‑on experience with CI/CD pipelines using GitHub and Harness.
Strong experience building infrastructure as code using Terraform (AWS preferred).
Proficiency in Python scripting for automation, tooling, and operational runbooks.
Experience managing container orchestration platforms such as Kubernetes / EKS.
Deep understanding of cloud platforms (AWS preferred; GCP or Azure acceptable).
Experience designing and improving observability using tools such as Dynatrace and CloudWatch.
Strong knowledge of SRE concepts including SLOs, SLAs, error budgets, and reliability metrics.
Excellent communication skills with the ability to collaborate across technical and business teams.

Responsibilities of the Right Candidate:

Design and build automation to streamline operations and reduce manual effort.
Partner with observability engineers to provide actionable insights into system health and performance.
Ensure scalable, repeatable application deployments using CI/CD and infrastructure as code.
Develop automation and operational tooling primarily using Python.
Manage and support containerized application environments and cloud-native services.
Define, implement, and track SLOs, SLAs, and error budgets aligned to business objectives.
Design and implement monitoring and observability enhancements using Dynatrace and CloudWatch.
Lead major incident response efforts and coordinate resolution with key stakeholders.
Conduct blameless post‑incident reviews and drive corrective and preventative actions.
Collaborate cross‑functionally to embed SRE principles into application design and delivery.
Participate in architecture reviews with a focus on reliability, scalability, and resilience.

Preferred Qualifications:

AWS certifications (DevOps Engineer, Solutions Architect, or similar).
Experience with GitOps practices, secrets management, and infrastructure monitoring best practices.
Experience building self‑healing systems and automated remediation workflows.

Seniority level

Mid‑Senior level

Employment type

Full‑time

Job function

IT Services and IT Consulting
