Site Reliability Engineer

Remote / Online - Candidates ideally in
Colorado, USA
Listing for: Thinkific Labs Inc.
Remote/Work from Home position
Listed on 2026-02-28
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability, IT Support
Salary/Wage Range or Industry Benchmark: 132,900 - 182,900 USD yearly
Job Description & How to Apply Below
Position: Staff Site Reliability Engineer

Thinkific is a learning commerce platform that helps learning businesses turn knowledge into impact. By bringing together community, courses, and content with commerce, we power transformative learning experiences that help businesses grow their revenue—and reach millions of learners around the world.

We’re a team of 300+ Thinkers building products that matter. Every role at Thinkific contributes to raising the bar for online learning, supporting learning businesses, and creating real-world impact. You’ll work alongside curious, collaborative teammates who care deeply about what they build and who they build it for.

We’re committed to a fair, inclusive, and human hiring experience. Our team is here to guide you every step of the way, so you always know what to expect!

Are you an experienced Site Reliability Engineer looking for a new challenge? We’re looking for a Staff Site Reliability Engineer to join us at Thinkific.

As a Staff Site Reliability Engineer, you will help us scale and secure the infrastructure that powers thousands of online course creators around the world.

In this role, you’ll be central to improving the performance, reliability, and security of our platform. You’ll work cross-functionally with engineers, product managers, and stakeholders to drive forward reliability-focused initiatives, build scalable systems, and mentor others. You’ll also help shape our technical strategy, lead major infrastructure projects, and act as a domain expert in modern cloud-native practices, with a specific emphasis on Kubernetes, cloud infrastructure (AWS), observability, and service reliability.

Your goal will be to help guide and execute on projects related to your technical domain. Here’s how you’ll accomplish this:

  • Own one or more technical domains across our infrastructure with accountability for system reliability, performance, scalability, and security
  • Lead projects to evolve our Kubernetes-based platform, ensuring alignment with SLOs, security best practices, and long-term maintainability
  • Contribute to the design and evolution of our infrastructure using Terraform, Helm, and cloud-native tools, with an emphasis on modularity, reuse, and automation
  • Partner with engineering teams to design robust deployment pipelines, ensure operational readiness, and build secure-by-default patterns for new services
  • Lead incident response efforts and participate in on‑call rotation, driving a culture of blameless post‑mortems and learning
  • Write infrastructure and application code in Ruby, Node.js, Python, or Bash to automate operations and improve developer experience
  • Serve as a mentor and multiplier, raising the technical bar through coaching, knowledge sharing, and technical leadership
  • Actively promote observability, testing, and continuous improvement in everything you build and advocate for within your team
  • Participate in our on‑call rotation and incident response processes to help maintain a high level of service reliability

The person we have in mind likely:

  • Has 6+ years of experience in software or infrastructure engineering, including 4+ years working with Kubernetes in production environments
  • Holds a CKA certification or equivalent hands‑on Kubernetes expertise (bonus for experience managing multi‑tenant clusters or complex networking in K8s)
  • Has deep knowledge of TLS, certificates, ciphers, and encryption protocols, and can explain how they secure communications in a distributed system
  • Has production experience with AWS infrastructure and services (EKS, RDS, IAM, ALB, S3, etc.)
  • Writes infrastructure‑as‑code using Terraform, and has built scalable and secure infrastructure following modular and reusable patterns
  • Is comfortable with monitoring and observability tooling (e.g., New Relic, Datadog, Prometheus, Grafana, Sentry) and building alerting based on meaningful SLOs
  • Has experience supporting distributed systems with relational and non‑relational databases (PostgreSQL, AWS Aurora), message queues (Sidekiq, SNS/SQS), and asynchronous architectures
  • Enjoys collaborating across teams and helping shape engineering roadmaps and architectural…