Senior SRE

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Pylon
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly
Job Description & How to Apply Below

At Pylon, we’re a small team building a very ambitious product in the mortgage space.

At this early stage, we’re looking for engineers who can see the opportunity of what we’re building toward and want to have a hand in building it.

We’re in search of people who find difficult problems invigorating and who fit well into a high‑performing team built on mutual respect and reliance. If you like pushing yourself to learn a massive amount while shipping code that has a huge impact on the end product, Pylon Engineering could be a great place for you.

About the Job

You’ll own reliability and operational excellence for Pylon’s production systems. This means designing and implementing monitoring, alerting, and incident response processes that scale as we grow. You’ll build tooling that makes the entire engineering team more effective, establish on‑call rotations and runbooks, and ensure our platform can handle the demands of a regulated, high‑stakes financial product.

This is not a pure ops role. At Pylon, we believe SRE work should be a maximum of 50% operational toil. If you’re spending more than half your time firefighting and keeping things running, you’re not doing SRE work; you’re doing sysadmin work. The other 50%+ of your time should be spent writing code: building infrastructure tooling, automating away operational burden, making reliability improvements to core services, and creating internal developer productivity tools that make the entire team more effective.

SRE is about making things better, not just keeping them alive.

What We’re Looking For

Must‑haves
  • 4+ years of experience in SRE, infrastructure, or platform engineering roles
  • Experience working on a team of SREs at a company with mature SRE practices (not solo SRE roles)
  • Real on‑call experience at scale in a large production environment (you’ve carried the pager and lived through incidents)
  • Deep AWS expertise (ECS, RDS, networking, security)
  • Strong experience with declarative infrastructure (Terraform, CDK, or similar)
  • Nix experience (we use it and want to expand its adoption)
  • Track record of building reliability tooling and automation
  • Can design and implement monitoring, alerting, and observability systems from first principles
  • Comfortable working in a regulated environment where "breaking things" is not an option
Nice‑to‑haves
  • Experience at companies with strong SRE cultures (Google, Replit, Stripe, etc.)
  • Background in fintech, healthtech, or other regulated domains
  • Experience migrating monitoring systems or implementing SLOs
  • Contributions to infrastructure tooling or open source projects
Basics
  • Job title: Senior Site Reliability Engineer
  • Stock options: own a piece of the company and we all win together
  • Health insurance, 401(k), dental, etc.
Our Technology Stack
  • Infrastructure: AWS (ECS, RDS, CloudFront, Lambda), CDK for infrastructure-as-code
  • Observability: Honeycomb, OpenTelemetry
  • CI/CD: GitHub Actions, Nix for builds and dev environments
  • Core platform: TypeScript/Node backend, PostgreSQL, React frontend
  • Languages: TypeScript, Python, Nix, SQL
About You

You:

Have operated production systems at scale.

You’ve been on‑call for a large, complex system. You know what 3 am pages feel like and you’ve built systems to prevent them. You understand the difference between alerts that matter and noise.

Write code, not just YAML.

You can build internal tools, automation, and reliability improvements. You’re comfortable contributing to the core product when reliability requires it. You can read and understand the codebase you’re responsible for keeping up.

Think in systems.

You understand distributed systems, failure modes, cascading failures, and graceful degradation. You can diagnose production issues quickly and know when to escalate vs. when to fix.

Know your tools deeply.

You’ve used observability platforms at scale and understand how to instrument systems properly. You can design alerting that has high signal and low noise. You know AWS inside and out.

Have strong opinions that you’re willing to defend.

We have a culture of vigorous discussion and debate on technical decisions. We’ll push you to defend your choices, and we want you to push back.

Don’t settle.

Challenge yourself…

Position Requirements
10+ Years work experience