Database Reliability Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: WorkOS
Full Time position
Listed on 2026-03-04
Job specializations:
  • IT/Tech
    Cloud Computing, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below

About WorkOS 🚀

WorkOS builds tools and services that help developers implement authentication, identity, authorization, and overall enterprise readiness. We’re a fully distributed team with employees across North American time zones. We’re well‑funded, having raised $100M from top investors including Greenoaks Capital, Lachy Groom, and Lightspeed Ventures. Our fast‑growing customer base includes SaaS companies like OpenAI, Cursor, Perplexity, Vercel, Plaid, and hundreds of others.

About the Infrastructure Team

The Infrastructure team ensures the WorkOS platform remains fast, reliable, and resilient. We build the systems and practices that keep everything running smoothly: handling hundreds of millions of requests, minimizing downtime, and continuously improving service performance. Our team works across the stack and collaborates closely with product engineering teams.

The Role

As a Database Reliability Engineer on this team, you'll bring specialized database expertise to the Infrastructure organization. You'll own the full lifecycle of database management, from design and capacity planning through performance optimization and disaster recovery, ensuring data durability and scalability as WorkOS grows.

What You'll Do
  • Own the reliability, performance, and scalability of WorkOS's PostgreSQL infrastructure.
  • Analyze and implement best practices for our database clusters, including replication, connection pooling, high availability, and disaster recovery.
  • Build and maintain observability for database metrics (query performance, replication lag, connection saturation, storage growth) and ensure we meet our database SLOs.
  • Provide database expertise to product engineering teams through migration reviews, query optimization guidance, and schema design consultation.
  • Develop automation and self‑service tooling that enables engineers to safely interact with databases without bottlenecking on the DBRE team.
  • Participate in on‑call rotations and lead incident response for database‑related production issues, performing root cause analysis and implementing permanent fixes.
  • Plan and manage database capacity, forecasting growth and ensuring our infrastructure can handle increased workloads.
  • Collaborate with SREs to roll out infrastructure changes to production environments, with a focus on minimizing risk to the data layer.
  • Document operational procedures, runbooks, and architectural decisions so learnings become repeatable actions and eventually automation.
  • Drive improvements to backup and recovery strategies, regularly testing and validating disaster recovery procedures.
About You
  • 5+ years of experience running PostgreSQL in production at scale, with strong knowledge of internals (WAL, MVCC, vacuum tuning, query planner, indexing, replication).
  • Solid software engineering skills. You write production‑quality code, not just scripts. Experience with Python, Go, Ruby, or similar languages.
  • Experience with infrastructure‑as‑code and configuration management (Terraform, Ansible, Chef, or similar).
  • Strong SQL skills and the ability to review and optimize complex queries for high‑throughput, low‑latency environments.
  • Experience with database high‑availability patterns: streaming replication, connection pooling (PgBouncer), failover automation (Patroni or similar).
  • Familiarity with cloud database services on AWS (RDS, Aurora, DynamoDB, ElastiCache) or equivalent platforms.
  • Experience with monitoring and observability tools (Datadog, Prometheus, Grafana, or similar) applied to database workloads.
  • Comfort with on‑call responsibilities and a track record of effective incident response.
  • Strong written and verbal communication skills. You document your work and share context proactively.
  • A proactive, ownership‑driven mindset. When you see something broken, you fix it. When you see a pattern of toil, you automate it.
Nice to Have
  • Experience with other data stores beyond PostgreSQL (Redis, DynamoDB, ClickHouse, Elasticsearch).
  • Familiarity with Ruby on Rails or Django and how ORMs interact with the database layer.
  • Experience with database migration tooling and blue‑green or zero‑downtime migration strategies.
  • Contributio…