Senior Database Reliability Engineer
Listed on 2026-03-01
IT/Tech
Data Engineer, Cloud Computing, Systems Engineer, Database Administrator
About Gridware
Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a groundbreaking new class of grid management called active grid response (AGR), focused on monitoring the electrical, physical, and environmental aspects of the grid that affect reliability and safety. Gridware’s advanced Active Grid Response platform uses high-precision sensors to detect potential issues early, enabling proactive maintenance and fault mitigation.
This comprehensive approach helps improve safety, reduce outages, and ensure the grid operates efficiently. The company is backed by climate-tech and Silicon Valley investors. For more information, please visit www.gridware.io.
We are seeking a Database Reliability Engineer to own and maintain Gridware’s relational databases, cloud infrastructure, and streaming platforms. This role combines traditional DBA responsibilities (ensuring the high availability, performance, data integrity, and security of databases) with infrastructure ownership, including setup and management of Kafka-based streaming pipelines, DevOps automation, and cloud platform management.
You will work closely with Data Engineering, Site Reliability, and DevOps teams to proactively monitor, troubleshoot, and optimize all critical infrastructure, enabling rapid deployment of new features while ensuring reliability and data integrity.
Responsibilities
- Administer, monitor, and optimize relational databases (PostgreSQL, Amazon RDS) for performance, availability, and security.
- Troubleshoot complex database and infrastructure issues, including query performance, replication, schema evolution, and event streaming pipelines.
- Maintain and support Kafka infrastructure for company-wide streaming pipelines and integration with databases.
- Implement backup, restore, and disaster recovery strategies for databases and streaming platforms.
- Collaborate with DevOps and Data Engineering teams to maintain CI/CD pipelines for schema, data, and infrastructure changes.
- Enforce database and infrastructure best practices, standards, and security policies.
- Proactively monitor health and performance of databases, streaming pipelines, and cloud infrastructure using Grafana, Prometheus, or equivalent.
- Contribute to Infrastructure as Code (Terraform, Ansible) for database, Kafka, and cloud infrastructure provisioning and management.
- Support internal teams during incidents or urgent troubleshooting, balancing reliability with rapid deployment needs.
Requirements
- 5+ years of experience managing production relational databases and cloud infrastructure.
- Hands-on experience with PostgreSQL, MySQL, Amazon RDS/Aurora, or similar.
- Experience managing Kafka infrastructure and supporting streaming pipelines.
- Familiarity with DevOps practices, automation, and Infrastructure as Code (Terraform, Ansible, or similar).
- Proficiency in monitoring and observability for databases, streaming, and cloud infrastructure.
- Strong troubleshooting skills for complex, multi-layered production systems.
- Knowledge of database and infrastructure security, access control, and compliance best practices.
- Ability to collaborate across engineering, DevOps, and Data teams.
Nice to Have
- Experience with analytical or NoSQL databases (Redshift, Snowflake, DynamoDB, MongoDB).
- Containerized deployments and Kubernetes-based operators for databases or Kafka.
- Event-driven architecture experience and distributed system troubleshooting.
- Experience with Kafka Streams, consumer/producer tuning, and real-time pipelines.
- Infrastructure and database automation at scale.
Benefits
- Health, Dental & Vision (Gold and Platinum plans, with some providers' plans fully covered)
- Paid parental leave
- Alternating day off (every other Monday)
- “Off the Grid”, a two-week paid break each year for all employees
- Commuter allowance
- Company-paid training
185,000 - 205,000 USD a year