Senior DevOps Engineer
UK
Listed on 2025-11-30
IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer
Senior Dev Ops Engineer
Department: Technology
Employment Type: Full Time
Location: UK (Home based)
Description
About us
We are Digital Science and we are advancing the research ecosystem.
We are a pioneering technology company, and our vision is of a future where a trusted and collaborative research ecosystem drives progress for all. We believe in better, open, collaborative and inclusive research. In creating the next generation of tools and working in partnership with the community we tackle some of the biggest challenges to research. In order to achieve our vision, we need innovative, inspiring and dynamic people to join our team.
Want to join us?
Senior Dev Ops Engineer, focusing on Overleaf Infrastructure
We are recruiting for a Senior Dev Ops Engineer within the wider Digital Science Product organization, where you will directly support one of our most critical and high-profile products: Overleaf.
We're looking for a talented Senior Dev Ops Engineer to join our team and help us maintain the reliability, scalability, and performance of the systems that power Overleaf’s most critical platforms. Operating primarily on Google Cloud (GCP), you will use your knowledge of distributed systems and architecture to ensure smooth, global operations and improve overall system health.
You’ll work closely with cross-functional teams to identify and mitigate risks, supporting platforms that require world‑class reliability and automation.
What you’ll be doing
This role requires a blend of hands‑on infrastructure ownership, automation, and a strong focus on system reliability and cost efficiency.
- GCP Infrastructure Ownership: You will own our infrastructure on Google Cloud Platform and the Terraform codebase, managing critical components including VPCs, Compute Engine, Kubernetes Clusters, Cloud SQL/Redis, Load balancers, Cloud Armor, logging/monitoring pipelines, and IAM.
- Automation & CI/CD: Build and optimize CI/CD pipelines using Jenkins or similar tools, and automate routine operations with shell scripts where appropriate.
- Reliability & Monitoring: Implement and manage monitoring, alerting, and incident response systems using Google Cloud Monitoring and similar tools. You will be part of a rotating on‑call schedule for critical infrastructure issues outside normal business hours.
- Database Management: Ensure the performance, reliability, and uptime of PostgreSQL and Mongo databases with proactive monitoring and tuning.
- Cost Management: Oversee resource usage on GCP to ensure we are managing our costs efficiently.
- Collaboration & Knowledge Sharing: Take a collegiate approach to sharing knowledge with engineers, building consensus for change, and writing excellent documentation.
- Cloud & Containers: Significant working knowledge of cloud‑computing environments such as GCP or AWS. Strong hands‑on expertise in Kubernetes and Docker.
- Infrastructure as Code (IaC): Strong hands‑on expertise in Terraform.
- Operating Systems & Scripting: Solid Linux/Unix systems knowledge and scripting skills (Bash/Python).
- Dev Ops Tooling: Experience with CI/CD tools (e.g., Jenkins) and monitoring platforms (e.g., Grafana, Google Cloud Monitoring).
- Database Expertise: Experience working with databases such as Mongo, PostgreSQL, and Redis.
- SRE Practice: Know how to implement best‑practice alerting, monitoring, and observability on applications that experience high load.
- Incident Management: An excellent track record of dealing with production incidents and post‑incident analysis.
- Agile: Significant experience working in an Agile methodology and implementing best practices in version control and code review.
- A security‑first mindset at all times, covering confidentiality, integrity, and availability.
- A commitment to staying up‑to‑date with emerging technologies and implementing innovative cloud solutions.
- Understand error budgets, SLIs, and SLOs.
- Understand how to manage cloud computing costs effectively.
- Experience coding in a language such as JavaScript.
Don't worry if you don’t meet every qualification—let us be the judge! Studies show that many qualified candidates from…