DevOps Team Leader
Listed on 2026-02-28
IT/Tech
SRE/Site Reliability, Cloud Computing
Cellebrite
Title: DevOps Team Leader
Location: Remote, VA, US
About Cellebrite: Cellebrite's (Nasdaq: CLBT) mission is to enable its global customers to protect and save lives by enhancing digital investigations and intelligence gathering to accelerate justice in communities around the world. Cellebrite's AI‑powered Digital Investigation Platform enables customers to lawfully access, collect, analyze and share digital evidence in legally sanctioned investigations while preserving data privacy. Thousands of public safety organizations, intelligence agencies and businesses rely on Cellebrite's digital forensic and investigative solutions, available via cloud, on‑premises and hybrid deployments, to close cases faster and safeguard communities.
To learn more, visit us at , and find us on social media @Cellebrite.
We are looking for a highly talented and hands‑on DevOps Team Leader to join our team and drive infrastructure automation, CI/CD excellence, cloud scalability, and production reliability. This role requires a strong technical leader with deep production ownership experience, an SRE mindset, and the ability to lead a DevOps/SRE team responsible for mission‑critical systems.
What We Expect From You:
Lead and manage the DevOps/SRE team responsible for production environments, providing technical leadership and operational excellence.
Mentor engineers and help elevate the DevOps/SRE team's capabilities.
Own production AWS accounts, including availability, stability, security, compliance, and cost efficiency.
Define, implement, and continuously improve SRE practices, including:
- SLAs and SLOs aligned with business and customer expectations
- Error budgets to balance reliability and delivery velocity
- Production readiness reviews and reliability standards
Take a hands‑on approach to designing, developing, and modifying CI/CD pipelines, automation processes, and cloud environments.
Own and optimize CI/CD processes for production‑grade cloud environments, focusing on AWS Commercial and AWS GovCloud.
Design and maintain highly available, resilient, and scalable production infrastructure, with a strong emphasis on networking and AWS Landing Zone architectures.
Act as a production owner, leading:
- Incident response and escalation
- Post‑incident reviews (RCA / postmortems)
- Preventative reliability improvements
Collaborate closely with development, security, compliance, and cloud engineering teams to ensure reliability is integrated throughout the SDLC.
Enforce SDLC and DevOps best practices, including clean code, version control, testing, and high‑quality automation.
Improve and shorten development and deployment cycles while protecting production stability through controlled rollouts, canary deployments, and rollback strategies.
Drive observability standards across production systems (monitoring, alerting, logging, metrics).
Troubleshoot and resolve complex, real‑time production incidents, ensuring system stability, performance, and compliance.
What You’ll Love About This Role:
The opportunity to define and embed SRE standards (SLOs, error budgets, reliability metrics) across a large‑scale platform.
Ownership of mission‑critical, high‑security production environments, including regulated cloud workloads.
Working in an environment where production reliability, security, and automation are top priorities.
A high‑impact leadership role influencing architecture, availability, and delivery velocity across the organization.
Office Location: Vienna
- 10+ years of DevOps experience, with 5+ years leading a DevOps or SRE team responsible for production systems.
- Proven hands‑on experience owning and operating production AWS accounts, including on‑call rotations, incident management, and SLA/SLO ownership.
- Strong hands‑on experience with:
  - CI/CD tools (GitHub Actions, JFrog Artifactory, ECR)
  - Containerization (Docker, Docker Compose, Kubernetes)
  - Cloud platforms (AWS and AWS GovCloud, required)
  - Linux administration, IaC, and automation
  - Scripting and/or programming languages such as Bash, Python, Node.js, or similar
  - Networking concepts and AWS Landing Zone architectures
- Strong understanding of Site Reliability Engineering concepts, including monitoring strategies, alert fatigue reduction, and reliability KPIs.
- Experience building production‑safe CI/CD pipelines with quality gates and automated validations.
- Software development experience with a focus on automation, maintainability, and quality.
- Experience working in Agile/Scrum teams.
- A proactive leader with a production‑first and reliability‑driven mindset.
- Experience with FedRAMP, IRAP, or other regulated cloud compliance frameworks.
- Exposure to highly regulated, air‑gapped or government cloud environments.
- Experience defining enterprise‑level reliability metrics and reporting to leadership.
- Familiarity with cost optimization practices (FinOps) in production environments.
Equal employment opportunity, including veterans and individuals with disabilities.