DevOps Engineer Job - New York, New York, USA - IT/Tech

Location: New York

Base Pay Range

$/yr - $/yr

Company Overview

Dune Security is the world’s first User Adaptive Risk Management solution. Powered by AI, we quantify employee risk with comprehensive data and automatically deliver user‑adaptive training and intervention. For higher‑risk users, our platform integrates seamlessly with the broader security stack to dynamically implement controls. Backed by Craft Ventures, Toba Capital, MassMutual Ventures, Alumni Ventures, Firestreak Ventures, and Antler, we empower CISOs to proactively manage user risk – the leading cause of cybersecurity breaches – and build safer, more resilient organizations.

Role Overview

Dune Security is seeking a Senior DevOps Engineer to own and operate highly reliable, scalable, production‑grade infrastructure and developer platforms. This role carries direct responsibility for availability, deployment safety, incident response, platform design, and long‑term infrastructure quality for customer‑facing web systems.

The ideal candidate has operated web platforms at or above 99.999% availability, responds decisively to live production incidents, and proactively designs and improves systems to reduce operational risk and accumulated technical debt.

Key Responsibilities

Own production reliability for customer‑facing web platforms, with demonstrated experience meeting uptime SLOs (e.g. 99.9%+)
Serve as an on‑call escalation owner for P0/P1 incidents, driving rapid mitigation and high‑quality post‑incident analysis
Proactively design resilient systems to eliminate classes of failures before they occur
Maintain deployment safety, including low change‑failure rates and zero manual production changes outside emergency procedures
Build and operate infrastructure via infrastructure‑as‑code (Terraform), minimizing manual toil
Continuously reduce infrastructure and platform technical debt, including refactoring brittle systems, improving automation coverage, and simplifying operational complexity
Operate and scale identity and access platforms (e.g., Keycloak), enforcing MFA and production access hygiene
Design and operate production compute environments spanning CPU, GPU, TPU, and FPGA workloads
Design and operate AI‑capable infrastructure, including model serving and batch or real‑time inference pipelines
Partner with application engineers to improve service operability, resilience, deployment safety, and long‑term maintainability
Document systems, incidents, and decisions clearly using Jira and Confluence

Qualifications & Experience

5+ years of DevOps or SRE experience owning real production systems, including customer‑facing web platforms operating at ≥99.999% uptime
Proven ownership of incident response, MTTR, and reliability metrics, with hands‑on responsibility during live P0/P1 incidents
Strong experience designing and operating AWS cloud infrastructure across multiple environments, using Terraform, Docker, and infrastructure‑as‑code practices
Deep CI/CD and release engineering experience with GitLab CI/CD and Jenkins, including safe, automated production deployments
Advanced Linux systems administration and Linux internals; kernel‑level tuning and performance optimization preferred
Experience designing and operating AI and data platforms in production, including Airflow, Databricks, Snowflake, and AI deployment environments
Experience operating and scaling heterogeneous compute environments, including CPU, GPU, TPU, FPGA, and QPU workloads
Experience operating and scaling MongoDB Atlas in production environments
Strong experience with observability and analytics tooling, including Grafana, Splunk, Elasticsearch / OpenSearch, Kibana, and OpenTelemetry, and using telemetry directly during incident response
Experience operating identity and access platforms such as Keycloak, and working with cloud security and posture tooling including AWS GuardDuty and Wiz
Hands‑on experience using Jira for incident and work tracking and Confluence for operational and architectural documentation
Bachelor’s degree in Computer Science, Computer Engineering, or a related field (required)
Cloud or DevOps certifications preferred

What You’ll Bring

We are looking for a proactive and…

