DevOps Engineer
Listed on 2026-01-12
IT/Tech
Systems Engineer, Cloud Computing
We are a rapidly growing organization focused on delivering scalable, safe, and sustainable energy storage solutions for critical infrastructure, including data centers, industrial facilities, and the grid. Our mission is to pioneer innovative technologies that enable long-duration, non-toxic energy storage systems made in the U.S.
Role Overview
We’re seeking a DevOps Engineer to build and operate the cloud and on-prem infrastructure that powers our Energy Management System (EMS). You’ll use Infrastructure as Code (IaC) with Terraform and configuration management with Ansible to provision, secure, and scale services across Amazon Web Services (AWS) and Microsoft Azure (preferred). You will partner closely with backend, ML, and controls engineers to deliver reliable, observable, and cost‑efficient platforms for our applications.
You will be responsible for designing, automating, and maintaining production‑grade infrastructure; creating secure CI/CD pipelines; implementing robust observability (metrics, logs, traces); enforcing security baselines and secrets management; and driving operational excellence (SRE practices, incident response, cost optimization).
Key Responsibilities
- Design, provision, and manage cloud and on-prem environments using Terraform (IaC) and Ansible (configuration management).
- Build and maintain CI/CD pipelines (e.g., GitHub Actions, Azure DevOps) for services, data pipelines, and ML workloads.
- Operate containerized workloads with Docker and Kubernetes (AKS/EKS or equivalent), including cluster add‑ons, ingress, autoscaling, and upgrades.
- Implement observability: metrics (e.g., Prometheus), logging (e.g., OpenSearch/ELK), tracing (e.g., OpenTelemetry), and actionable alerts (e.g., Grafana/CloudWatch/Azure Monitor).
- Enforce cloud security baselines (IAM/role design, least privilege, network segmentation, TLS, key management) and manage secrets (e.g., HashiCorp Vault, AWS Secrets Manager, Azure Key Vault).
- Automate backups, disaster recovery (RTO/RPO goals), blue/green or canary releases, and infrastructure testing (e.g., Terratest).
- Collaborate with engineers to design scalable APIs, data services, and event pipelines; champion reliability (SLOs/SLIs), performance, and cost efficiency.
- Support audits and compliance readiness (e.g., SOC 2 practices) through policy‑as‑code and strong documentation.
- Participate in on‑call/incident response, root‑cause analysis, and post‑mortems; drive continuous improvement.
Qualifications
- Bachelor’s degree in Computer Science, Software/Systems Engineering, or related field (or equivalent experience).
- 5+ years in DevOps/SRE/Platform Engineering roles operating production systems.
- Proven expertise with Terraform and Ansible in production.
- Hands‑on experience on AWS and/or Azure (both preferred) designing secure, scalable architectures.
- Strong CI/CD experience (GitHub Actions, Azure DevOps, or similar) and proficiency with Docker and Kubernetes.
- Solid understanding of networking (VPC/VNet, subnets, routing, load balancers, DNS) and Linux administration.
- Experience implementing observability stacks (metrics/logs/traces) and actionable alerting.
- Strong grasp of security best practices (IAM, secrets, encryption, compliance basics) and cost optimization.
- Excellent communication; able to work independently and manage multiple initiatives.
Preferred Qualifications
- Experience with .NET / C# build and deployment workflows; Python for tooling/automation is a plus.
- Experience with message and data systems (e.g., Kafka, RabbitMQ, PostgreSQL, SQL Server, Redis).
- Familiarity with energy/industrial environments (e.g., EMS, SCADA, BMS/DERMS) is a plus.
- Experience with policy-as-code (e.g., Open Policy Agent), infrastructure testing (Terratest), and GitOps (e.g., Argo CD/Flux).
- Knowledge of backup/DR architecture, multi‑account/multi‑subscription patterns, and cross-cloud networking.
- Contributions to internal platforms, golden paths/templates, or developer experience initiatives.
Mid-Senior level
Employment Type: Full-time
Job Function: Engineering and Information Technology