Distributed Systems Engineer
Listed on 2026-01-12
Software Development
Software Engineer, DevOps
💻 Languages: 5+ years of experience in Go, Terraform
✅ Skills: Go, Building and managing large clusters, Linux, Networking, Kubernetes, Virtualization
Who we are
E2B is a fast-growing Series A startup with 7-figure revenue. We've raised over $32M in total since our founding in 2023 and are supported by great investors like Insight Partners. Our customers are companies like Perplexity, Hugging Face, Manus, and Groq. We're building the next hyperscaler for AI agents.
About the role
You will be building the next cloud platform for running AI software - a cloud where AI apps are building other software apps.
Your job will be:
We’re looking for an infrastructure engineer passionate about making things run fast and efficiently, and running A LOT of them at the same time.
If you aren’t afraid of going into the kernel of a VM and words like Firecracker, eBPF, UFFD, block device, L4 load balancing, noisy neighbor problem, or hugepages sound exciting to you, we want to hear from you!
👉 What we're looking for
- 7+ years building distributed systems - You've operated infrastructure at serious scale (100K+ RPS, multi-region, PB-scale data) and understand the trade-offs between consistency, availability, and partition tolerance in practice, not just theory
- Deep Linux internals expertise - You're comfortable working at the kernel level. You've debugged performance issues using eBPF, understand CPU scheduling, memory management, and can explain the difference between cgroups v1 and v2 without looking it up
- VM hypervisor experience - You've worked with Firecracker, QEMU, KVM, or similar. You understand virtio, know what a hypercall is, and have opinions about nested virtualization trade-offs
- Systems programming skills - Strong in at least one of: Go, Rust, C/C++. You've written performance-critical code and know when to reach for lock-free data structures, memory-mapped files, or
- Production orchestration experience - You've built or operated orchestration systems (Kubernetes, Nomad, or custom). You understand bin-packing algorithms, resource scheduling, and have dealt with noisy neighbor problems at scale
- Performance obsession - You've shaved milliseconds off hot paths, understand CPU caches and memory locality, and have profiled production systems under load. You know what "p99 latency" means and care deeply about making it better
- Networking expertise - Strong understanding of L4/L7 load balancing, network namespaces, iptables/nftables, and how to build secure, isolated network topologies for multi-tenant systems
- Located in San Francisco or willing to relocate - We work in person as a team and believe in the magic that happens when engineers collaborate face-to-face on hard problems
- Excited about open source - Comfortable with our code and infrastructure being public. You contribute to discussions, write clear documentation, and help the community succeed with self-hosting
Bonus points for:
- Experience with userfaultfd (UFFD), copy-on-write mechanisms, or lazy loading
- GPU passthrough or PCIe device virtualization experience
- Built or maintained infrastructure for AI/ML workloads
- Contributions to Firecracker, Cloud Hypervisor, or similar open source projects
- Experience with observability at scale (distributed tracing, kernel-level metrics)