Software Engineer, Distributed Systems; Backend Core
Listed on 2026-02-28
Software Development
Backend Developer, Software Engineer, Cloud Engineer - Software, DevOps
Location: New York
Staff Software Engineer, Distributed Systems (Backend Core)
Base pay range: $/yr - $/yr
Additional compensation:
Stock options.
My client is a fast-growing AI company developing advanced autonomous systems tailored for complex, high‑stakes domains such as healthcare, legal, and finance. Their core focus is building AI agents that combine deep technical rigor with human‑like decision‑making, enabling organizations to delegate sophisticated workflows with confidence. They have built a novel architecture that addresses core challenges around trust and reliability in AI, aiming to make intelligent agents a foundational part of the modern economy.
The Role
We’re looking for a systems‑level backend engineer who builds distributed systems from first principles: not just using frameworks, but designing and implementing the core building blocks that power reliable computation. You’ll architect and implement the runtime foundation that executes and supervises millions of concurrent AI processes across globally distributed environments.
What You’ll Build
- Core distributed services handling concurrency, coordination, and state management for the agent runtime.
- Custom messaging, replication, and scheduling mechanisms ensuring consistency and fault tolerance across nodes.
- Low‑latency data and metadata stores optimized for high‑throughput transactional workloads.
- Concurrency and synchronization primitives that make distributed execution predictable and safe.
- Observability and recovery mechanisms that provide deterministic replay and forensic auditing of AI decisions.
- Systems for high‑availability deployment, cluster membership, and leader election without external dependencies.
Requirements
- 5+ years of experience building distributed systems, databases, or runtime infrastructure.
- Deep understanding of concurrency, consensus, replication, and durability — and ability to implement them in code.
- Strong background in C++, Rust, or Go with emphasis on memory management, performance tuning, and correctness.
- Experience designing internal systems such as queues, KV stores, schedulers, caching layers, or distributed file systems.
- Comfort working close to the metal (threads, sockets, async I/O, persistence layers).
- Proven ability to reason about consistency models, CAP tradeoffs, and system invariants.
- Familiarity with observability and debugging of distributed systems in production.
- Curiosity for elegant, minimal designs and an instinct for measuring before optimizing.
Nice to Have
- Research or open‑source contributions in distributed systems (e.g., databases, OS kernels, or storage engines).
- Experience with Raft, Paxos, gRPC internals, or custom RPC frameworks.
- Prior work on systems such as Kafka, TiKV, CockroachDB, etc.
- Exposure to regulated or safety‑critical environments (finance, healthcare, aerospace).
Benefits
- Medical insurance.
- Vision insurance.