MTS, Developer
Listed on 2026-01-17
Software Development
AI Engineer, Machine Learning/ML Engineer
Job | Services LLC - A57
Are you interested in a unique opportunity to advance the accuracy and efficiency of Artificial General Intelligence (AGI) systems? If so, you're at the right place! We are the AGI Autonomy organization, and we are looking for a driven and talented Member of Technical Staff to join us to build state‑of‑the‑art agents. Our lab is a small, talent‑dense team with the resources and scale of Amazon.
Each team in the lab has the autonomy to move fast and the long‑term commitment to pursue high‑risk, high‑payoff research. We’re entering an exciting new era where agents can redefine what AI makes possible. We’d love for you to join our lab and build it from the ground up!
- Design and implement a modern, fast, and ergonomic development environment for AI researchers, eliminating current pain points in build times, testing workflows, and iteration speed.
- Build and manage CI/CD pipelines (AWS CodePipeline, Jenkins, etc.) that support large‑scale AI research workflows, including pipelines capable of orchestrating thousands of simultaneous agentic experiments.
- Develop tooling that bridges local development environments with remote supercomputing resources, enabling researchers to seamlessly leverage massive compute from their IDEs.
- Manage and optimize code repository infrastructure (GitLab, Phabricator, or similar) to support collaborative research at scale.
- Implement release management processes and automation to ensure reliable, repeatable deployments of research code and models.
- Optimize container build systems for GPU workloads, ensuring fast iteration cycles and efficient resource utilization.
- Work directly with researchers to understand workflow pain points and translate them into infrastructure improvements.
- Build monitoring and observability into development tooling to identify bottlenecks and continuously improve developer experience.
- Design and maintain build systems optimized for ML frameworks, CUDA code, and distributed training workloads.
The team is shaping developer experience from the ground up, building tools that enable researchers to move at the speed of thought: IDEs that seamlessly shell out to supercomputers, CI/CD pipelines that orchestrate thousands of agentic commands simultaneously, and build systems optimized for GPU‑accelerated workflows. Your infrastructure will be the foundation that enables the next generation of AI research, directly contributing to our mission of building the most capable agents in the world.
Basic Qualifications
- 5+ years of experience in DevOps, release engineering, or developer tools/infrastructure.
- Expertise with shell scripting and command‑line tools (bash, zsh, etc.).
- Experience managing CI/CD systems such as AWS CodePipeline, Jenkins, CircleCI, or similar platforms.
- Hands‑on experience managing code repositories and version control systems (GitLab, GitHub, Phabricator, etc.).
- Proficiency in at least one programming language (Python, Go, Rust, or similar) for automation and tooling development.
- Experience building and maintaining developer tooling or infrastructure at scale.
- Understanding of containerization (Docker, containerd) and container orchestration.
- Experience with release management and maintaining large‑scale software deployments.
- Knowledge of container build internals (Docker multi‑stage builds, BuildKit, layer caching optimization).
- Experience working with GPU infrastructure and CUDA development workflows.
- Background in IDE development or customization (VS Code extensions, JetBrains plugins, etc.).
- Experience building development tools for machine learning or data science teams.
- Knowledge of ML frameworks (PyTorch, TensorFlow) and their build/dependency requirements.
- Experience with AWS developer tools and services (CodeBuild, CodeDeploy, CodeCommit, etc.).
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants:
Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards…