Vision-Language-Action (VLA) Models Engineer
Listed on 2026-01-24
Categories: IT/Tech; Engineering; Robotics, AI Engineer, Machine Learning/ML Engineer
Our mission is to create advanced robots that can operate in complex environments, reducing human risk in conflict zones and enhancing efficiency in labor-intensive industries.
We are on the lookout for extraordinary engineers and scientists to join our team. Previous experience in robotics isn't a prerequisite; it's your talent and determination that truly count.
We expect that many of our team members will bring diverse perspectives from various industries and fields. We are looking for individuals with a proven record of exceptional ability and a history of creating things that work.
Our Culture
We like to be frank and honest about who we are, so that people can decide for themselves if this is a culture they resonate with. Please read more about our culture here (use the "Apply for this Job" box below).
Who should join:
- You like working in person with a team in San Francisco.
- You deeply believe that this is the most important mission for humanity and needs to happen yesterday.
- You are highly technical - regardless of the role you are in. We are building technology; you need to understand technology well.
- You care about aesthetics and design inside out. If it's not the best product ever, it bothers you, and you need to “fix” it.
- You don't need someone to motivate you; you get things done.
Responsibilities:
- Develop and optimize vision-language-action models, including transformers, diffusion models, and multimodal encoders/decoders.
- Build representations for 2D/3D perception, affordances, scene understanding, and spatial reasoning.
- Integrate LLM-based reasoning with action planning and control policies.
- Design datasets for multimodal learning: video-action trajectories, instruction following, teleoperation data, and synthetic data.
- Interface VLA model outputs with real-time robot control stacks (navigation, manipulation, locomotion).
- Implement grounding layers that convert natural language instructions into symbolic, geometric, or skill-level action plans.
- Deploy models on on-board or edge compute platforms, optimizing for latency, safety, and reliability.
- Build scalable pipelines for ingesting, labeling, and generating multimodal training data.
- Create simulation-to-real (Sim2Real) training workflows using synthetic environments and teleoperated demonstration data.
- Optimize training pipelines, model parallelism, and evaluation frameworks.
- Work closely with robotics, hardware, controls, and safety teams to ensure model outputs are executable, safe, and predictable.
- Collaborate with product teams to define robot capabilities and user-facing behaviors.
- Participate in user and field testing to iterate on real-world performance.
Requirements:
- Strong experience with training multimodal models, including VLAs, VLMs, vision transformers, and LLMs.
- Ability to build and iterate on large-scale training pipelines.
- Deep proficiency in PyTorch or JAX, distributed training, and GPU acceleration.
- Strong software engineering skills in Python and modern ML tooling.
- Experience with (synthetic) dataset creation and curation.
- Understanding of real-time deployment constraints on embedded hardware.
- Ideally, familiarity with robotics simulation environments (Isaac Lab, MuJoCo, or similar).
- Ideally, hands-on experience with robotics, embodied AI, or reinforcement/imitation learning.
- MSc or PhD in Computer Science, Robotics, Machine Learning, or a related field, or equivalent industry experience.
We provide market-standard benefits (health, vision, dental, 401(k), etc.). Join us for the culture and the mission, not for the benefits.
The annual compensation is expected to be between $80,000 and $1,000,000. Exact compensation may vary based on skills, experience, and location.