AI/Machine Learning Engineer
Listed on 2026-01-12
IT/Tech
Data Scientist, AI Engineer
About the Role
We are looking for an Evaluation Scientist who can work across both hands-on experimentation and automation infrastructure. This role begins with running manual evaluations (e.g., executing and monitoring individual experiments) and progresses toward building scripts, tools, and infrastructure that streamline and automate these processes, with the long-term goal of reducing manual work as much as possible.
The ideal candidate will also bring expertise in coding agents and quality evaluation, enabling them to design robust experiments and improve workflows. While high-level guidance will be provided, candidates should be able to independently define and implement the lower-level details of experiment setup after ramping up. For example, given a high-level requirement for a new type of evaluation, the candidate should be able to propose and execute an implementation plan with detailed steps, metrics, and automation in place.
Key Responsibilities
- Run and manage manual evaluation experiments across AI/ML systems.
- Develop and maintain automation infrastructure (scripts, pipelines, tools) to reduce manual evaluation work.
- Design and execute new types of evaluations, translating broad research questions into structured experiment setups.
- Work with coding agents and applied ML workflows to define and measure quality.
- Define metrics, benchmarks, and evaluation criteria to assess performance and identify gaps.
- Collaborate with research leads to align evaluation design with project goals while owning implementation details.
- Ensure reproducibility, consistency, and scalability of evaluation processes.
Qualifications
- Strong coding skills in Python (or equivalent) for scripting, automation, and experiment design.
- Experience with running and analyzing experiments, including quality evaluation methodologies.
- Knowledge of coding agents, ML models, or applied automation frameworks.
- Ability to work independently: take high-level requirements and define detailed steps for execution.
- 2–4 years of hands-on experience in evaluation, scripting, or applied data science/ML (academic or industry).
- Strong analytical skills with experience in data handling, reporting, and experiment analysis.
Preferred Skills
- Familiarity with evaluation frameworks and automation tools in AI/ML research.
- Experience in building scalable infrastructure for experiments or evaluations.
- Knowledge of experimental design, statistical testing, or quality benchmarking.