Sr. Engineering Manager, AI Evaluation Platform

Job in Seattle, King County, Washington, 98127, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-01-12
Job specializations:
  • Software Development
    Software Engineer, AI Engineer, Cloud Engineer - Software, DevOps
Salary/Wage Range or Industry Benchmark: 216,600 - 325,500 USD Yearly
Job Description & How to Apply Below

Seattle, Washington, United States
Software and Services

Join Apple Services Engineering to build the next generation of AI evaluation systems. We are seeking a hands‑on Engineering Manager to architect high‑availability services and internal tools that enable self‑service evaluation. You will partner with researchers to operationalize their innovations, transforming complex workflows into intuitive, developer‑first platforms. We are looking for a leader who thrives in the ambiguity of new initiatives and is passionate about building scalable infrastructure.

Description

You will build and lead the engineering team responsible for democratizing AI evaluation across the organization. Your focus will be on architecting the developer experience—designing the APIs, SDKs, and platform services that turn complex evaluation metrics into simple, self‑service calls. You will work hand‑in‑hand with researchers to operationalize sophisticated measurement techniques, ensuring they scale reliably within our high‑availability infrastructure. In this role, you will define the engineering standards for a new organization, establishing the code quality, automation, and testing rigor required to support the rapid evolution of Generative AI and Agentic systems.

Responsibilities
  • Team Building & Leadership:
    Hire, mentor, and grow a diverse, high‑performing team of backend and platform engineers. Foster a culture of technical excellence and rapid delivery as you build this new team from the ground up.
  • Technical Strategy & Roadmap:
    Own the engineering roadmap for the core evaluation engine. Architect the APIs, SDKs, and distributed services that power our internal platform, enabling product teams to measure Generative AI performance autonomously.
  • Operationalizing Science:
    Partner closely with Applied Scientists to translate novel metrics, judge prompts, and scoring algorithms into scalable, production‑grade services. Create frameworks to evaluate not just simple responses, but also multi‑turn agent trajectories and tool usage.
  • System Integration:
    Serve as a technical bridge between the research organization and the broader engineering ecosystem, ensuring our tools integrate seamlessly with existing ML infrastructure and developer workflows.
  • Engineering Rigor:
    Establish the software development lifecycle (SDLC) for the team, defining standards for code quality, automated testing (CI/CD), and monitoring to ensure high availability and reliability.
Minimum Qualifications
  • 5+ years of direct engineering management experience, with a proven track record of hiring, mentoring, and retaining high‑performing engineers. You have successfully managed teams that ship production‑grade software.
  • 7+ years of hands‑on software engineering experience with deep proficiency in the Python ecosystem (e.g., FastAPI, Pydantic, Pandas). You are capable of contributing to code reviews and architectural discussions on day one.
  • Customer Obsession & Product Thinking:
    Experience acting as a technical partner to internal customers. You can translate vague requirements from other teams into concrete engineering specifications and are comfortable prioritizing the roadmap in the absence of a dedicated Product Manager.
  • Demonstrated experience partnering with Applied Scientists or Researchers:
    You have the ability to navigate the ambiguity of research workflows and operationalize scientific code.
  • Functional literacy in AI/ML concepts:
    You understand the fundamental lifecycle of machine learning (datasets, training vs. inference, evaluation metrics) and can discuss the engineering challenges involved in serving models.
  • Strong expertise in API Design & Internal Tools:
    You have architected APIs that other developers rely on, with a focus on versioning, backward compatibility, and developer experience.
  • Operational excellence background:
    You have practical experience establishing CI/CD pipelines, containerization (Docker/Kubernetes), and monitoring (Datadog/Prometheus).
Preferred Qualifications
  • Experience building MLOps & Platform Infrastructure:
    You have architected or managed teams that built the foundational infrastructure for AI, such as model registries, inference…