Evaluations - Platform Engineer

Job in New York, New York County, New York, 10261, USA
Listing for: Arbitrum
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    AI Engineer, Data Scientist, Machine Learning/ML Engineer, Data Analyst
Salary/Wage Range or Industry Benchmark: 275000 - 325000 USD Yearly
Location: New York

Location

HQ - NYC

Employment Type

Full time

Location Type

On-site

Department

Engineering

Compensation
  • Senior Base Salary $225K – $275K
    • Offers Equity
  • Staff Base Salary $275K – $325K
    • Offers Equity

We’re looking for a Platform Engineer, Applied Evaluations to define and operationalize quality for the agentic systems that power Antimetal’s investigation and automation engine.

This role is core to our product. You’ll own online and offline evaluation pipelines that operate over petabytes of infrastructure data, and shape agent platform abstractions where necessary to ensure our agents are measurable, debuggable, and reliable. You’ll partner closely with platform, product, and research, leveraging quality signals to accelerate iteration across the company.

About Antimetal

Antimetal is building the future of infrastructure management. We're starting by creating a platform that investigates, resolves, and prevents issues—giving engineers their time back to focus on what they do best: building great products.

What you’ll do:
  • Own the evaluation stack:
    Build online and offline eval pipelines that measure agent quality across large volumes of MELT (metrics, events, logs, traces) data, code, and unstructured docs. Set the metrics that define the experience.
  • Define quality at scale:
    Production incidents span hundreds of services and are ephemeral and high-volume, with ground truth that is only approximate. Design evals that capture trajectory quality, not just final outputs, and validate that your metrics predict real outcomes.
  • Build platform abstractions for agents:
    Design core agent architectures and extend internal frameworks (e.g. sub-agents, MCPs, middleware) that let product, platform, and research iterate with confidence and ship faster.
  • Productionize:
    Own latency, observability, and uptime.
What you bring:
  • At least 3 years of experience in ML platform engineering, data engineering, or a related role, preferably at a high-growth company.
  • Prior experience designing evaluation systems for high-volume data where ground truth is noisy and hard to label (e.g. computer vision, deep research pipelines).
  • Strong system design skills: you think about how data flows through distributed systems and how decisions compound at scale.
  • Proven ability to write clean, scalable code and strong data modeling skills.
  • Demonstrated ability to bring ambiguous goals from prototype to production, using data and experimentation to drive product and architectural decisions.
  • Proficient in Python and TypeScript, with experience using common ML libraries and data engineering tools.
Bonus:
  • Experience with SRE best practices and modern observability (OTEL, distributed tracing).
  • Strong on ML fundamentals: classification/regression, clustering, dimensionality reduction, evaluation + error analysis, probabilistic ML.
  • Experience with agent architectures: multi-step reasoning, tool use, context management.
Who you are:
  • Identify as a builder.
  • Are excited to work in-person from our new and spacious office in New York.
  • Love working in a startup environment (experience in a startup or obsession with going zero-to-one).
  • Enjoy working with people who are ambitious, caring, and think in systems.
  • Thrive in a fast-paced iterative environment where experimentation is essential.
What we bring:
  • Pay & ownership — Competitive salary with generous equity grants.
  • Full coverage + retirement — Fully covered health, dental, and vision, plus retirement benefits.
  • Unlimited PTO — Take the time you need to recharge.
  • Dinner on late nights — Working late? Dinner is on us.
  • Fitness stipend — Monthly support for your health and wellness.
  • Tools of the trade — Any equipment you need to do your best work.
  • Commute perks — Citi Bike + train benefits.
Interview process
  • Application Review – Send us your stuff, and a quick note on why you're excited.
  • Intro Chat – Share what you're looking for next and learn more about what we're building.
  • Founder Interview – Talk with one of our founders in more detail about the role.
  • Technical Interview – We’ll have you complete a short exercise specific to the role.
  • Onsite – Come onsite and meet the team through a series of 1:1 interviews.
  • Decision – We’ll move fast.
Compensation Range: $225K - $325K
