Generative AI Evaluator | Remote
Type: Hourly contract
Compensation: $20–$30/hour
Location: Remote
Commitment: 10–40 hours/week
Role Responsibilities
- Evaluate outputs from large language models and autonomous agent systems using defined rubrics and quality standards.
- Review multi-step agent workflows, including screenshots and reasoning traces, to assess accuracy and completeness.
- Apply benchmarking criteria consistently while identifying edge cases and recurring failure patterns.
- Provide structured, actionable feedback to support model refinement and product improvements.
- Participate in calibration sessions to keep evaluations aligned and consistent across reviewers.
- Adapt to evolving guidelines and ambiguous scenarios with sound judgment.
- Document findings clearly and communicate insights to relevant stakeholders.
Requirements
- Strong experience in LLM evaluation, AI output analysis, QA/testing, UX research, or similar analytical roles.
- Proficiency in rubric-based scoring, benchmarking frameworks, and AI quality assessment.
- Excellent attention to detail with strong decision-making skills in ambiguous cases.
- Proficient English communication skills (written and verbal).
- Ability to work independently in a remote environment.
- Comfortable committing to structured evaluation workflows and evolving guidelines.