This job post expired on April 6, 2026. The position has likely been filled.

PhD Rater – STEM & AI Evaluation at Mercor

posted 4 months ago

mercor.com | Part-time | Remote (US) | $50–$100/hr | 369 views

PhD Rater – STEM & AI Evaluation | $50–$100/hr | Remote (US)

Mercor is seeking experienced PhD researchers and technical experts to contribute to a frontier AI model evaluation project focused on agentic workflows. You'll design and validate challenging benchmark tasks across data science, machine learning, finance, and coding — helping surface and diagnose reasoning gaps in cutting-edge STEM models.

Key Responsibilities

Design challenging, real-world STEM benchmark problems that rigorously test model reasoning and problem-solving capabilities
Implement each task within an agentic development environment using Python
Analyze model and agent behavior traces to identify and diagnose failure modes beyond surface-level errors
Produce reproducible, testable deliverables with clear specifications and documented environments

Core Qualifications

Deep expertise in data science, machine learning, finance, and/or Python-based coding
Current PhD candidate at, or recent PhD graduate of, a top-20 U.S. institution
Strong research background in frontier STEM topics
Availability for 30+ hours/week, primarily on weekdays
Demonstrated technical output such as high-quality open-source contributions, especially in agentic or LLM tooling ecosystems
Ability to read and reason about agent behavior traces to diagnose complex failure modes

Nice to Have

Familiarity with agentic frameworks and open-source ecosystems such as LangChain, MetaGPT, AutoGen, CrewAI, LlamaIndex, BabyAGI, Dify, and similar tools

About Mercor

Mercor is a talent marketplace connecting top experts with leading AI labs and research organizations. Backed by investors including Benchmark, General Catalyst, Adam D'Angelo, Larry Summers, and Jack Dorsey, Mercor has helped thousands of professionals across law, engineering, research, and creative fields contribute to frontier AI projects shaping the next era of technology.


How to apply for this role

Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.

