
PhD Rater – STEM & AI Evaluation at Mercor

Posted 20 days ago
mercor.com | Part-time | Remote (US) | $50–$100/hr | 112 views

PhD Rater – STEM & AI Evaluation | $50–$100/hr | Remote (US)

Mercor is seeking experienced PhD researchers and technical experts to contribute to a frontier AI model evaluation project focused on agentic workflows. You'll design and validate challenging benchmark tasks across data science, machine learning, finance, and coding — helping surface and diagnose reasoning gaps in cutting-edge STEM models.

Key Responsibilities

  • Design challenging, real-world STEM benchmark problems that rigorously test model reasoning and problem-solving capabilities
  • Implement each task within an agentic development environment using Python
  • Analyze model and agent behavior traces to identify and diagnose failure modes beyond surface-level errors
  • Produce reproducible, testable deliverables with clear specifications and documented environments

Core Qualifications

  • Deep expertise in data science, machine learning, finance, and/or Python-based coding
  • Current PhD candidate at, or recent PhD graduate of, a Top 20 U.S.-based institution
  • Strong research background in frontier STEM topics
  • Availability for 30+ hours/week, primarily on weekdays
  • Demonstrated technical output such as high-quality open-source contributions, especially in agentic or LLM tooling ecosystems
  • Ability to read and reason about agent behavior traces to diagnose complex failure modes

Nice to Have

  • Familiarity with agentic frameworks and open-source ecosystems such as LangChain, MetaGPT, AutoGen, CrewAI, LlamaIndex, BabyAGI, Dify, and similar tools

About Mercor

Mercor is a talent marketplace connecting top experts with leading AI labs and research organizations. Backed by investors including Benchmark, General Catalyst, Adam D'Angelo, Larry Summers, and Jack Dorsey, Mercor has helped thousands of professionals across law, engineering, research, and creative fields contribute to frontier AI projects shaping the next era of technology.

How to apply for this role
  • Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
  • Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
  • Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.
