Physics Benchmark Expert at Mercor

posted 2 hours ago
mercor.com | Contractor | Remote | $80–$140/hr | 34 views

Physics Benchmark Expert | $80–$140/hr | Worldwide Remote

Join a cutting-edge AI research initiative as a Physics Benchmark Expert, authoring and verifying golden reference solutions for a frontier research-level physics benchmark. Your work will directly shape how large language models are evaluated on advanced reasoning tasks, and every solution is verified by human experts like you.

Role Overview

You will be assigned one of three roles based on your seniority and subfield expertise:

  • Solver (Labeller): Produce complete solution packages — write justifications, decompose problems into 2–4 checkpoint sub-problems, provide step-by-step derivations, deliver Python answer templates with auto-grading functions, and document chain-of-thought metadata with peer-reviewed references. Two solvers work each problem independently in parallel.
  • Auditor: Independently review submitted solutions, provide actionable improvement feedback, and sign off on each iteration. At least three auditors are assigned per problem, remaining consistent across iterations.
  • Verifier (Adjudicator): Select the better of the two parallel solver attempts and, when necessary, return it for rework to ensure quality.

Physics Subdomains

  • High Energy Physics & Mathematical Physics
  • Biophysics & Statistical Physics
  • Condensed Matter & AMO
  • Gravitation, Cosmology & Astrophysics
  • Quantum Information
  • Optical Properties of Materials
  • Magnetic Materials
  • Measurements in Quantum Mechanics

Key Responsibilities

  • Solve research-level physics challenges end-to-end with verifiable derivations, code, and peer-reviewed references
  • Decompose challenges into standalone checkpoint sub-problems requiring genuine physical reasoning
  • Author Python answer templates with auto-grading functions for symbolic or numerical answers
  • Audit solutions for correctness, scope, and methodological soundness; deliver actionable feedback across iterations
  • Adjudicate between parallel solver attempts to determine the golden reference solution
  • Document chain-of-thought reasoning, error tolerances, equivalent symbolic forms, and verification test cases
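
As a rough illustration of what an answer template with auto-grading might involve, here is a minimal sketch using only the Python standard library. The function names, tolerances, and the sample checkpoint (harmonic-oscillator ground-state energy with ħ = 1) are all assumptions made for illustration; the actual template format used on the project is not described in this posting. Equivalent symbolic forms are checked here by numerical sampling, which is a stand-in for a proper symbolic comparison (for example with SymPy).

```python
import math
import random
from typing import Callable


def grade_numeric(submitted: float, reference: float,
                  rel_tol: float = 1e-6, abs_tol: float = 1e-12) -> bool:
    """Accept a numerical answer that matches the reference within tolerance."""
    return math.isclose(submitted, reference, rel_tol=rel_tol, abs_tol=abs_tol)


def grade_expression(submitted: Callable[[float], float],
                     reference: Callable[[float], float],
                     lo: float = 0.1, hi: float = 2.0,
                     n_samples: int = 20, rel_tol: float = 1e-9,
                     seed: int = 0) -> bool:
    """Accept an expression that agrees with the reference at random sample points.

    This is a numerical proxy for symbolic equivalence, not a proof of it.
    """
    rng = random.Random(seed)  # fixed seed so grading is reproducible
    for _ in range(n_samples):
        x = rng.uniform(lo, hi)
        if not math.isclose(submitted(x), reference(x),
                            rel_tol=rel_tol, abs_tol=1e-12):
            return False
    return True


# Hypothetical checkpoint: ground-state energy of a 1D harmonic oscillator,
# E0 = omega / 2 in units where hbar = 1.
def reference_answer(omega: float) -> float:
    return 0.5 * omega


def equivalent_form(omega: float) -> float:
    # Algebraically the same answer, written differently.
    return omega ** 2 / (2 * omega)


def wrong_form(omega: float) -> float:
    # Off by a factor of two; should be rejected.
    return omega


if __name__ == "__main__":
    print(grade_numeric(0.5000000001, 0.5))
    print(grade_expression(equivalent_form, reference_answer))
    print(grade_expression(wrong_form, reference_answer))
```

The explicit tolerances and the accepted and rejected test forms correspond to the "error tolerances, equivalent symbolic forms, and verification test cases" that solvers are asked to document.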

Ideal Qualifications

  • Solver: PhD or postdoc in the relevant subfield (senior PhD student minimum)
  • Auditor: Postdoc or junior professor in the relevant subfield (PhD minimum)
  • Verifier (Adjudicator): Full professor or industry research PI (senior postdoc or junior professor minimum)
  • Hands-on familiarity with at least two canonical methods in the target subfield, demonstrable through publications
  • 3–5 representative publications (arXiv ID or DOI), ideally within the last ~5 years
  • Working proficiency with LaTeX, Python, Jupyter, and SymPy
  • Strong written English (CEFR B2 or higher; native or near-native preferred)

Engagement Details

  • Commitment: ~10 hours/week over an 8–10 week window per task pool
  • Pay: $80–$140/hr based on role and demonstrated expertise
  • Work style: Fully asynchronous and remote

How to apply for this role

  • Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
  • Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
  • Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.
