
Math PhD - AI Model Evaluator at Mercor

posted 18 hours ago
mercor.com | Contractor | Remote: US/UK/CA/EU | $73.29/hr

Math PhD - AI Model Evaluator | $73.29/hr | Remote (US, UK, Canada, EU)

Join Mercor in shaping the future of conversational AI by applying your mathematical expertise to evaluate and improve how AI systems reason about complex mathematical problems. This flexible contract role allows you to work remotely while making a meaningful impact on AI reliability and accuracy.

Why This Role Exists

Mercor partners with leading AI teams to enhance the quality and reliability of general-purpose conversational AI systems. In mathematical contexts, these systems must demonstrate precise formal reasoning, mathematical rigor, and conceptual clarity. Your expertise will directly improve how AI models handle mathematical problems, explanations, and proofs across foundational and advanced areas.

What You'll Do

  • Write and refine prompts to guide AI model behavior in mathematical contexts
  • Evaluate LLM-generated responses for correctness, rigor, and logical coherence
  • Verify mathematical claims, derivations, and proofs using your domain expertise
  • Conduct fact-checking using authoritative sources and domain knowledge
  • Annotate model responses by identifying strengths and areas for improvement
  • Assess clarity, structure, and appropriateness of explanations for different audiences
  • Ensure model responses align with expected conversational behavior and system guidelines
  • Apply consistent evaluation standards following clear taxonomies and benchmarks

Who You Are

  • PhD in Mathematics or a closely related field
  • Demonstrated experience in Probability & Statistics, and ideally one or more of: Algebra & Number Theory, Calculus & Analysis, Geometry & Topology, or Discrete Mathematics, Logic & Computation
  • Significant experience using large language models (LLMs) and understanding their practical applications
  • Excellent writing skills with the ability to explain complex mathematical concepts clearly
  • Strong attention to detail and ability to identify subtle issues
  • Experience reviewing or editing technical or academic writing

Nice-to-Have Specialties

  • Prior experience with RLHF, model evaluation, or data annotation work
  • Experience teaching or explaining mathematical concepts to non-expert audiences
  • Familiarity with evaluation rubrics, benchmarks, or structured review frameworks

What Success Looks Like

  • You identify inaccuracies or weak reasoning in mathematical model outputs
  • Your feedback improves the rigor, clarity, and correctness of AI explanations
  • You deliver consistent, reproducible evaluation artifacts that strengthen model performance
  • You help build AI systems that users can trust in mathematical contexts

Contract Details

This is a flexible, remote contract position available for full-time or part-time engagement. Fluent English is required.

Benture is an independent job board and is not affiliated with or employed by Mercor.

Tips for Applying to Mercor Jobs from Benture

Increase your chances of success!
1
Four Simple Steps

Upload resume → AI interview → Complete form → Submit application

2
Perfect Your Resume

Upload your best, up-to-date resume in English. Mercor will extract details and fill out your profile automatically. Review and adjust as needed.

3
Complete = Win

SHOCKING FACT: Only ~20% of applicants complete their application! Take the 15-minute AI interview about your experience and you'll have MUCH HIGHER chances of getting hired!

AI Interview Tips: The interview focuses on your resume and work experience. Be ready to discuss specific projects and how you solved challenges.

Takes about 15 minutes | Dramatically improves your chances
