
AI Generalist - English Language Evaluation at Mercor

posted 1 day ago
mercor.com | Contractor | Remote: US/UK/CA | $45/hour

AI Generalist - English Language Evaluation | $45/hour | Remote (US, UK, Canada)

Mercor is seeking skilled generalists to evaluate and improve conversational AI systems used by millions worldwide. This flexible contract role puts you at the forefront of human-in-the-loop AI development, where your expertise directly shapes how advanced language models communicate.

Why This Role Exists

We partner with leading AI teams to enhance the quality, usefulness, and reliability of general-purpose conversational AI systems. Your work will ensure these models respond accurately, clearly, and helpfully across diverse real-world scenarios.

What You'll Do

  • Evaluate LLM-generated responses for accuracy, clarity, and effectiveness
  • Conduct fact-checking using trusted public sources and external tools
  • Generate high-quality human evaluation data through detailed annotations
  • Assess reasoning quality, tone, completeness, and conversational alignment
  • Apply consistent annotations following clear taxonomies and evaluation guidelines
  • Identify factual inaccuracies, reasoning errors, and communication gaps

Who You Are

  • Hold a Bachelor's degree
  • Have significant experience using large language models (LLMs)
  • Possess excellent writing skills, with the ability to articulate nuanced feedback
  • Demonstrate strong attention to detail and the ability to catch subtle issues
  • Are adaptable and comfortable working across diverse topics and domains
  • Have a background in structured analytical thinking (research, policy, analytics, linguistics, engineering)
  • Possess excellent college-level mathematics skills

Nice-to-Have Specialties

  • Prior experience with RLHF, model evaluation, or data annotation
  • Experience writing or editing high-quality content
  • Experience making fine-grained qualitative judgments between outputs
  • Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

What Success Looks Like

You'll produce clear, consistent evaluation artifacts that lead to measurable improvements in AI response quality. Your feedback will help ensure AI systems meet the highest standards before public release, directly impacting user experience for millions.

Work Arrangement

This is a flexible, remote contract position available as full-time or part-time work. Fluent English is required. Contract rates are competitive and aligned with expertise and scope of work.

Benture is an independent job board and is not affiliated with or employed by Mercor.

Tips for Applying to Mercor Jobs from Benture

Increase your chances of success!
1. Four Simple Steps

  • Upload resume
  • AI interview
  • Complete form
  • Submit application

2. Perfect Your Resume

Upload your best, up-to-date resume in English. Mercor will extract details and fill out your profile automatically. Review and adjust as needed.

3. Complete = Win

SHOCKING FACT: Only ~20% of applicants complete their application! Take the 15-minute AI interview about your experience and you'll have MUCH HIGHER chances of getting hired!

AI Interview Tips: The interview focuses on your resume and work experience. Be ready to discuss specific projects and how you solved challenges.

Takes about 15 minutes | Dramatically improves your chances
