This job post expired on February 08, 2026. The position has likely already been filled.

AI Generalist - English Language Evaluation at Mercor

posted 6 months ago

mercor.com | Contractor | Remote: US/UK/CA | $45/hour

AI Generalist - English Language Evaluation | $45/hour | Remote (US, UK, Canada)

Mercor is seeking skilled generalists to evaluate and improve conversational AI systems used by millions worldwide. This flexible contract role puts you at the forefront of human-in-the-loop AI development, where your expertise directly shapes how advanced language models communicate.

Why This Role Exists

We partner with leading AI teams to enhance the quality, usefulness, and reliability of general-purpose conversational AI systems. Your work will ensure these models respond accurately, clearly, and helpfully across diverse real-world scenarios.

What You'll Do

Evaluate LLM-generated responses for accuracy, clarity, and effectiveness
Conduct fact-checking using trusted public sources and external tools
Generate high-quality human evaluation data through detailed annotations
Assess reasoning quality, tone, completeness, and conversational alignment
Apply consistent annotations following clear taxonomies and evaluation guidelines
Identify factual inaccuracies, reasoning errors, and communication gaps

Who You Are

Hold a Bachelor's degree
Have significant experience using large language models (LLMs)
Possess excellent writing skills, with the ability to articulate nuanced feedback
Demonstrate strong attention to detail and an ability to catch subtle issues
Are adaptable and comfortable working across diverse topics and domains
Have a background in structured analytical thinking (research, policy, analytics, linguistics, engineering)
Possess excellent college-level mathematics skills

Nice-to-Have Specialties

Prior experience with RLHF, model evaluation, or data annotation
Experience writing or editing high-quality content
Experience making fine-grained qualitative judgments between outputs
Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

What Success Looks Like

You'll produce clear, consistent evaluation artifacts that lead to measurable improvements in AI response quality. Your feedback will help ensure AI systems meet the highest standards before public release, directly impacting user experience for millions.

Work Arrangement

This is a flexible, remote contract position available as full-time or part-time work. Fluent English is required. Contract rates are competitive and aligned with expertise and scope of work.

How to apply for this role

Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.
