This job post has expired on February 07, 2026. It is likely that the position has already been filled.

AI Chat Evaluator - English & Arabic at Mercor

posted 6 months ago

mercor.com | Contractor | Remote: EG/SA/AE/US | $22.64/hr | 600 views

AI Chat Evaluator | $22.64/hr | Remote (Egypt, Saudi Arabia, UAE, USA)

Join Mercor in shaping the future of conversational AI by evaluating and improving large language model responses. This bilingual role requires fluency in both English and Arabic to assess AI-generated content for accuracy, clarity, and helpfulness.

Why This Role Exists

Mercor partners with leading AI teams to enhance general-purpose conversational AI systems used by millions worldwide. Your expertise will directly improve how AI communicates with users across diverse topics and professional scenarios, ensuring responses are accurate, well-reasoned, and aligned with human expectations.

What You'll Do

Evaluate LLM-generated responses for effectiveness, accuracy, and conversational quality
Conduct thorough fact-checking using trusted public sources and external tools
Provide detailed annotations on response strengths, weaknesses, and factual inaccuracies
Assess reasoning quality, clarity, tone, and completeness of AI responses
Ensure model outputs align with conversational best practices and system guidelines
Apply consistent evaluation standards following detailed taxonomies and benchmarks

Who You Are

Hold a Bachelor's degree
Native speaker or C2-level fluency in Arabic (ILR 5/CEFR C2)
Significant experience using and understanding large language models
Excellent writing skills with ability to articulate nuanced feedback clearly
Strong attention to detail and ability to identify subtle issues
Adaptable across diverse topics, domains, and requirements
Background in structured analytical thinking (research, policy, analytics, linguistics, engineering)
Excellent college-level mathematics skills

Nice-to-Have Qualifications

Experience with RLHF, model evaluation, or data annotation
Background in writing or editing high-quality content
Experience making fine-grained qualitative judgments between multiple outputs
Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

What Success Looks Like

Identifying factual inaccuracies, reasoning errors, and communication gaps effectively
Producing clear, consistent, and reproducible evaluation artifacts
Contributing feedback that leads to measurable improvements in AI response quality
Helping ensure AI systems meet quality standards before public release

Why Join Mercor

Work at the frontier of human-in-the-loop AI development with flexible, remote contract opportunities. Your contributions will directly shape how advanced language models behave in real-world applications used by millions. Contract rates are competitive and aligned with expertise and scope of work.

How to apply for this role

Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.
