AI Chat Evaluator - English & Arabic at Mercor

posted 21 hours ago
mercor.com | Contractor | Remote: EG/SA/AE/US | $22.64/hr

AI Chat Evaluator | $22.64/hr | Remote (Egypt, Saudi Arabia, UAE, USA)

Join Mercor in shaping the future of conversational AI by evaluating and improving large language model responses. This bilingual role requires fluency in both English and Arabic to assess AI-generated content for accuracy, clarity, and helpfulness.

Why This Role Exists

Mercor partners with leading AI teams to enhance general-purpose conversational AI systems used by millions worldwide. Your expertise will directly improve how AI communicates with users across diverse topics and professional scenarios, ensuring responses are accurate, well-reasoned, and aligned with human expectations.

What You'll Do

  • Evaluate LLM-generated responses for effectiveness, accuracy, and conversational quality
  • Conduct thorough fact-checking using trusted public sources and external tools
  • Provide detailed annotations on response strengths, weaknesses, and factual inaccuracies
  • Assess reasoning quality, clarity, tone, and completeness of AI responses
  • Ensure model outputs align with conversational best practices and system guidelines
  • Apply consistent evaluation standards following detailed taxonomies and benchmarks

Who You Are

  • Hold a Bachelor's degree
  • Native speaker or C2-level fluency in Arabic (ILR 5/CEFR C2)
  • Significant experience using and understanding large language models
  • Excellent writing skills with ability to articulate nuanced feedback clearly
  • Strong attention to detail and ability to identify subtle issues
  • Adaptable across diverse topics, domains, and requirements
  • Background in structured analytical thinking (research, policy, analytics, linguistics, engineering)
  • Excellent college-level mathematics skills

Nice-to-Have Qualifications

  • Experience with RLHF, model evaluation, or data annotation
  • Background in writing or editing high-quality content
  • Experience making fine-grained qualitative judgments between multiple outputs
  • Familiarity with evaluation rubrics, benchmarks, or quality scoring systems

What Success Looks Like

  • Identifying factual inaccuracies, reasoning errors, and communication gaps effectively
  • Producing clear, consistent, and reproducible evaluation artifacts
  • Contributing feedback that leads to measurable improvements in AI response quality
  • Helping ensure AI systems meet quality standards before public release

Why Join Mercor

Work at the frontier of human-in-the-loop AI development with flexible, remote contract opportunities. Your contributions will directly shape how advanced language models behave in real-world applications used by millions. Competitive contract rates aligned with expertise and scope of work.

Benture is an independent job board and is not affiliated with or employed by Mercor.

Tips for Applying to Mercor Jobs from Benture

Increase your chances of success!
1. Four Simple Steps

Upload resume | AI interview | Complete form | Submit application

2. Perfect Your Resume

Upload your best, up-to-date resume in English. Mercor will extract details and fill out your profile automatically. Review and adjust as needed.

3. Complete = Win

SHOCKING FACT: Only ~20% of applicants complete their application! Take the 15-minute AI interview about your experience and you'll have MUCH HIGHER chances of getting hired!

AI Interview Tips: The interview focuses on your resume and work experience. Be ready to discuss specific projects and how you solved challenges.

Takes about 15 minutes | Dramatically improves your chances
