
AI Safety & Red Team Expert at Mercor

posted 1 hour ago
mercor.com | Contractor | Remote | $20–22/hr

AI Safety & Red Team Expert (English & Marathi) | $20–22/hr | Worldwide Remote

Mercor is building a frontier red team of human data experts who probe AI models for vulnerabilities, surface adversarial risks, and generate the safety data that makes AI systems more trustworthy. This role requires native fluency in both English and Marathi. If you have a background in adversarial AI, cybersecurity, or socio-technical risk — and you instinctively push systems to their breaking points — this role is for you.

What You'll Do

  • Red team conversational AI models and agents using jailbreaks, prompt injections, misuse cases, bias exploitation, and multi-turn manipulation techniques
  • Generate high-quality human data by annotating failures, classifying vulnerabilities, and flagging systemic risks
  • Apply structured frameworks, taxonomies, and benchmarks to ensure consistent and reproducible testing
  • Produce clear reports, datasets, and attack case documentation that customers can act on

Who You Are

  • Experienced in red teaming — AI adversarial work, cybersecurity, or socio-technical probing
  • Naturally curious and adversarial: you find the edge cases others overlook
  • Structured and methodical: you use frameworks, not just random hacks
  • A clear communicator who can explain risks to both technical and non-technical audiences
  • Adaptable and comfortable moving across diverse projects and customer environments

Nice-to-Have Specialties

  • Adversarial ML: jailbreak datasets, prompt injection, RLHF/DPO attacks, model extraction
  • Cybersecurity: penetration testing, exploit development, reverse engineering
  • Socio-technical risk: harassment/disinfo probing, abuse analysis, conversational AI testing
  • Creative probing: psychology, acting, or writing skills applied to unconventional adversarial thinking

What Success Looks Like

  • You uncover vulnerabilities that automated tests miss
  • You deliver reproducible artifacts that strengthen customer AI systems
  • Evaluation coverage expands — more scenarios tested, fewer surprises in production
  • Mercor customers trust their AI because you've already probed it like an adversary

Please note: This project involves reviewing AI outputs on sensitive topics such as bias, misinformation, and harmful behaviors. All work is text-based. Participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources. Topics will be communicated clearly before any exposure.

How to apply for this role

  • Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
  • Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
  • Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.
