This job post expired on July 04, 2026. The position has likely already been filled.

AI Safety & Red Team Expert at Mercor

posted 1 month ago

mercor.com | Contractor | Remote | $20-22/hr | 268 views

AI Safety & Red Team Expert (English & Marathi) | $20–22/hr | Worldwide Remote

Mercor is building a frontier red team of human data experts who probe AI models for vulnerabilities, surface adversarial risks, and generate the safety data that makes AI systems more trustworthy. This role requires native fluency in both English and Marathi. If you have a background in adversarial AI, cybersecurity, or socio-technical risk — and you instinctively push systems to their breaking points — this role is for you.

What You'll Do

Red team conversational AI models and agents using jailbreaks, prompt injections, misuse cases, bias exploitation, and multi-turn manipulation techniques
Generate high-quality human data by annotating failures, classifying vulnerabilities, and flagging systemic risks
Apply structured frameworks, taxonomies, and benchmarks to ensure consistent and reproducible testing
Produce clear reports, datasets, and attack case documentation that customers can act on

Who You Are

Experienced in red teaming — AI adversarial work, cybersecurity, or socio-technical probing
Naturally curious and adversarial: you find the edge cases others overlook
Structured and methodical: you use frameworks, not just random hacks
A clear communicator who can explain risks to both technical and non-technical audiences
Adaptable and comfortable moving across diverse projects and customer environments

Nice-to-Have Specialties

Adversarial ML: jailbreak datasets, prompt injection, RLHF/DPO attacks, model extraction
Cybersecurity: penetration testing, exploit development, reverse engineering
Socio-technical risk: harassment/disinfo probing, abuse analysis, conversational AI testing
Creative probing: psychology, acting, or writing skills applied to unconventional adversarial thinking

What Success Looks Like

You uncover vulnerabilities that automated tests miss
You deliver reproducible artifacts that strengthen customer AI systems
Evaluation coverage expands — more scenarios tested, fewer surprises in production
Mercor customers trust their AI because you've already probed it like an adversary

Please note: This project involves reviewing AI outputs on sensitive topics such as bias, misinformation, and harmful behaviors. All work is text-based. Participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources. Topics will be communicated clearly before any exposure.

How to apply for this role

Upload your resume — keep it up-to-date and in English. Mercor will auto-fill your profile from it.
Complete the AI interview — a 15-minute conversation about your experience. Be ready to discuss specific projects and challenges you've solved.
Submit your application — only about 20% of applicants finish all the steps, so completing yours puts you well ahead.

Benture is an independent job board and is not affiliated with Mercor.