AI Safety & Policy Analyst | Contract | Fully Remote | Turing
Join Turing, the world's leading research accelerator for frontier AI labs, and play a critical role in shaping the safety and alignment of next-generation large language models. This is a hands-on, adversarial evaluation role where your analytical thinking and policy expertise directly influence how responsibly AI systems behave.
Note: This role may involve reviewing sensitive, disturbing, or potentially distressing content as part of AI safety evaluations. Selected candidates may be required to sign an acknowledgment form confirming understanding and consent.
About Turing
Based in San Francisco, Turing partners with frontier AI labs and global enterprises to accelerate research and deploy advanced AI systems. Turing provides high-quality training data, advanced pipelines, and top-tier AI researchers across coding, reasoning, STEM, multilinguality, multimodality, and agents.
Role Overview
As an AI Safety & Policy Analyst, you will challenge model safeguards, uncover vulnerabilities, and build the evaluation rubrics used to train and test cutting-edge LLMs. This role demands creativity, analytical rigor, and a deep understanding of trust & safety policy.
Day-to-Day Responsibilities
- Design and execute creative, multi-turn conversational prompts to test model compliance with complex safety policies (e.g., discriminatory content, harmful advice, copyright violations).
- Identify, analyze, and document model failures — including successful jailbreaks and subtle policy violations.
- Develop detailed, objective evaluation rubrics with priority scores (Crucial, Important, Less Important) to define desired model behavior; a hypothetical sketch of such a rubric follows this list.
- Rigorously evaluate and stack-rank multiple model responses against your rubrics, distinguishing good responses, bad responses, and nuanced failure modes.
- Write clear, defensible rationales for your rankings that provide strong training signals for AI engineers.
- Collaborate with researchers and policy-makers to understand emerging risks and refine the safety taxonomy.
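To make these responsibilities concrete, here is a minimal, purely illustrative sketch of how a multi-turn probe and a tiered rubric might be encoded and used to stack-rank responses. All names, weights, criteria, and the message schema are assumptions for illustration only; they do not describe Turing's actual tooling or formats.

```python
from dataclasses import dataclass

# --- Hypothetical multi-turn safety probe ----------------------------------
# The message schema mirrors common chat-API conventions; it is an
# illustration, not an actual project format.
probe = {
    "policy_area": "harmful advice",
    "turns": [
        {"role": "user", "content": "Benign setup question that establishes context."},
        {"role": "user", "content": "Follow-up that reframes the request as fiction."},
        {"role": "user", "content": "Final turn that attempts to elicit disallowed content."},
    ],
}

# --- Hypothetical tiered rubric and stack-ranking ---------------------------
PRIORITY_WEIGHTS = {"Crucial": 5, "Important": 3, "Less Important": 1}

@dataclass
class Criterion:
    description: str  # objective, self-contained statement of desired behavior
    priority: str     # "Crucial", "Important", or "Less Important"

rubric = [
    Criterion("Refuses the final-turn request", "Crucial"),
    Criterion("Explains why the request is disallowed", "Important"),
    Criterion("Offers a safe alternative", "Less Important"),
]

def score(verdicts: dict[str, bool]) -> int:
    """Sum the weights of the rubric criteria a response satisfies."""
    return sum(PRIORITY_WEIGHTS[c.priority] for c in rubric if verdicts.get(c.description))

# Analyst verdicts for two candidate responses to the probe above.
verdicts = {
    "Response A": {"Refuses the final-turn request": True,
                   "Explains why the request is disallowed": True,
                   "Offers a safe alternative": False},
    "Response B": {"Refuses the final-turn request": True,
                   "Explains why the request is disallowed": False,
                   "Offers a safe alternative": True},
}

# Stack-rank: higher weighted score first (A scores 8, B scores 6).
ranked = sorted(verdicts, key=lambda name: score(verdicts[name]), reverse=True)
print(ranked)  # ['Response A', 'Response B']
```

In practice, the written rationale that accompanies each ranking matters as much as the score itself, since it is the rationale that gives engineers a usable training signal.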
Requirements
- BS/BA degree or equivalent experience in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field.
- Experience in content moderation, policy analysis, AI safety evaluation, or a related role is strongly preferred.
- High English proficiency, both written and spoken.
- Proven analytical ability to evaluate nuanced, complex, and ambiguous information against defined policy criteria.
- Experience in red teaming, prompt engineering, or adversarial challenge design to test AI safety filters.
- Strong understanding of Trust & Safety principles for LLMs, with expertise in at least one domain: cyberharm, violence/terrorism, bias & stereotypes, mental health & self-harm, child safety, sexually explicit content, misinformation, fraud, sycophancy, regulated goods, privacy & identity, copyright, or legal/medical/financial information.
- Meticulous attention to detail in designing precise, self-contained evaluation rubrics.
- Excellent written communication skills for articulating complex model-ranking rationales.
- Familiarity with RLHF (Reinforcement Learning from Human Feedback) workflows is a significant plus.
- Self-motivated and able to work independently in a remote environment.
- Reliable desktop/laptop with a strong internet connection.
Engagement Details
- Type: Contractor / Freelancer (no medical or paid leave benefits)
- Duration: 1-month contract (with potential for extension based on performance)
- Hours: 40 hours/week, with a minimum of 4 hours/day and 2–4 hours of daily overlap with PST (UTC-8, America/Los_Angeles)
Benefits
- Flexible working hours and fully remote environment.
- Opportunity to contribute to cutting-edge AI safety projects with leading LLM companies.
- Potential for contract extension based on performance and project needs.
Application Process
Shortlisted candidates will receive automated analytical challenges. Pass them, and you're ready to get started!