Biology Expert (AI Research) | Contractor | Worldwide Remote
Turing is seeking a highly skilled Biology Expert to help push the boundaries of frontier AI. In this role, you will design and evaluate advanced, graduate-level biology questions used to benchmark and improve state-of-the-art AI models such as Google Gemini and OpenAI's O3 Pro. This is a fully remote contractor engagement ideal for researchers with a Master's or Ph.D. in a biology-related field.
About Turing
Based in San Francisco, CA, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing helps accelerate frontier research through high-quality data, advanced training pipelines, and top-tier AI researchers specializing in coding, reasoning, STEM, multilinguality, multimodality, and agents.
What You'll Do Day-to-Day
You'll solve and explain advanced biology problems, integrating both text and visuals. A typical day may include:
- Explaining gene expression regulation with annotated diagrams of transcription and translation processes.
- Solving Mendelian and population genetics problems using Punnett squares, Hardy-Weinberg equations, and evolutionary models.
- Describing physiological systems (e.g., circulatory, respiratory) using integrated anatomical illustrations.
- Answering complex questions on biochemistry, enzymatic reactions, and metabolic pathways using narrative and schematic models.
Key Responsibilities
- Develop High-Level Evaluation (HLE) Questions: Create challenging, novel questions requiring advanced reasoning and specialized knowledge to identify gaps in current AI capabilities.
- Ensure Domain Expertise: Design questions demanding graduate-level depth and precision across highly specialized biology topics.
- Identify Headroom: Formulate questions that challenge Gemini 2.5 Pro and, ideally, OpenAI's O3 Pro as well.
- Ensure Automatic Verifiability: Construct questions with single, definitive, concise answers to enable objective evaluation.
- Incorporate High-Quality Visual Inputs: Develop multimodal questions using clear, high-resolution images.
- Promote Diversity: Contribute to a varied dataset of topics and question types, avoiding over-representation of specific failure modes.
- Require Expert-Level Reasoning: Design questions that demand genuine expert reasoning rather than simple keyword lookups.
- Evaluate Model Responses via Eduarena: Set up side-by-side comparisons between Gemini 2.5 Pro and O3 Pro, and test against O3-DeepResearch.
- Provide Detailed Feedback: Document model failures, explain reasoning discrepancies, and provide correct answers with concise solutions.
- Maintain Detailed Records: Accurately log questions, solutions, final answers, and evaluation session links in a shared tracking sheet.
Offer Details
- Time Commitment: 30 or 40 hours/week (30 minimum), with at least 4 hours of daily overlap with Pacific Time (PST).
- Engagement Type: Contractor/Freelancer (no medical benefits or paid leave).
- Contract Duration: 1 month.
Perks of Freelancing With Turing
- Fully remote work environment.
- Opportunity to contribute to cutting-edge AI research with leading LLM companies.
- Potential for contract extension based on performance and project needs.
Evaluation Process
- Shortlisted candidates will receive a job interest form.
- Based on their responses to the form, selected candidates will be contacted to discuss pre-onboarding requirements.