Chemistry Expert | AI Training Data | Worldwide Remote | 5-Week Contract
Turing is seeking a skilled Chemistry Expert to design high-quality scientific reasoning datasets used to train and evaluate frontier Large Language Models (LLMs). This is a fully remote, 5-week contractor engagement requiring at least 40 hours per week, with at least 4 hours of daily overlap with PST.
About Turing
Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing accelerates frontier research through high-quality data, advanced training pipelines, and top-tier AI researchers specializing in coding, reasoning, STEM, multilinguality, multimodality, and agents.
Role Overview
In this role, you will create structured, multi-step scientific reasoning tasks that challenge AI models to analyze experimental or simulated data, identify patterns, infer formulas or laws, and apply them to novel scenarios. Your work will directly contribute to improving the scientific reasoning capabilities of cutting-edge LLMs.
Key Responsibilities
- Design scientific scenarios using experimental data, observational records, or simulated systems, including fictional but internally consistent environments.
- Author multi-step reasoning tasks requiring models to analyze data, identify relationships, infer governing rules, estimate parameters, and predict outcomes.
- Develop problem types including law induction from data, parameter recovery, model selection, predictive reasoning, and impossible-scenario detection.
- Create clear context blocks, prompts, reference solutions, and scoring rubrics for each task.
- Anticipate edge cases such as noisy data, boundary conditions, impossible states, missing assumptions, and unit inconsistencies.
- Collaborate with reviewers and LLM engineers to improve task clarity, scientific accuracy, and reproducibility.
- Maintain rigorous quality standards with strong attention to logic, precision, and internal consistency.
Qualifications
- 3+ years of experience in a scientific, research, or analytical role (e.g., chemistry, physics, scientific computing, or data analysis).
- Strong background in scientific reasoning, quantitative modeling, and data interpretation.
- Familiarity with experimental design, mathematical modeling, parameter estimation, dimensional analysis, or simulation-based reasoning.
- Proven ability to design structured reasoning problems and communicate them clearly in writing.
- Strong attention to detail regarding assumptions, ambiguity control, and consistency between data, solutions, and rubrics.
- Experience working with or evaluating LLMs is a plus.
- Degree in Chemistry, Physics, Biology, or a related science discipline preferred.
Deliverables
Each submitted task must include:
- A formal context block with assumptions, variables, definitions, and observations.
- A prompt asking the model to infer, compute, justify, predict, or explain.
- An expected output with a deterministic final answer and a reference solution.
- A rubric for evaluating correctness, reasoning quality, and completeness.
All tasks must be self-contained, reproducible, and aligned with model reasoning evaluation goals.
Engagement Details
- Type: Contractor
- Duration: 5 weeks
- Hours: Minimum 40 hours/week, with at least 4 hours/day of overlap with PST
Evaluation Process
Shortlisted candidates will receive a Job Interest Form and will be contacted to discuss pre-onboarding requirements.