
SwarmBench Task Engineer – Knowledge/Research | Contractor | Fully Remote | ~40 hrs/week
Turing is looking for a highly analytical, research-driven engineer to design and build multi-agent benchmark tasks focused on knowledge synthesis and large-scale document analysis. This is a short-term contract role (1 month, with potential for extension), ideal for someone with a strong research background and hands-on experience in AI evaluation.
Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing accelerates frontier research with high-quality data, advanced training pipelines, and top AI researchers specializing in coding, reasoning, STEM, multilinguality, multimodality, and agents.
You will craft challenging, insightful benchmark problems in your research domain and devise elegant computational solutions to them, pushing the limits of multi-agent AI systems. Your work will directly shape how AI agents are evaluated on complex, real-world research tasks.