
AI Evaluation Engineer (Python / Java / Web) | Contractor | 40 hrs/week | Worldwide Remote
Turing is seeking experienced Software Engineers to join its AI Evaluation team as AI Benchmark Authors. In this role, you will design and author high-quality evaluation tasks that measure the capabilities of advanced AI agents in real-world software development scenarios. Your work will directly influence the benchmarking of frontier AI models used by leading AI research organizations.