
Senior Python Engineer – LLM Evaluation & Repository Validation | Contractor | Remote (India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico)
Join Turing, one of the world's fastest-growing AI companies, to help build high-quality LLM evaluation and training datasets. You'll work hands-on with real-world GitHub repositories, assess LLM performance on software engineering tasks, and contribute to the future of AI-assisted development.
We are constructing verifiable software engineering tasks derived from public repository histories using a synthetic, human-in-the-loop approach. The goal is to expand dataset coverage across programming languages, difficulty levels, and task types, and ultimately to train LLMs to solve realistic engineering problems.