This job post expired on June 18, 2026. The position has likely been filled.

Senior Python Engineer – LLM Evaluation at Turing

posted 2 months ago

Senior Python Engineer – LLM Evaluation & Repository Validation | Contractor | Remote (India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico)

Join Turing, one of the world's fastest-growing AI companies, to help build high-quality LLM evaluation and training datasets. You'll work hands-on with real-world GitHub repositories, assess LLM performance on software engineering tasks, and contribute to the future of AI-assisted development.

About the Project

We are constructing verifiable software engineering tasks derived from public repository histories using a synthetic, human-in-the-loop approach. The goal is to expand dataset coverage across programming languages, difficulty levels, and task types — ultimately training LLMs to solve realistic engineering problems.

What You'll Do

Analyze and triage GitHub issues across trending open-source libraries.
Set up and configure code repositories, including Dockerization and environment automation.
Evaluate unit test coverage and overall test quality.
Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
Collaborate with researchers to identify repositories and issues that challenge LLMs.
Mentor and lead junior engineers on collaborative project work.

Required Skills

3+ years of professional software engineering experience.
Strong proficiency in Python.
Solid experience with Git, Docker, and software pipeline setup.
Ability to navigate and understand complex, real-world codebases.
Comfortable running, modifying, and testing projects in local environments.

Nice to Have

Prior involvement in LLM research or evaluation projects.
Experience building or testing developer tools or automation agents.
History of contributing to or evaluating open-source projects.

Engagement Details

Type: Contractor (no medical/paid leave benefits)
Duration: 3-month contract, starting next week
Commitment: 20, 30, or 40 hrs/week, with a minimum of 4 hrs/day and a required 4-hour overlap with PST
Location: Open to candidates in India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, or Mexico

Evaluation Process (~75 minutes)

Round 1: 60-minute technical interview
Round 2: 30-minute technical and cultural discussion
