Senior Python Engineer – LLM Evaluation at Turing

posted 22 hours ago

Senior Python Engineer – LLM Evaluation & Repository Validation | Contractor | Remote (India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Mexico)

Join Turing, one of the world's fastest-growing AI companies, to help build high-quality LLM evaluation and training datasets. You'll work hands-on with real-world GitHub repositories, assess LLM performance on software engineering tasks, and contribute to the future of AI-assisted development.

About the Project

We are constructing verifiable software engineering tasks derived from public repository histories using a synthetic, human-in-the-loop approach. The goal is to expand dataset coverage across programming languages, difficulty levels, and task types — ultimately training LLMs to solve realistic engineering problems.

What You'll Do

  • Analyze and triage GitHub issues across trending open-source libraries.
  • Set up and configure code repositories, including Dockerization and environment automation.
  • Evaluate unit test coverage and overall test quality.
  • Modify and run codebases locally to assess LLM performance in bug-fixing scenarios.
  • Collaborate with researchers to identify repositories and issues that challenge LLMs.
  • Mentor and lead junior engineers on collaborative project work.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong proficiency in Python.
  • Solid experience with Git, Docker, and software pipeline setup.
  • Ability to navigate and understand complex, real-world codebases.
  • Comfortable running, modifying, and testing projects in local environments.

Nice to Have

  • Prior involvement in LLM research or evaluation projects.
  • Experience building or testing developer tools or automation agents.
  • History of contributing to or evaluating open-source projects.

Engagement Details

  • Type: Contractor (no medical/paid leave benefits)
  • Duration: 3-month contract, starting next week
  • Commitment: 20, 30, or 40 hrs/week, at least 4 hrs/day, with a 4-hour overlap with PST required
  • Location: Open to candidates in India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, or Mexico

Evaluation Process (~75 minutes)

  • Round 1: 60-minute technical interview
  • Round 2: 30-minute technical and cultural discussion
