
Senior Python Engineer – LLM Eval at Turing

posted 1 hour ago
turing.com | Contractor | Remote (US) | Varies

Senior Python Engineer – LLM Evaluation | Contractor | Remote (US Only)

Turing is seeking an experienced Senior Python Engineer to join its LLM Evaluation team. In this role, you will help shape the future of AI by creating high-quality datasets, evaluating AI-generated code, and collaborating with researchers to advance large language models. This is a flexible contractor engagement requiring a minimum of 10 hours per week, ideal for seasoned engineers with production-scale experience.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing helps accelerate frontier research with high-quality data, advanced training pipelines, and top AI researchers — and applies that expertise to help enterprises transform AI from proof of concept into measurable business impact.

What You'll Do

  • Curate code examples, build solutions, and correct code across Python, JavaScript (React, Node.js), C/C++, Java, Rust, and Go to support AI model training initiatives.
  • Evaluate and refine AI-generated code across backend and frontend contexts for efficiency, scalability, and reliability.
  • Collaborate with cross-functional teams to benchmark and enhance AI-driven coding solutions against industry standards.
  • Build agents capable of verifying code quality and identifying error patterns across full-stack applications.
  • Form hypotheses about each stage of the software engineering lifecycle, from prototyping and architecture design to production, launch, and monitoring, and evaluate model capabilities at each stage.
  • Design automated verification mechanisms to validate solutions to software engineering tasks.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong expertise in full-stack development using Python and JavaScript (React, Node.js).
  • Proven experience deploying scalable, production-grade software with modern languages and tools.
  • Deep understanding of software architecture, design, debugging, and code quality assessment.
  • Excellent written and verbal communication skills for producing clear, structured evaluation rationales.

Engagement Details

  • Type: Contractor (no medical/paid leave benefits)
  • Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
  • Duration: 1 month, with potential extensions based on performance
  • Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes completion of an AI video interview.
