
Senior Python Engineer – LLM Eval at Turing

posted 2 hours ago
turing.com | Contractor | Remote (US) | Varies

Senior Python Engineer – LLM Evaluation | Contractor | Remote (US Only)

Turing is seeking an experienced Senior Python Engineer to join its LLM Evaluation team. In this role, you will help shape the future of AI by creating high-quality datasets, evaluating AI-generated code, and collaborating with researchers to advance large language models. This is a flexible contractor engagement requiring a minimum of 10 hours per week, ideal for seasoned engineers with a background in high-scale production environments.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing accelerates frontier research through high-quality data, advanced training pipelines, and top AI researchers — and applies that expertise to help enterprises transform AI from proof of concept into measurable business impact.

What You'll Do

  • Curate code examples, build solutions, and correct code across Python, JavaScript (React, Node.js), C/C++, Java, Rust, and Go to support AI model training initiatives.
  • Evaluate and refine AI-generated code across backend and frontend contexts for efficiency, scalability, and reliability.
  • Collaborate with cross-functional teams to benchmark and enhance AI-driven coding solutions against industry standards.
  • Build agents that verify code quality and identify error patterns across full-stack applications.
  • Form hypotheses about the stages of the software engineering lifecycle (prototyping, architecture design, production, monitoring, and maintenance) and evaluate model capabilities at each stage.
  • Design automated verification mechanisms to validate solutions to complex software engineering tasks.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong expertise in full-stack development using Python and JavaScript (React, Node.js).
  • Proven experience deploying scalable, production-grade software with modern languages and tools.
  • Deep understanding of software architecture, design patterns, debugging, and code quality assessment.
  • Excellent written and verbal communication skills for producing clear, structured evaluation rationales.

Engagement Details

  • Type: Contractor (no medical/paid leave benefits)
  • Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
  • Duration: 1 month, with potential extensions based on performance
  • Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. We welcome applicants from top-tier engineering backgrounds and leading academic institutions, though exceptional skill and experience always take precedence.
