Senior Software Engineer – LLM Eval at Turing

posted 1 hour ago
turing.com | Contractor | Remote in US | Varies

Senior Software Engineer – LLM Evaluation | Contractor | Remote (US-based)

Turing is seeking an experienced Senior Software Engineer to evaluate and improve large language models through high-quality dataset curation, code assessment, and AI-driven solution refinement. This flexible contractor role is ideal for engineers who have worked at the frontier of AI and want to directly shape the future of intelligent systems.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI. Turing accelerates frontier research with high-quality data, advanced training pipelines, and top AI researchers — and helps enterprises transform AI from proof of concept into measurable, lasting business impact.

Role Overview

As a Software Engineering Evaluator, you will create cutting-edge datasets used to train, benchmark, and advance large language models. You'll curate code examples, provide precise solutions, and refine AI-generated code — with a primary focus on Python, alongside JavaScript (ReactJS), C/C++, Java, Rust, and Go.

What You'll Do

  • Curate code examples, build solutions, and correct AI-generated code across Python, JavaScript, C/C++, Java, Rust, and Go.
  • Evaluate AI-generated code for efficiency, scalability, and reliability.
  • Collaborate with cross-functional teams to benchmark AI-driven coding solutions against industry standards.
  • Build agents and automated verification tools in Python to assess code quality and identify error patterns.
  • Analyze stages of the software engineering lifecycle — from prototyping and architecture design to production, monitoring, and maintenance — and evaluate model capabilities at each stage.
  • Design verification mechanisms to automatically validate solutions to software engineering tasks.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong expertise in Python, including frameworks, tooling, and production-grade best practices.
  • Experience building full-stack applications and deploying scalable software.
  • Deep understanding of software architecture, design, debugging, and code quality review.
  • Excellent written and verbal communication skills for structured evaluation rationales.

Ideal Background

This role is a strong fit for engineers with experience at frontier AI organizations such as OpenAI, NVIDIA, Databricks, Palantir, or Snowflake. Graduates from programs with strong CS foundations — including UW, UIUC, UT Austin, University of Michigan, and Purdue — are especially encouraged to apply, though exceptional skill and experience always take precedence.

Engagement Details

  • Type: Contractor (no medical or paid-leave benefits)
  • Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
  • Duration: 1 month, with potential extensions based on performance
  • Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. Apply today to contribute to the cutting edge of AI development.
