Senior Software Engineer – LLM Eval at Turing

posted 1 hour ago
turing.com | Contractor | Remote in US | Varies

Senior Software Engineer – LLM Evaluation | Contractor | Remote (US Only)

Turing is seeking an experienced Senior Software Engineer to evaluate and improve large language models through high-quality dataset creation, code curation, and AI-generated code assessment. This is a flexible contractor engagement (10–40 hrs/week) ideal for engineers with a strong background in systems programming and production-grade software development.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing supports frontier research teams with high-quality data, advanced training pipelines, and top-tier AI researchers, and helps enterprises turn AI proofs of concept into measurable, lasting business impact.

What You'll Do

  • Curate code examples, build solutions, and correct code across Python, C/C++, Rust, Go, Java, and JavaScript (including ReactJS).
  • Evaluate and refine AI-generated code with a focus on systems-level correctness, performance, scalability, and reliability.
  • Collaborate with cross-functional teams to benchmark and improve AI-driven coding solutions.
  • Build agents capable of verifying the quality of systems-level and infrastructure code and identifying error patterns.
  • Form hypotheses about the stages of the software engineering lifecycle, from prototyping and architecture design to production, monitoring, and maintenance, and evaluate model capabilities across them.
  • Design automated verification mechanisms to validate solutions to software engineering tasks.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong expertise in systems programming, infrastructure, or backend development (Python, C/C++, Rust, Go).
  • Proven experience building and deploying scalable, production-grade software.
  • Deep understanding of software architecture, design patterns, debugging, and code quality assessment.
  • Excellent written and verbal communication skills, including the ability to write clear, structured evaluation rationales.

Engagement Details

  • Type: Contractor (no medical/paid leave benefits)
  • Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
  • Duration: 1 month, with potential extensions based on performance
  • Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview.
