Senior Software Engineer – LLM Eval at Turing

posted 1 hour ago
turing.com | Contractor | Remote in US | Varies

Senior Software Engineer – LLM Evaluation | Contractor | Remote (US Only)

Turing is seeking an experienced Senior Software Engineer to evaluate and improve large language model outputs, with a focus on code quality, software architecture, and AI-driven development tools. This is a flexible contractor engagement (10–40 hrs/week) ideal for engineers who thrive in fast-paced, high-impact environments.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing helps frontier research teams build high-quality training data and supports enterprises in transforming AI from proof of concept into reliable, measurable intelligence.

Role Overview

As a Software Engineering Evaluator, you will create cutting-edge datasets used to train, benchmark, and advance large language models. You'll curate code examples, provide precise solutions, and refine AI-generated code — primarily in Python, with additional work across JavaScript (ReactJS), C/C++, Java, Rust, and Go.

What You'll Do

  • Curate code examples and build solutions to support AI model training initiatives across multiple languages.
  • Evaluate and refine AI-generated code for efficiency, scalability, and reliability.
  • Build agents and automated verification tools in Python to assess code quality and identify error patterns.
  • Collaborate with cross-functional teams to benchmark AI-driven coding solutions against industry standards.
  • Analyze and form hypotheses about each stage of the software engineering lifecycle, from prototyping and architecture design to production, monitoring, and maintenance.
  • Design automated verification mechanisms to validate solutions to software engineering tasks.

Required Skills

  • 3+ years of professional software engineering experience.
  • Strong expertise in Python, including frameworks, tooling, and production-grade best practices.
  • Experience building full-stack applications and deploying scalable software systems.
  • Deep understanding of software architecture, design patterns, debugging, and code review.
  • Clear, structured written and verbal communication, particularly when writing evaluation rationales.

Engagement Details

  • Type: Contractor (no medical/paid leave benefits)
  • Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
  • Duration: 1 month, with potential extensions based on performance
  • Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. Apply today to contribute to the future of AI development at one of the most innovative companies in the space.
