This job post expired on May 09, 2026. The position has likely been filled.

Senior Software Engineer – LLM Eval at Turing

posted 3 months ago

turing.com | Contractor | Remote in US | Varies

Senior Software Engineer – LLM Evaluation | Contractor | Remote (US Only)

Join Turing, the world's leading AI research accelerator, and help shape the future of large language models. In this role, you'll create high-quality datasets, evaluate AI-generated code, and collaborate with frontier AI researchers — all on a flexible contractor basis with a minimum of 10 hours per week.

About Turing

Based in San Francisco, Turing partners with frontier AI labs and global enterprises to accelerate AI research and deploy reliable, high-impact AI systems. Our expertise spans software engineering, logical reasoning, STEM, multilinguality, multimodality, and autonomous agents.

Role Overview

As a Software Engineering Evaluator, you will curate and refine code datasets used to train and benchmark large language models. Your work will directly influence the quality and capability of next-generation AI systems, with a strong focus on systems-level programming, performance-critical applications, and infrastructure.

What You'll Do

Curate code examples, build solutions, and correct code in Python, C/C++, Rust, Go, Java, and JavaScript (including ReactJS).
Evaluate and refine AI-generated code for correctness, performance, scalability, and reliability.
Build agents to verify the quality of systems-level and infrastructure code and identify error patterns.
Design automated verification mechanisms for software engineering tasks.
Collaborate with cross-functional teams to benchmark AI-driven coding solutions against industry standards.
Analyze and evaluate model capabilities across the full software engineering lifecycle — from prototyping and architecture design to production, monitoring, and maintenance.

Required Skills

3+ years of professional software engineering experience.
Strong expertise in systems programming, infrastructure, or backend development using Python, C/C++, Rust, or Go.
Proven experience building and deploying scalable, production-grade software.
Deep understanding of software architecture, design patterns, debugging, and code quality assessment.
Excellent written and verbal communication skills for producing clear, structured evaluation rationales.

Ideal Background

This role is a strong fit for engineers with experience at frontier AI or technology organizations such as OpenAI, NVIDIA, Databricks, Palantir, or Snowflake. Graduates from top-tier programs are welcome, though exceptional skill and experience always take precedence.

Engagement Details

Type: Contractor (no medical or paid leave benefits)
Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
Duration: 1 month, with potential extensions based on performance
Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. Apply today to contribute to cutting-edge AI research at the frontier of the field.

