This job post expired on May 09, 2026. It is likely that the position has already been filled.

Senior Python Engineer – LLM Eval at Turing

posted 3 months ago

turing.com | Contractor | Remote in US | Varies | 434 views

Senior Python Engineer – LLM Evaluation | Contractor | Remote (US Only)

Turing is seeking a Senior Python Engineer to join its LLM Evaluation team, helping shape the future of frontier AI by building and evaluating high-quality datasets used to train and benchmark large language models. This is a flexible contractor engagement (10–40 hrs/week) ideal for engineers with deep systems programming expertise who want meaningful impact in the AI space.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing accelerates frontier research through high-quality data, advanced training pipelines, and top-tier AI researchers — and applies that expertise to help enterprises turn AI proof-of-concepts into reliable, production-grade intelligence.

What You'll Do

Curate code examples, build precise solutions, and correct code across Python, C/C++, Rust, Go, Java, and JavaScript (including ReactJS).
Evaluate and refine AI-generated code with a focus on systems-level correctness, performance, scalability, and reliability.
Collaborate with cross-functional teams to benchmark and improve AI-driven coding solutions against industry standards.
Build agents capable of verifying the quality of systems-level and infrastructure code, and identifying error patterns.
Analyze and evaluate model capabilities across the full software engineering lifecycle — from prototyping and architecture design to production, monitoring, and maintenance.
Design automated verification mechanisms to validate solutions to complex software engineering tasks.

Required Skills

3+ years of professional software engineering experience.
Strong expertise in systems programming, infrastructure, or backend development using Python, C/C++, Rust, or Go.
Proven experience building and deploying scalable, production-grade software.
Deep understanding of software architecture, design patterns, debugging, and code quality assessment.
Excellent written and verbal communication skills for producing clear, structured evaluation rationales.

Engagement Details

Type: Contractor (no medical/paid leave benefits)
Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
Duration: 1 month, with potential extensions based on performance
Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. We look forward to learning more about your background and experience.
