This job post expired on May 09, 2026. The position has likely been filled.

Senior Python Engineer – LLM Eval at Turing

posted 3 months ago

turing.com | Contractor | Remote in US | Varies

Senior Python Engineer – LLM Evaluation | Contractor | Remote (US Only)

Join Turing, the world's leading AI research accelerator, as a Senior Python Engineer focused on LLM Evaluation. In this role, you'll help shape the future of large language models by building high-quality datasets, evaluating AI-generated code, and collaborating with frontier AI researchers. This is a flexible contractor engagement (10–40 hrs/week) ideal for experienced engineers with a background at top AI-driven organizations.

About Turing

Headquartered in San Francisco, Turing accelerates frontier AI research and helps global enterprises deploy advanced AI systems with measurable, lasting impact. We partner with leading AI labs and enterprises to transform proof-of-concept AI into proprietary, production-grade intelligence.

What You'll Do

Curate code examples, build solutions, and correct code across Python, JavaScript (React, Node.js), C/C++, Java, Rust, and Go for AI model training initiatives.
Evaluate and refine AI-generated code for efficiency, scalability, and reliability across backend and frontend contexts.
Collaborate with cross-functional teams to benchmark and enhance AI-driven coding solutions.
Build agents to verify code quality and identify error patterns across full-stack applications.
Analyze and evaluate model capabilities across the full software engineering lifecycle — from prototyping and architecture design to production, monitoring, and maintenance.
Design automated verification mechanisms for software engineering task solutions.

Required Skills

3+ years of professional software engineering experience.
Strong expertise in full-stack development using Python and JavaScript (React, Node.js).
Proven experience deploying scalable, production-grade software with modern tools and languages.
Deep understanding of software architecture, design patterns, debugging, and code quality assessment.
Excellent written and verbal communication skills for structured, clear evaluation rationales.

Ideal Background

This role is best suited for engineers with experience at frontier AI organizations such as OpenAI, NVIDIA, Databricks, Palantir, or Snowflake, or graduates from top CS programs including Stanford, MIT, Carnegie Mellon, UC Berkeley, or Georgia Tech. Exceptional skill and experience always take precedence over pedigree.

Engagement Details

Type: Contractor (no medical or paid-leave benefits)
Commitment: Flexible — minimum 10 hrs/week, up to 40 hrs/week
Duration: 1 month, with potential extensions based on performance
Location: Must be based in the United States

Application Process

The application takes approximately 15–30 minutes and includes an AI video interview. Apply today to contribute to cutting-edge AI research at the frontier of intelligent systems.
