Agentic Coding Annotator at Turing

posted 1 hour ago
turing.com | Contractor | Remote | Varies

Agentic Coding Annotator | Contractor | Remote | ~5-Week Engagement

Turing is seeking experienced software practitioners to evaluate and improve datasets for agentic coding models. This is a technically demanding annotation role that requires real engineering maturity, not a basic data-labeling position. You'll work within an agentic coding harness, reviewing model trajectories, verifying solutions, and producing high-quality annotations that directly advance frontier AI capabilities.

About Turing

Turing is one of the world's fastest-growing AI companies, partnering with leading AI labs to advance frontier model capabilities in coding, reasoning, agentic behavior, and more. We build real-world AI systems that solve mission-critical challenges for top-tier organizations globally.

Role Overview

Depending on your assignment, work will fall into one of two tracks:

  • Online Evaluations: Interact with blinded models on predefined tasks, then rank and grade resulting trajectories.
  • Offline Evaluations: Design realistic coding tasks, calibrate them via user simulation, write task-specific rubrics, and grade generated trajectories.

Day-to-Day Responsibilities

  • Execute realistic coding tasks within the agentic coding harness while maintaining model blindness and session independence
  • Verify model outputs by reading code, running commands, checking logs, and inspecting generated artifacts
  • Perform targeted validation using tests, scripts, and manual checks
  • Write clear, evidence-based rationales for trajectory rankings and assessments
  • Design multi-step coding tasks with structured user intent and milestone breakdowns (offline track)
  • Create and refine task-specific rubrics and binary evaluation criteria
  • Review completed work for quality, consistency, and schema compliance
  • Escalate broken environments or process gaps with clear supporting evidence

Required Qualifications

  • 5+ years of experience in software engineering, QA, developer tooling, data/ML engineering, or similar code-heavy roles
  • Strong hands-on proficiency in at least 1–2 languages or ecosystems, such as Python, JavaScript/TypeScript, Rust, Java, C/C++, Bash, Haskell, Swift, or SQL
  • Ability to read unfamiliar codebases, debug issues, interpret test output, and evaluate functional correctness

Preferred Qualifications (Offline / Senior Candidates)

  • Strong Docker skills and experience building reproducible environments
  • Experience navigating large, complex repositories
  • Demonstrated engineering judgment in defining non-trivial technical problems
  • Ability to design realistic tasks that go beyond tutorials or simple bug fixes

Engagement Details

  • Commitment: 8 hours/day with a 4-hour overlap with PST
  • Type: Contractor (no medical/paid leave included)
  • Duration: ~5 weeks, starting next week
  • Compensation: Competitive, based on experience and project scope
