
Agentic Coding Annotator at Turing

posted 2 hours ago
turing.com | Contractor | Remote | Varies

Agentic Coding Annotator | Contractor | Remote | ~8 hrs/day with PST overlap

Turing is seeking experienced software practitioners to evaluate and improve datasets for agentic coding models. This is a high-precision, engineering-heavy annotation role — not a basic data labeling position. You'll work within real agentic coding environments, review model trajectories, verify solutions, and produce detailed, evidence-based annotations.

About Turing

Turing is one of the world's fastest-growing AI companies, partnering with leading AI labs to advance frontier model capabilities in reasoning, coding, agentic behavior, and more. We build real-world AI systems that solve mission-critical challenges for top organizations globally.

What You'll Do

  • Execute realistic coding tasks within agentic coding harnesses while maintaining model blindness and session independence
  • Verify model outputs by reading code, running commands, checking logs, and validating generated artifacts
  • Write clear, specific, evidence-based rationales for trajectory rankings and assessments
  • Design multi-step coding tasks (offline work), including user intent, milestones, and expected behaviors
  • Create and refine task-specific rubrics and binary evaluation criteria
  • Review completed work for quality, consistency, completeness, and schema compliance
  • Identify and escalate broken environments, unclear instructions, or process gaps with supporting evidence

Requirements

  • 5+ years of experience in software engineering, QA, developer tooling, data/ML engineering, or similar code-heavy roles
  • Strong hands-on experience in at least one or two programming languages (Python, JavaScript/TypeScript, Rust, Java, C/C++, Bash, Haskell, Swift, SQL, etc.)
  • Ability to read unfamiliar codebases, debug issues, run tests/scripts, and evaluate functional correctness
  • Proficient in Linux/Ubuntu environments, Git, terminal workflows, package managers, and test runners
  • Familiarity with JSON, YAML, Markdown, and ideally Docker
  • Experience with, or ability to quickly adapt to, agentic coding tools such as OpenCode, Claude Code, or Cursor
  • Strong quality judgment: ability to compare model trajectories, apply rubrics consistently, and write concise rationales

Work Style

  • Highly detail-oriented and process-driven
  • Comfortable with repetitive, high-precision evaluation work
  • Proactively flags ambiguity rather than making assumptions
  • Maintains consistency across long tasks and multiple model runs

Contract Details

  • Commitment: 8 hours/day with a 4-hour overlap with PST
  • Type: Contractor (no medical/paid leave included)
  • Duration: 4 weeks, starting next week

Why Work With Turing?

  • Contribute to cutting-edge AI projects with leading foundation model companies
  • Work at the frontier of LLM evaluation and reasoning
  • Fully remote with flexible, global collaboration
  • Competitive compensation based on experience and project scope
