Senior Software Engineer – LLM Evaluation at Turing

posted 3 days ago
turing.com | Contractor | Remote: US/CA/WEU

Senior Software Engineer – LLM Evaluation | Contractor | 10-40 hrs/week | US/Canada/Western Europe

Join Turing, the San Francisco-based research accelerator for frontier AI labs, as we advance the future of large language models through cutting-edge evaluation and dataset creation.

About the Role

As a Software Engineering Evaluator, you'll create high-quality datasets for training and benchmarking large language models. You'll curate code examples, provide precise solutions, and evaluate AI-generated code across multiple programming languages including Python, JavaScript/ReactJS, C/C++, Java, Rust, and Go.

Key Responsibilities

  • Curate code examples and build solutions for AI model training initiatives across Python, JavaScript (ReactJS), C/C++, Java, Rust, and Go
  • Evaluate and refine AI-generated code for efficiency, scalability, and reliability
  • Collaborate with cross-functional teams to enhance enterprise-level AI-driven coding solutions
  • Build agents that verify code quality and identify error patterns
  • Hypothesize and evaluate model capabilities across the software engineering lifecycle (prototyping, architecture design, API design, production implementation, monitoring, maintenance)
  • Design automated verification mechanisms for software engineering tasks

Required Qualifications

  • 5+ years of software engineering experience, including 2+ years at a top-tier product company (Google, Stripe, Amazon, Apple, Meta, Netflix, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research)
  • Strong expertise in building full-stack applications and deploying scalable, production-grade software
  • Deep understanding of software architecture, design, development, debugging, and code quality assessment
  • Excellent written and oral communication skills, needed to produce clear evaluation rationales

Engagement Details

  • Commitment: Flexible engagement, 10-40 hours per week (partial PST overlap required)
  • Type: Independent contractor (no medical benefits or paid leave)
  • Duration: 1 month initially, starting immediately; extensions based on performance
  • Location: Must be based in the US, Canada, or a Western European country (Austria, Belgium, France, Germany, etc.)

Application Process

  • Complete a simple application form and an AI-interviewer assessment
  • Pass an automated coding exercise (required for consideration)
