
Agentic Tasker (Frontier STEM) at Turing

posted 1 hour ago
turing.com | Contractor | India | ~$30/hr

Agentic Tasker (Frontier STEM) | ~$30/hr | India | 3-Month Contract

Work directly with researchers at a top-tier Frontier AI Lab to enhance the reasoning and problem-solving capabilities of cutting-edge AI models. This role focuses on designing, validating, and analyzing challenging STEM benchmark tasks to push the boundaries of frontier model performance.

Key Responsibilities

  • Task Design & Development: Create challenging, real-world data science problems that serve as the foundation for Colab Bench tasks.
  • Content Generation: Integrate problems into an agentic development environment using Python, including:
    • Detailed task instructions and overviews
    • Golden solutions that follow provided instructions
    • Full environment setup — datasets, Python libraries, and metadata
    • Test notebooks containing unit tests that solutions must pass
  • Evaluation & Analysis: Assess cross-model performance on designed tasks.
  • Headroom Identification: Identify tasks where the target model fails, focusing on failures that can be classified as logical reasoning issues.
  • Loss Extraction: Analyze agent trajectories to identify and extract the model's core capability loss patterns.

Qualifications

  • Strong expertise in data science, machine learning, finance, and coding
  • Deep background in frontier STEM disciplines
  • Actively recruiting PhD students from top US universities and highly skilled GitHub contributors
  • A small cohort based in India will also be considered

Offer Details

  • Rate: ~$30/hour
  • Commitment: Minimum 30 hours/week on weekdays
  • Employment Type: Contractor (no medical or paid leave benefits)
  • Duration: 3 months, with an expected start date of next week
  • Location: India
