
AI Quality Analyst (Personalization) at Turing

posted 2 hours ago
turing.com | Contractor | Remote | $15/hr

AI Quality Analyst (Personalization) – Korean | $15/hr | Remote | 3-Month Contract

Turing is seeking a detail-oriented AI Quality Analyst with Korean language proficiency to evaluate a cutting-edge personalization feature for Gemini. In this role, you will assess how well the AI model leverages personal data — including past conversations, Gmail, Google Search, and YouTube activity — to deliver relevant, helpful responses. This is a unique opportunity to contribute to frontier AI research at a global scale.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for enterprises deploying advanced AI systems. Turing helps customers accelerate frontier research and transform AI from proof of concept into proprietary, measurable intelligence.

Key Responsibilities

  • Design and execute multi-turn conversational prompts (1–5 turns) that require the AI to utilize your personal information and experiences.
  • Evaluate model responses for appropriate personalization based on your original intent.
  • Analyze responses for Grounding issues, ensuring claims are evidence-based and free from hallucinations or flawed inferences.
  • Assess Integration quality to confirm personal data is woven naturally into responses without robotic overnarrating.
  • Conduct rigorous side-by-side (SxS) comparisons of two model responses, ranking them by helpfulness, usability, and overall quality.
  • Write clear, defensible rationales for your evaluations, referencing specific conversation turns.
  • Extract and verify debug information to confirm proper use of chat summaries and data sources.
  • Maintain strict data hygiene by deleting evaluation conversations after each session.

Key Qualifications

  • Korean Proficiency: High-level reading and writing ability in Korean (primary focus language for this project).
  • Personal Google Account: Willingness to use your primary personal Google account and enable personal data sources for authentic evaluation.
  • Availability: Full-time availability in your local time zone with at least 4 hours of overlap with PST.
  • Analytical Thinking: Proven ability to evaluate nuanced and ambiguous AI responses, particularly around personalization quality.
  • Prompt Engineering: Experience designing creative, context-driven, multi-turn prompts to thoroughly test model capabilities.
  • Attention to Detail: Ability to identify subtle differences in naturalness, overnarrating, and incorrect personalization.
  • Written Communication: Superior ability to write structured, concise rationales for model rankings.
  • Independence: Self-motivated and comfortable working remotely with minimal supervision.
  • Technical Setup: Reliable desktop or laptop with a strong internet connection.

Education & Experience

  • BS/BA degree or equivalent experience in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field.
  • Prior experience in data annotation, AI quality evaluation, or content moderation is strongly preferred.

Engagement Details

  • Rate: $15/hour
  • Type: Contractor
  • Duration: 3 months
  • Hours: 30 or 40 hours/week (minimum 4 hours/day, with at least 4 hours of overlap with PST)

Evaluation Process

  1. Shortlisted candidates receive a Job Interest Form.
  2. A timed assessment is shared and must be completed within 24 hours.
  3. Successful candidates are contacted to discuss pre-onboarding requirements.
