AI Quality Analyst (Personalization) at Turing

posted 2 hours ago
turing.com | Contractor | Remote | $15/hr

AI Quality Analyst (Personalization) – Thai | $15/hr | Remote | 3-Month Contract

Turing is seeking a detail-oriented AI Quality Analyst with Thai language proficiency to evaluate a cutting-edge personalization feature for Gemini. In this role, you will assess how well the AI model leverages personal data — including past conversations, Gmail, Google Search, and YouTube activity — to deliver relevant, helpful responses. This is a unique opportunity to contribute to frontier AI research from anywhere in the world.

About Turing

Based in San Francisco, Turing is the world's leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. Turing helps customers accelerate frontier research and transform AI from proof of concept into proprietary, measurable intelligence.

What You'll Do

  • Design and execute multi-turn conversational prompts (1–5 turns) that require the AI to draw on your personal information and experiences.
  • Evaluate model responses for appropriate personalization based on your original intent.
  • Analyze responses for Grounding issues — ensuring claims are evidence-based and free of hallucinations or flawed inferences.
  • Assess Integration quality to confirm personal data is woven naturally into responses without robotic overnarrating.
  • Conduct rigorous side-by-side (SxS) comparisons of two model responses, ranking them by helpfulness, usability, and overall quality.
  • Write clear, defensible rationales for your evaluations, referencing specific conversation turns.
  • Extract and verify debug information to confirm proper use of chat summaries and data sources.
  • Maintain strict data hygiene by deleting evaluation conversations after each session.

Key Qualifications

  • Thai Proficiency: High-level reading and writing ability in Thai — this is the primary language for the project.
  • Personal Google Account: Willingness to use your primary personal Google account and enable personal data sources for authentic evaluation.
  • Analytical Thinking: Demonstrated ability to evaluate nuanced and ambiguous AI responses, particularly around personalization quality.
  • Prompt Engineering: Experience crafting creative, context-rich, multi-turn prompts to thoroughly test model capabilities.
  • Evaluation Acumen: Strong understanding of personalization concepts, including identifying poor inferences and forced connections.
  • Attention to Detail: Ability to spot subtle differences in naturalness and overnarrating across side-by-side model outputs.
  • Written Communication: Superior ability to write structured, concise rationales with explicit references to conversation turns.
  • Independence: Self-motivated and comfortable working remotely with minimal supervision.
  • Technical Setup: Reliable desktop or laptop with a strong internet connection.

Education & Experience

  • BS/BA degree or equivalent experience in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or a related analytical field.
  • Prior experience in data annotation, AI quality evaluation, or content moderation is strongly preferred.

Engagement Details

  • Rate: $15/hour
  • Type: Contractor
  • Duration: 3 months
  • Hours: 30 or 40 hrs/week (minimum 30), with at least 4 hours of daily overlap with PST

Evaluation Process

  1. Shortlisted candidates receive a Job Interest Form.
  2. A timed assessment is shared and must be completed within 24 hours.
  3. Successful candidates are contacted to discuss pre-onboarding requirements.
