AI Quality Analyst (Personalization) at Turing

posted 1 hour ago
turing.com | Contractor | Remote | $15/hr

AI Quality Analyst (Personalization) – Turkish | $15/hr | Remote | 3-Month Contract

Turing is seeking a detail-oriented and analytically sharp AI Quality Analyst with Turkish language proficiency to evaluate a cutting-edge personalization feature for Gemini. In this role, you will assess how well the AI model leverages personal data — including past conversations, Gmail, Google Search, and YouTube activity — to deliver relevant, helpful responses. This is a unique opportunity to contribute to frontier AI research at a global scale.

What You'll Do

  • Design and execute multi-turn conversational prompts (1–5 turns) that challenge the AI to utilize your personal information and experiences.
  • Evaluate model responses for personalization quality, assessing dimensions such as Grounding, Integration, and Helpfulness.
  • Identify grounding issues — flagging unsupported claims, flawed inferences, or hallucinations.
  • Assess integration quality to ensure personal data is incorporated naturally, without robotic overnarrating.
  • Perform side-by-side (SxS) comparisons of two model responses, stack-ranking them based on overall helpfulness, usability, and quality.
  • Write clear, structured, and defensible rationales for your evaluations, referencing specific conversation turns.
  • Extract and verify debug information to confirm proper use of chat summaries and data sources.
  • Maintain data hygiene by deleting evaluation conversations to preserve the integrity of your personal account history.

Key Qualifications

  • Turkish Proficiency: High-level reading and writing ability in Turkish is required — this is the primary language for the project.
  • Personal Google Account: Willingness to use your primary personal Google account and enable relevant personal data sources for authentic evaluation.
  • Analytical Thinking: Demonstrated ability to evaluate nuanced and ambiguous AI responses with precision.
  • Prompt Engineering: Experience crafting creative, context-rich, multi-turn prompts to thoroughly test model capabilities.
  • Evaluation Acumen: Strong understanding of personalization concepts, including identifying poor inferences and forced connections.
  • Attention to Detail: Ability to spot subtle differences in response naturalness and quality during SxS evaluations.
  • Written Communication: Ability to produce clear, concise, and well-structured evaluation rationales.
  • Independence: Self-motivated and comfortable working autonomously in a remote environment.
  • Technical Setup: Reliable desktop or laptop with a strong internet connection.

Education & Experience

  • BS/BA degree or equivalent experience in a relevant field (e.g., Linguistics, Computer Science, Policy, Law, Ethics, Journalism, or a related analytical discipline).
  • Prior experience in data annotation, AI quality evaluation, or content moderation is strongly preferred.

Engagement Details

  • Rate: $15/hour
  • Type: Contractor
  • Duration: 3 months
  • Hours: Minimum 30 hrs/week (options: 30 or 40 hrs/week); at least 4 hours/day, with a 4-hour overlap with PST

Evaluation Process

  1. Shortlisted candidates will receive a Job Interest Form.
  2. A timed assessment will be shared and must be completed within 24 hours.
  3. Successful candidates will be contacted to discuss pre-onboarding requirements.
