
Mercor is partnering with an AI lab on a project that teaches robots to understand and perform everyday household tasks — like putting away dishes, folding clothes, or tidying a room.
We're looking for detail-oriented freelance contributors to watch first-person videos of people completing household activities and break them down into precise, labeled action segments. Your annotations will directly train the next generation of robots to understand what tasks like "pick up a mug" or "walk to the counter" actually look like in the real world.
Watch egocentric (first-person) videos of people performing household tasks like cleaning, organizing, and tidying rooms.
Segment each video into individual actions — identifying exactly when each action starts and ends, down to a fraction of a second.
Write short, natural language descriptions for each action (e.g., "Pick up the blue towel and place it on the shelf").
Label each segment with the correct action type and which hand(s) are being used.
Screen videos for quality issues before annotating.
Follow detailed project guidelines to ensure consistent, high-quality training data.
Strong attention to detail — you'll be placing timestamps with frame-level precision and catching subtle differences between similar actions.
Good judgment and common sense — many decisions require interpreting guidelines and applying them to ambiguous real-world scenarios, not just following a checklist.
Clear, concise writing in English — you'll write short action descriptions that need to sound natural and specific. Native or near-native English fluency required.
Comfort with repetitive, focused work — a typical video may have 20–100+ action segments, each requiring careful review.
Basic computer proficiency — you'll use a browser-based annotation tool with keyboard shortcuts. Comfort learning new software quickly is a plus.
No prior annotation experience is required, but experience with video editing, data labeling, transcription, or similar detail-oriented media work is a plus.
Participate in a short AI interview (~20 minutes).
Complete a screening quiz (~15 to 20 minutes) that tests attention to detail, judgment, and writing skills.
Top candidates will be extended offers quickly, in some cases in less than 1 day.
You can apply if you’re based in:
United States (except residents of California, New York, Connecticut, Washington, or the District of Columbia)
Canada