Qualitative Evaluation Engineer
Luma AI
San Francisco, CA, USA
- Evaluate generative model performance across diverse tasks, prompts, and modalities.
- Identify key failure modes, regression patterns, and edge cases that impact product quality.
- Develop and maintain qualitative evaluation frameworks that are scalable and reusable.
- Collaborate closely with technical artists and engineers to align evaluations with model capabilities and target use cases.
- Translate high-level product goals into concrete evaluative criteria.
- Lead qualitative studies, side-by-side comparisons, and human-in-the-loop evaluation efforts.
- Provide detailed feedback that informs model fine-tuning, dataset curation, and product UX.
- Stay informed about emerging evaluation standards in generative AI and creative tools.
- Master’s degree or higher in Cognitive Science, Human-Computer Interaction (HCI), Design Research, Psychology, Media Studies, or a related field.
- 5+ years of experience in product evaluation, UX research, model testing, or similar roles that involve structured qualitative assessment.
- Deep familiarity with creative workflows and real-world use cases for generative models (e.g., animation, filmmaking, digital art, VFX).
- Strong systems thinking and the ability to define abstract qualities (like believability, identity retention, or scene coherence) in clear evaluative terms.
- Experience working cross-functionally with engineers, researchers, and creatives.
- Excellent written communication skills and the ability to synthesize nuanced judgments into clear, actionable insights.
- Background in motion, visual effects, or storytelling pipelines.
- Experience evaluating AI-generated media (video, images, 3D).
- Prior work building internal tools for qualitative data collection or scoring.
- Familiarity with prompt engineering and reference-based input methods.