Researcher (Verification)

Amigo

This job is no longer accepting applications

Posted on May 13, 2025

To apply, send us your resume and anything else you'd like to careers@amigo.ai

About Amigo

We're helping enterprises build autonomous agents that reliably deliver specialized, complex services (healthcare, legal, and education) with practical precision and human-like judgment. Our mission is to build safe, reliable AI agents that organizations can genuinely depend on. We believe superhuman-level agents will become an integral part of our economy over the next decade, and we've developed our own agent architecture to solve the fundamental trust problem in AI.

Role

As a Researcher in Verification at Amigo, you'll develop the judge systems that form the second layer of our three-layer iterative evolution approach. Working as part of our Research team, you'll create transparent evaluation frameworks that provide precise, consistent signals about agent performance across complex scenarios. Your research will combine LLM-powered reasoning with programmatic verification techniques to establish objective performance metrics that drive agent evolution. This role is critical for our Trust Center Framework (agi.work) and represents a high-priority area as the evaluation space heats up across the industry. Your work will set the standard for autonomous agent validation and enable our transition to advanced reinforcement learning approaches.

Responsibilities

Design and implement advanced verification frameworks that combine LLM-powered judge models with programmatic verification techniques

Develop judge models that employ explicit reasoning processes to provide transparent evaluation of agent performance

Create comprehensive metrics frameworks that quantify success across various dimensions of agent behavior

Research methods for consistent, unbiased evaluation that reflects genuine domain standards rather than AI biases

Build specialized verification models that may employ more powerful models or domain-specific knowledge to apply appropriate evolutionary pressure

Develop frameworks for reinforcement learning with verifiable rewards (RLVR), where rewards can be validated by external oracles or predefined success criteria

Create systems for comprehensive coverage testing that verify regulatory compliance and safety constraints

Design effective evaluation protocols for our Trust Center Framework that provide public validation with academic rigor

Research techniques for measuring confidence across multiple simulation runs and quantifying variance in agent performance

Develop judge systems specialized for high-stakes domains like healthcare, supporting our strategic partnerships

Collaborate with Simulation researchers to create integrated evolutionary chambers with effective simulator-judge interactions

Contribute to academic publications and technical content that establish our leadership in evaluation methodologies

Work with engineering teams to implement research in production-ready verification systems

Qualifications

PhD or equivalent research experience in formal verification, AI evaluation, machine learning, or related fields

Deep understanding of evaluation methodologies for complex AI systems, particularly LLMs and agents

Research experience designing verification systems, formal specifications, or automated testing frameworks

Background in building systems that can programmatically verify outputs against defined criteria

Strong knowledge of LLM-as-judge approaches and their strengths and limitations

Experience designing comprehensive metrics that quantify performance across multiple dimensions

Understanding of statistical validation methods and confidence measurement across multiple tests

Excellent research and analytical skills with the ability to design and run rigorous experiments

Strong programming skills for implementing complex verification frameworks

Publication record in relevant conferences or journals preferred

Passion for establishing the scientific foundations of trusted AI evaluation

Location: NYC (Onsite)
