Researcher (Reinforcement Learning)

Amigo

Posted on May 13, 2025

To apply, send us your resume and anything else you'd like to careers@amigo.ai
About Amigo
We're helping enterprises build autonomous agents that reliably deliver specialized, complex services in healthcare, legal, and education with practical precision and human-like judgment. Our mission is to build safe, reliable AI agents that organizations can genuinely depend on. We believe superhuman-level agents will become an integral part of our economy over the next decade, and we've developed our own agent architecture to solve the fundamental trust problem in AI.
Role
As a Researcher in Reinforcement Learning at Amigo, you'll lead our transition from scaffolded architecture to advanced RL approaches that enable agents to evolve toward superhuman performance. Working as part of our Research team, you'll develop the evolutionary chamber frameworks that create strategically calibrated pressures for continuous agent improvement. Your research will optimize the unified Memory-Knowledge-Reasoning (M-K-R) cycle, establishing efficient, increasingly recursive improvement systems that evolve under pressure from simulators and judges. This role is critical to our 6-12 month strategic goal of implementing recursion algorithms for agent-building-agents and setting the industry standard for trusted agent development.
Responsibilities
Design evolutionary chamber frameworks that create precisely calibrated pressures for agent improvement aligned with strategic goals
Develop reinforcement learning approaches that optimize the integrated M-K-R cycle rather than applying RL broadly
Create techniques for RLVR (Reinforcement Learning with Verifiable Rewards) that use outcome-based rewards verifiable by external oracles
Implement frameworks for iterative distillation and amplification that systematically improve capabilities over repeated cycles
Research methods for self-play reasoning that allow agents to generate and solve their own tasks with decreasing human intervention
Design multi-stage learning approaches that build upon our context engine, memory systems, and evaluation framework
Develop strategic resource allocation methodologies that enable different confidence requirements for different domains
Create frameworks that balance the exploration of novel strategies with the exploitation of proven approaches
Research methods for implementing recursion algorithms that enable agent-building-agents capabilities
Design approaches for targeted capability enhancement that focus computational resources where they create maximum value
Collaborate with Simulation and Verification researchers to create integrated evolutionary chambers
Work with Agent researchers on solving the neighborhood expansion problem through reinforcement learning
Contribute to academic publications that establish our technical leadership in advanced agent development
Partner with ML Engineers to implement research in production-ready training systems
Qualifications
PhD or equivalent research experience in reinforcement learning, AI, machine learning, or related fields
Deep understanding of reinforcement learning techniques, particularly in the context of language models
Research experience with RLHF (Reinforcement Learning from Human Feedback) or related approaches
Background in designing reward functions, training protocols, or evaluation frameworks for AI
Strong knowledge of techniques for iterative distillation, self-play, or recursive improvement
Experience with efficient resource allocation in computationally intensive learning systems
Understanding of the practical constraints of RL in production environments
Excellent research and analytical skills with the ability to design and run rigorous experiments
Strong programming skills for implementing complex reinforcement learning systems
Publication record in relevant conferences or journals preferred
Passion for advancing AI capabilities while maintaining perfect alignment and reliability
Location: NYC (Onsite)