Researcher (Interpretability)

Amigo

Posted on May 13, 2025

About Amigo
We're helping enterprises build autonomous agents that reliably deliver specialized, complex services—healthcare, legal, and education—with practical precision and human-like judgment. Our mission is to build safe, reliable AI agents that organizations can genuinely depend on. We believe superhuman level agents will become an integral part of our economy over the next decade, and we've developed our own agent architecture to solve the fundamental trust problem in AI. Learn more here.
Role
As a Researcher in Interpretability at Amigo, you'll pioneer advanced techniques for agent transparency that are essential to our Trust Center Framework and enterprise adoption in regulated industries. Working as part of our Research team, you'll develop systems that provide both contextual traceability (which messages informed a response) and fine-grained attribution within responses (linking specific parts of output to their sources). Your research will create enhanced interpretability scenarios that overcome the token bottleneck limitation while making agent reasoning fully auditable. This role is critical for establishing the scientific foundation of our trust framework and setting the industry standard for autonomous agent validation.
Responsibilities
Develop methods for capturing and analyzing model activations during response generation to enable fine-grained attribution within our Memory-Knowledge-Reasoning (M-K-R) framework
Create techniques that trace how each component of an agent response is derived from specific memory sources, knowledge bases, or reasoning patterns
Design enhanced interpretability scenarios for agent reflection, explicit reasoning models, and post-session analysis
Develop attribution methods that work across our layered memory architecture (L0, L1, L2) to provide comprehensive traceability
Create systems for analyzing LLM internal states to extract reasoning paths and attention patterns that explain agent behavior (an illustrative sketch of this kind of analysis appears after this list)
Design transparency mechanisms that maintain performance at enterprise scale
Develop methods for visualizing attribution that make complex reasoning transparent to both technical and non-technical users
Research approaches that provide complete context traceability while maintaining natural conversation flow
Establish scientific standards for agent attribution that can be validated with academic rigor
Collaborate with Data Scientists and the Trust Center team to implement attribution in our validation framework
Contribute to academic publications that establish our technical leadership in interpretable AI
Work with engineering teams to implement your research in production systems that satisfy regulatory requirements
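
To give a flavor of the kind of analysis this role involves, here is a minimal, hedged sketch of capturing attention weights from a small open model and aggregating them into a rough view of which context tokens informed the final output token. It is not Amigo's internal tooling: the model name, the layer/head averaging rule, and the example prompt are illustrative assumptions, and raw attention is only a weak proxy for the attribution methods described above.

```python
# Illustrative sketch only (not Amigo's internal tooling): aggregate attention
# weights into a rough "which context tokens informed the final token" view.
# Model choice, prompt, and averaging rule are assumptions for demonstration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"  # tiny public model, used only for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

prompt = "Patient history: allergy to penicillin. Question: is amoxicillin safe to prescribe?"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer.
# Averaging over layers and heads gives a crude token-to-token attribution matrix.
attn = torch.stack(out.attentions).mean(dim=(0, 2)).squeeze(0)  # (seq, seq)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
top = torch.topk(attn[-1], k=min(5, len(tokens)))  # attention from the final token

print("Context tokens most attended to by the final token:")
for score, idx in zip(top.values, top.indices):
    print(f"  {tokens[int(idx)]!r}: {float(score):.3f}")
```

In practice, attribution within an M-K-R-style framework would combine signals like this with activation-level analysis and memory-source metadata rather than relying on attention alone.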
Qualifications
PhD or equivalent research experience in AI interpretability, machine learning, or related fields
Deep understanding of transformer architecture internals, activation patterns, and attention mechanisms
Research experience with LLM interpretability techniques such as activation analysis, mechanistic interpretability, or attribution methods
Background working with model internals to extract reasoning patterns and attribution signals
Strong knowledge of the "token bottleneck" problem and approaches to address it
Experience designing systems that balance transparency with performance and usability
Ability to communicate complex technical concepts to both technical and non-technical audiences
Excellent research and analytical skills with the ability to design and run rigorous experiments
Strong programming skills for implementing research prototypes
Publication record in relevant interpretability or explainable AI conferences preferred
Passion for building trustworthy AI that can be deployed responsibly in high-stakes domains
Location: NYC (Onsite)
To apply, send us your resume and anything else you'd like to careers@amigo.ai