Researcher (Interpretability)

Amigo

Posted on May 13, 2025

About Amigo
We're helping enterprises build autonomous agents that reliably deliver specialized, complex services in healthcare, legal, and education with practical precision and human-like judgment. Our mission is to build safe, reliable AI agents that organizations can genuinely depend on. We believe superhuman-level agents will become an integral part of our economy over the next decade, and we've developed our own agent architecture to solve the fundamental trust problem in AI. Learn more here.
Role
As a Researcher in Interpretability at Amigo, you'll advance our ability to provide transparent attribution and traceability in agent reasoning processes. Working as part of our Research team, you'll develop techniques to overcome the "token bottleneck" limitation of current LLMs and create systems that can explain precisely how agents derive their responses. Your work will focus on enhancing both contextual source tracing (understanding which messages informed a response) and fine-grained attribution within responses (linking specific parts of output to input sources). This research is critical for building trustworthy AI systems that can be deployed in high-stakes domains.
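To make the distinction between these two granularities concrete, here is a minimal sketch of how they might be represented. The dataclasses below are hypothetical illustrations for this posting, not Amigo's actual data model.

```python
# Illustrative sketch only: hypothetical types showing the two attribution
# granularities described above (message-level source tracing vs. span-level
# attribution within a response). Not Amigo's actual system.
from dataclasses import dataclass, field


@dataclass
class MessageSource:
    """Contextual source tracing: which prior message informed the response."""
    message_id: str
    relevance: float  # e.g. a normalized contribution score


@dataclass
class SpanAttribution:
    """Fine-grained attribution: link a span of the output to its input sources."""
    output_start: int  # character offsets within the agent response
    output_end: int
    sources: list[MessageSource] = field(default_factory=list)


@dataclass
class AttributedResponse:
    text: str
    message_sources: list[MessageSource] = field(default_factory=list)
    span_attributions: list[SpanAttribution] = field(default_factory=list)
```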
Responsibilities
Research and develop methods for capturing, analyzing, and visualizing model activations during response generation to enable fine-grained attribution
Create techniques to link specific parts of agent responses to their precise sources in previous messages and knowledge bases
Design and implement enhanced interpretability scenarios for agent reflection, explicit reasoning, and post-session analysis
Develop attribution methods that work across our layered memory architecture (L0, L1, L2) to provide comprehensive traceability
Improve and refine our system for analyzing LLM internals to extract reasoning paths and attention patterns (a minimal sketch of attention capture follows this list)
Collaborate with safety researchers to ensure interpretability mechanisms enhance agent alignment
Bridge current external scaffolding approaches with future neuralese capabilities by developing attribution methods that scale with architectural advances
Publish research findings and contribute to the broader field of AI interpretability
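As a point of reference for the attention-analysis work above, the sketch below shows one simple way such signals can be surfaced from an open model: it runs GPT-2 through Hugging Face transformers with attention outputs enabled and reports which context tokens the final position attends to most strongly. The model choice, the toy conversation, and the layer/head averaging are illustrative assumptions, not a description of Amigo's pipeline.

```python
# Illustrative sketch only: surface a crude attention-based attribution signal
# from an open model (GPT-2 via Hugging Face transformers). Not Amigo's
# implementation; model, prompt, and aggregation are placeholder assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

context = (
    "User: My knee hurts when I run.\n"
    "Agent: How long has this been happening?\n"
    "User: About two weeks.\n"
    "Agent:"
)
inputs = tokenizer(context, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple of (batch, heads, seq, seq) tensors, one per layer.
# Average over layers and heads, then inspect how the final position (the next-token
# prediction) distributes attention over the context tokens.
attn = torch.stack(outputs.attentions).mean(dim=(0, 2))[0]  # (seq, seq)
last_token_attn = attn[-1]  # attention from the last position to every context token

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
top = last_token_attn.topk(5)
for score, idx in zip(top.values, top.indices):
    print(f"{tokens[int(idx)]!r}: {score.item():.3f}")
```

Raw attention weights are only a weak attribution signal on their own; in practice they would likely be combined with gradient-based or mechanistic techniques, which is exactly the kind of question this role would investigate.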
Qualifications
PhD or equivalent research experience in AI, machine learning, or related fields with a focus on interpretability, attribution, or explainability
Strong understanding of transformer architecture internals, activation patterns, and attention mechanisms
Experience with LLM interpretability techniques such as activation analysis, mechanistic interpretability, or attribution methods
Background in working with model internals to extract reasoning patterns and attribution signals
Familiarity with research on the "token bottleneck" problem and approaches to address it
Solid programming skills with experience implementing research prototypes
Excellent research and analytical skills with the ability to design and run rigorous experiments
Strong communication skills for conveying complex technical concepts
Passion for building trustworthy AI that can be deployed responsibly in high-stakes domains
Location: NYC (Onsite)
To apply, send us your resume and anything else you'd like to careers@amigo.ai