Clinical Data Scientist
Phare Health
Data Science
New York Metropolitan Area, USA
Posted on Jul 23, 2025
About Us
Our mission is to make healthcare reimbursement transparent and fair, so providers can spend more time caring for patients and less time haggling over costs. We specifically focus on the most complex AI challenges that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by top healthcare investors including General Catalyst, we’re scaling quickly - join us!
The Role
You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:
- Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - at a pace of days, not months.
- Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.
- Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.
- Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT codes, and flag data gaps before modelling.
- Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.
Who we're looking for
- 3+ years applying NLP or data science to clinical (or similarly complex) text.
- Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).
- Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.
- Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).
- Track record operating production-grade ML systems with monitoring and uptime targets.
Bonus points
- Peer-reviewed publications or open-source contributions in clinical NLP.
- Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.
- Experience in customer-facing roles communicating data science requirements and gathering specs from end users.
Benefits
- Top-of-market compensation (salary + equity)
- Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)
- Mission-driven, collaborative team
- Twice-yearly offsites to align, build, and celebrate
Hiring Process
- Initial application.
- Intro call: Discuss your background, career goals, and our mission.
- 2 x technical interviews: Programming or system design exercises focused on real-world data challenges.
- Referees: We ask for 2 referees who can speak to your professional/technical work.
- Culture interview: Ways of working, and a chance to ask questions.
- Offer