Clinical Data Scientist

Phare Health

Data Science
New York, NY, USA
Posted on Jul 10, 2025

About Us

Our mission is to make healthcare reimbursement transparent and fair, so providers can spend more time caring for patients and less time haggling over costs. We focus on the most complex AI challenges that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by General Catalyst, we’re scaling quickly - join us!


The Role

You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:

  • Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - in days, not months.

  • Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.

  • Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.

  • Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT, and flag data gaps before modelling.

  • Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.


Who we're looking for

  • 3+ years applying NLP or data science to clinical (or similarly complex) text.

  • Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).

  • Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.

  • Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).

  • Track record operating production-grade ML systems with monitoring and uptime targets.

Bonus points

  • Peer-reviewed publications or open-source contributions in clinical NLP.

  • Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.

  • Experience in customer-facing roles communicating data science requirements and gathering specs from end users.

Benefits

  • Top-of-market compensation (salary + equity)

  • Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)

  • Mission-driven, collaborative team

  • Twice-yearly offsites to align, build, and celebrate


Hiring Process

  1. Initial application.

  2. Intro call: Discuss your background, career goals, and our mission.

  3. 2 x technical interviews: Programming or system design exercises focused on real-world data challenges.

  4. Referees: We ask for 2 referees who can speak to your professional/technical work.

  5. Culture interview: Ways of working, and a chance to ask questions.

  6. Offer