Clinical Data Scientist

Phare Health

Data Science
New York, NY, USA
Posted on Jul 23, 2025

About Us

Our mission is to make healthcare reimbursement transparent and fair, so providers can spend more time caring for patients and less time haggling over costs. We focus on the most complex AI challenges, the ones that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by top healthcare investors including General Catalyst, we’re scaling quickly - join us!

The Role

You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:

  • Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - in days, not months.
  • Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.
  • Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.
  • Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT codes, and flag data gaps before modelling.
  • Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.

Who we're looking for

  • 3+ years applying NLP or data science to clinical (or similarly complex) text.
  • Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).
  • Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.
  • Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).
  • Track record of operating production-grade ML systems with monitoring and uptime targets.

Bonus points

  • Peer-reviewed publications or open-source contributions in clinical NLP.
  • Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.
  • Experience in customer-facing roles communicating data science requirements and gathering specs from end users.

Benefits

  • Top-of-market compensation (salary + equity)
  • Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)
  • Mission-driven, collaborative team
  • Twice-yearly offsites to align, build, and celebrate

Hiring Process

  1. Initial application.
  2. Intro call: Discuss your background, career goals, and our mission.
  3. Two technical interviews: Programming or system design exercises focused on real-world data challenges.
  4. Referees: We ask for two referees who can speak to your professional and technical work.
  5. Culture interview: Ways of working, and a chance to ask questions.
  6. Offer.