Clinical Data Scientist

Phare Health

Data Science
New York Metropolitan Area, USA
Posted on Jul 23, 2025

About Us

Our mission is to make healthcare reimbursement transparent and fair (/phare), so providers can spend more time caring for patients and less time haggling over costs. We focus on the most complex AI challenges that require novel R&D, with a team that blends AI researchers and engineers with clinicians and payment experts. Backed by top healthcare investors including General Catalyst, we’re scaling quickly - join us!

The Role

You will join a tight-knit AI team as a hands-on data scientist and resident expert in clinical text, shipping ML systems into production and pushing forward the state of the art. Expect to:

  • Prototype end-to-end text pipelines - clean and normalise raw EHR notes, choose an architecture, train, and evaluate - at a pace of days, not months.
  • Train transformer models - fine-tune large language models for coding, summarisation, and clinical reasoning, then keep them fresh with continuous-learning loops.
  • Implement LLM workflows - build retrieval-augmented generation (RAG) and lightweight multi-agent chains that output clear, reference-backed answers.
  • Explore new datasets - run exploratory data analysis, map content to ICD-10 and CPT, and flag data gaps before modelling.
  • Productionise your work - convert research prototypes into reliable services with CI/CD, monitoring, and rollback.

Who we're looking for

  • 3+ years applying NLP or data science to clinical (or similarly complex) text.
  • Proven ability to take a project from EDA → model design → evaluation → production code in Python (SQL, Pandas, modern ML/NLP libraries).
  • Hands-on experience training transformer models and building RAG or agent-based LLM pipelines.
  • Familiarity with EHR formats and healthcare ontologies (ICD-10, CPT, LOINC, SNOMED).
  • Track record operating production-grade ML systems with monitoring and uptime targets.

Bonus points

  • Peer-reviewed publications or open-source contributions in clinical NLP.
  • Experience with reinforcement-learning methods such as GRPO (or similar policy-optimisation techniques) for model refinement.
  • Experience in customer-facing roles communicating data science requirements and gathering specs from end users.

Benefits

  • Top-of-market compensation (salary + equity)
  • Flexible PTO & hybrid culture (SoHo HQ 3 days/wk; exceptional remote considered)
  • Mission-driven, collaborative team
  • Twice-yearly offsites to align, build, and celebrate

Hiring Process

  1. Initial application.
  2. Intro call: Discuss your background, career goals, and our mission.
  3. 2 x Technical interviews: Programming or system design exercises focused on real-world data challenges.
  4. Referees: We ask for 2 referees who can speak to your professional/technical work.
  5. Culture interview: Ways of working, and a chance to ask questions.
  6. Offer.