Data Engineer

Medeloop

Software Engineering, Data Science
San Francisco, CA, USA
Posted on Jun 18, 2025

We are seeking a Data Engineer with expertise in relational databases, healthcare data, medical terminology, ETL pipelines, data exchange, and GCP to join Medeloop, an early-stage startup. As a Data Engineer at Medeloop, you will design, develop, and maintain our data infrastructure while also engaging in a variety of other engineering functions, offering excellent opportunities for exposure and growth. Working closely with our data science and product teams, you will ensure that our data architecture meets the evolving needs of the business. This role is ideal for engineers who are eager to contribute wherever needed and are looking for growth and leadership opportunities as we expand.

Key Responsibilities:

  • Drive the design, implementation, and maintenance of our data infrastructure on GCP
  • Develop ETL pipelines using GCP Dataflow and Cloud Storage to ingest, transform, and load data from various sources
  • Collaborate with data science and product teams to understand data requirements and develop solutions to meet those requirements
  • Develop data exchange protocols using GCP services like Cloud Functions, Cloud Workflows, and Cloud Pub/Sub to facilitate the transfer of data between parties
  • Build vector databases to store and manage complex healthcare terminology using Cloud SQL
  • Ensure the security and integrity of our data infrastructure by implementing appropriate security measures and data governance policies
  • Design and optimize large-scale Spark workloads on Cloud Dataproc
  • Continuously evaluate and improve our data infrastructure to ensure that it meets the evolving needs of the business and industry trends

Who You Are:

  • Bachelor's or Master's degree in Computer Science, Data Science, a related field, or equivalent experience
  • 3+ years of experience as a data engineer, preferably in the healthcare industry
  • Experience with traditional programming languages such as Python, Java, or Scala
  • Experience leading the design and implementation of data infrastructure projects on GCP
  • Experience with ETL and Big Data services and frameworks such as Cloud Dataproc, BigQuery, Apache Spark, and others
  • Experience with GCP services like Cloud Storage, Cloud SQL, Cloud Functions, and Cloud Workflows
  • Experience developing cloud infrastructure using Terraform
  • Knowledge of data governance and security best practices
  • Strong analytical and problem-solving skills
  • Excellent communication and collaboration skills
  • Ability to work independently and in a team environment
  • Passion for using data to improve healthcare outcomes

Nice to Have:

  • Strong knowledge of vector databases, medical terminology, and healthcare data
  • Experience with healthcare common data models and data exchange protocols such as OMOP CDM, FHIR, and others
  • Experience with AWS services (EMR, Step Functions, Lambda, Aurora RDS, Glue)