Data Lead

Percepta

New York, NY, USA
Posted on Nov 27, 2025

Location

New York City

Employment Type

Full time

Location Type

On-site

Department

Engineering

Who we are

Percepta’s mission is to transform critical institutions with applied AI. We care that industries that power the world (e.g. healthcare, manufacturing, energy) benefit from frontier technology.

To make that happen, we embed with industry-leading customers to drive AI transformation. We bring together:

  • Forward-deployed expertise in engineering, product, and research

  • Mosaic, our in-house toolkit for rapidly deploying agentic workflows

  • Strategic partnerships with Anthropic, McKinsey, AWS, companies within the General Catalyst portfolio, and more

Our team is a quickly growing group of Applied AI Engineers, Embedded Product Managers, and Researchers motivated by diffusing the promise of AI into improvements we can feel in our day-to-day lives.

Percepta is a direct partnership with General Catalyst, a global transformation and investment company.

About the role

We’re hiring a Data Platform Lead / Data Architect to design, build, and operationalize the data platforms that power AI transformation across large, complex enterprises. You will lead decisions around data models, infrastructure, orchestration, quality, and governance, enabling Applied AI Engineers to ship high-impact AI workflows quickly and safely.

You’ll work hands-on inside messy, heterogeneous enterprise environments, unifying systems across cloud platforms, operational databases, and legacy applications (ERPs, EHRs, and more). Your work will help create the core data and intelligence technologies that Percepta deploys across many customer environments.

If you enjoy building in ambiguity, forming strong technical opinions, owning architecture end-to-end, and enabling AI at enterprise scale, this role is for you.

In this role, you will:

  • Own the architecture of foundational data platforms that support dozens of AI workflows.

  • Shape Percepta’s data strategy across multiple customer environments, cloud providers, and system landscapes.

  • Lead cloud migration patterns, helping customers modernize their data stack while ensuring operational reliability.

  • Build reusable platform components that become playbooks for future enterprise deployments.

  • Partner directly with product teams to translate high-value use cases into crisp data requirements, schemas, and pipeline needs.

  • Work with operators, engineering teams, and leadership to turn high-value use cases into production-ready data workflows.

What you’ll do

  • Build the intelligence layer that unlocks entirely new agentic AI workflows across healthcare and other critical industries.

  • Architect and run end-to-end data platforms — from schema design to pipelines to the storage and retrieval patterns that power real-time AI.

  • Bring order to chaos by translating fragmented enterprise systems into clean, usable, high-leverage data assets.

  • Work side-by-side with AI engineers, operators, and delivery partners to ship production systems that create measurable business impact.

What we’re looking for

Strong technical foundations

  • Deep experience building pipelines on Databricks, Snowflake, or similar cloud platforms

  • SQL and Python proficiency

  • Familiarity with streaming tools (Kafka, Kinesis, etc.)

  • Strong understanding of ETL/ELT, orchestration, CI/CD for data, and schema design

  • Experience working across hybrid cloud or modernization environments

Experience building or owning enterprise data systems

  • Designed platform-level data architectures

  • Built foundational data models and semantic/intelligence layers

  • Mentored or led data engineers (formal or informal leadership)

  • Worked across distributed systems, operational DBs, and analytics warehouses

  • Led build / buy / partner processes and vendor evaluations to make strategic decisions about the data roadmap

AI and ML intuition

  • Understanding of what ML and LLM systems need (features, context windows, retrieval patterns, embeddings)

  • Experience supporting ML pipelines or AI-adjacent data flows

Thrives in ambiguity & forward-deployed environments

  • Comfortable building in messy, fragmented enterprise systems

  • Capable of unblocking yourself with high ownership

  • Excellent communication with comfort talking directly to customer teams

  • Bias toward action while also thinking long-term

Bonus if you have

  • Prior startup or forward-deployed engineering experience

  • Experience with AWS or Azure

  • Experience leading data modernization or cloud migration initiatives

  • Background in building data platforms across multiple business units or customer environments

  • Experience working with health systems (or similar legacy enterprises)

  • Experience integrating ERP, EHR, CAPS, Claims, and ancillary sources into a unified data system
