AI Data Architect

Posted on: Apr 16, 2026

Genzeon
Exton, Pennsylvania
Salary: Not Available
Full-time


Job Title: AI Data Architect

Job Type: Full-time

Job Location: Exton, Pennsylvania, United States

Job Description

AI Data Architect | Healthcare AI Platform

Genzeon Corporation — Healthcare Division

Exton, PA / Hybrid | 0–4 years | Full-time

AI-native product architect with data engineering experience needed for product build-out.

The short version: We run a multi-model AI pipeline that processes 150K Medicare documents/year — faxed PDFs, EDI transactions, FHIR data, clinical notes. You’ll design and build the data architecture that ingests, stores, governs, and serves all of it to AI models and clinical reviewers. On-prem GPUs, hybrid cloud, HIPAA compliance. This is the real thing.

What you’ll do:

Design the end-to-end data architecture for a healthcare AI platform: ingestion, storage, processing, serving, governance

Build pipelines for heterogeneous healthcare data: faxed PDFs, X12 EDI (835/837/278), FHIR R4, HL7v2, CMS files, unstructured clinical notes

Architect the data lake/lakehouse layer (Apache Iceberg, MinIO, DuckDB, PostgreSQL/pgvector)

Design the embedding and vector storage layer that powers RAG: chunking, indexing, retrieval optimization

Build data lineage tracking from source document to AI decision

Implement HIPAA/HITRUST data governance: encryption, access controls, audit logging, PHI handling

Monitor data quality across the pipeline: schema drift, completeness, freshness, anomalies

Optimize for hybrid infrastructure: on-prem GPUs (RTX 5090, L40S), NAS, Azure GovCloud, Azure Commercial

What you need:

A data pipeline you’ve built that ran in production (we’ll ask about it)

SQL fluency and Python proficiency

Experience with at least one of: Spark, dbt, Airflow, Dagster, Prefect

Hands-on work with unstructured or semi-structured data — PDFs, images, OCR outputs, free text

Practical understanding of vector databases, embeddings, and how RAG systems consume data

Comfort with on-premises infrastructure, not just managed cloud services

Data quality and governance as instincts, not afterthoughts

Strong signals:

Healthcare data formats (X12 EDI, FHIR, HL7, CCD/C-CDA)

Apache Iceberg, Delta Lake, or modern table formats

MinIO / S3 / object storage architecture

pgvector, Pinecone, Weaviate, or similar vector stores

DuckDB or embedded analytical engines

HIPAA technical safeguards implementation

ML data pipelines — training data, feature stores, evaluation sets, feedback loops

We don’t require:

A data engineering bootcamp cert

Mastery of the entire “modern data stack”

Prior healthcare experience (but it helps)

A specific degree

To apply, submit:

1. Resume

2. Link to a data project you’ve built (GitHub, architecture diagram, write-up)

3. 200 words max: “Describe the messiest data problem you’ve encountered. How did you solve it?”


Position Details

Posted: Apr 16, 2026

Reference Number: 4cf389e72a6483eb

Employment: Full-time

Salary: Not Available

City: Exton

Job Origin: ziprecruiter


