Healthcare Data Engineer

VeeRteq Solutions LLC

Full Time, Mid-level

Washington, District of Columbia, US

Posted March 14, 2026

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring

Python, SQL, Terraform, Git, Spark, CI/CD, DevOps

Job Description

We are seeking a highly experienced Data Engineer to support the design and development of a large-scale Healthcare Analytics Data Lake that integrates clinical (HL7, FHIR, CCDA) and claims (EDI 837/835) data.

The engineer will work on-site as the primary technical liaison, supporting end-to-end development, collaborating with offshore teams and coordinating with client stakeholders for QA and deployment activities.

This role requires strong hands-on engineering skills, deep healthcare data expertise, and proficiency with modern cloud-native data architectures, specifically on Oracle Cloud Infrastructure (OCI).

Experience Level: 10+ Years

Location: Remote

Educational Qualification: Engineering Degree (BE/ME/BTech/MTech/BSc/MSc)

Technical Certification: Certifications in multiple technologies preferred.

Mandatory Skills

10+ years in Data Engineering with strong hands-on development experience. Expert-level skills in: Python, PySpark/Spark SQL, JSON and XML processing.
Experience with: EDI 837/835, HL7, CCDA, FHIR, Delta Lake, Parquet, schema evolution.
Strong understanding of: data modelling (healthcare CDMs), data governance, lineage, audit frameworks, metadata-driven architectures, data pipeline orchestration, cloud and DevOps.
Experience with cloud-native platforms, preferably OCI. Hands-on with Git, CI/CD, Terraform, DataOps automation.
Domain knowledge: familiarity with healthcare terminology standards such as LOINC, SNOMED, ICD, CPT, RxNorm.
Soft skills: strong communication, client-facing presence and ability to work independently onsite. Ability to coordinate offshore development teams. Excellent documentation and technical leadership capability.

Key Responsibilities

Data pipeline development: Design and implement large-scale data ingestion, parsing and transformation pipelines using Python, Spark, PySpark and Spark SQL. Build and optimize metadata-driven pipelines for flexible ingestion and transformation.
Process multi-format healthcare data including EDI 837/835, HL7 v2, CCDA and FHIR bundles.
Cloud-Native Engineering (OCI Preferred): Develop and operate data pipelines using OCI services: OCI Data Integration, OCI Data Flow (Spark), OCI Delta Lake, OCI Autonomous Database, and OCI Integration Engine for parsing clinical/claims data. Ensure performance tuning, scalability, cost optimization and production stability.
Data Lake and Medallion Architecture: Build Delta Lake/Parquet-based data lakes following the Medallion Architecture (Bronze-Silver-Gold). Implement CDC, schema evolution, data quality checks and validation frameworks.
Data Modeling & Healthcare Domain Expertise: Develop canonical clinical and claims data models aligned to healthcare CDMs. Map and normalize data to industry terminologies such as LOINC, SNOMED CT, ICD-9/10, CPT and RxNorm.
DevOps, DataOps & Orchestration: Implement CI/CD pipelines using Git, Terraform and automated deployment workflows. Develop orchestrations/workflows with built-in data lineage, auditability, monitoring and governance. Establish DataOps best practices for automated testing, observability and metadata management.
Onsite leadership and client coordination: Act as the primary onshore engineering lead between offshore teams and client stakeholders.
Facilitate handovers to QA for SIT/UAT, coordinate deployment cycles and support production readiness. Conduct architecture walkthroughs, design reviews and requirement mapping sessions.
