Remote Cloud Data Engineer (Must have experience with PySpark) with Security

Advantech GS Enterprises, Inc.

Contract mid

Fort Meade, Maryland, US

Posted 3 days ago

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring

Python, SQL, AWS, Azure, REST, Spark, Agile, CI/CD, DevOps, API

Job Description

Data Cloud Engineer

Location: Fort Meade, MD (remote)

Company: Advantech GS Enterprises

Program: DISA NEXUS

Clearance Required: Active Secret Clearance

Employment Type: Full-Time

Bonus: $1,000 Sign-On Bonus

Position Overview

Advantech GS Enterprises is seeking a highly skilled Data Cloud Engineer to support the DISA NEXUS program at Fort Meade. This role will focus on designing, building, and maintaining scalable cloud-based data solutions supporting enterprise modernization and mission-critical analytics within secure DoD environments. The ideal candidate will have strong experience developing production-grade data pipelines within Azure and/or AWS cloud environments, with expertise in PySpark, Spark SQL, Python, and modern data engineering best practices.

This position offers the opportunity to contribute to a long-term federal modernization effort centered around multi-cloud integration, secure data architecture, and advanced analytics capabilities.

Key Responsibilities

Design, develop, and maintain scalable cloud-based ETL/ELT pipelines using Azure Synapse Analytics, Databricks, AWS Glue, and related technologies

Build and optimize large-scale data transformations using PySpark and Spark SQL, applying best practices for partitioning, query optimization, and performance tuning

Develop and support data ingestion frameworks for both structured data (relational tables, CSV files) and unstructured data (JSON, nested structures, REST API integrations)

Implement full and incremental data loading strategies, including change data capture (CDC), late-arriving record handling, and rerunnable pipelines

Design and maintain cloud-based data lakes, warehouses, and analytics-ready datasets supporting enterprise reporting and operational decision-making

Implement data quality and governance controls including schema validation, schema enforcement, schema drift handling, RBAC, lineage, cataloging, and credential management

Monitor and troubleshoot pipelines for latency and failures, and maintain logging, alerting, and operational reliability

Support CI/CD pipeline implementation for automated deployments, rollback strategies, and environment promotion processes

Collaborate with cybersecurity and cloud engineering teams to ensure compliance with RMF, STIG, FedRAMP, and DoD security standards

Utilize Oracle databases and cloud-native tools to support data migration, integration, and modernization initiatives

Support Agile development efforts and collaborate with DevOps and software engineering teams across the program lifecycle

Required Qualifications

Active Secret Clearance required

Bachelor’s degree in Computer Science, Data Science, Engineering, Information Systems, or related technical field

5+ years of recent experience designing and operating scalable, production-grade data pipelines using Azure Synapse Analytics and/or Databricks

Strong hands-on experience with PySpark and Spark SQL for large-scale transformations and optimization

Advanced proficiency in Python and SQL for data querying, automation, and analysis

Experience ingesting and integrating structured and unstructured datasets from databases, flat files, REST APIs, and external systems

Experience implementing full and incremental load strategies including CDC concepts and rerunnable pipeline architectures

Experience with data quality controls including schema validation, enforcement, and schema drift handling

Experience with pipeline monitoring, logging, alerting, and operational support

Experience implementing CI/CD pipelines and automated deployment processes

Knowledge of data governance and secure access management concepts including RBAC and credential management

Hands-on experience with Azure and/or AWS cloud data services such as Azure Synapse, Azure Data Factory, Databricks, AWS Glue, Redshift, and S3

