Senior Data Engineer (Data Foundations & Pipeline Automation)

OmniGTM.ai
Full Time · Senior · Hybrid
Richmond, British Columbia, CA · Posted March 5, 2026

Job Description

Senior Data Engineer (Data Foundations & Pipeline Automation) — OmniGTM.ai

Position Type: Full-time

Location: Hybrid (on-site in Richmond, BC)

About OmniGTM.ai

AEPG is building an AI-native Go-To-Market (GTM) platform designed to eliminate operational silos across marketing, sales, and customer success. OmniGTM.ai unifies CRM data, third-party enrichment, intent signals, user interactions, and activation workflows into a single, governed source of truth that enterprise teams can trust.

As a foundational engineering hire, you will directly influence the platform’s data architecture, pipeline automation standards, and database change management practices that enable reliable AI-driven workflows at scale.

Job Summary

We are hiring a Senior Data Engineer to design, implement, and operate OmniGTM.ai’s end-to-end data foundation—including automated pipeline orchestration and database change automation using Liquibase. You will own production-grade ingestion, transformation, CDC/streaming patterns, schema evolution, and observability across a polyglot stack (MongoDB Atlas + Amazon Aurora PostgreSQL + AWS services).

This role expects you to treat data pipelines and database migrations as first-class software: version-controlled, test-gated, repeatable across Dev/Stage/Prod, and resilient under real-world change.

Key Responsibilities

1) Automated Data Pipelines (Batch + Near-Real-Time)

  • Design and operate high-throughput ETL/ELT pipelines processing CRM objects, enrichment provider feeds, intent signals, product usage events, and activation outcomes.
  • Build distributed processing workflows using PySpark/Spark, EMR (or equivalent), and lakehouse patterns (Delta/Iceberg/Hudi as appropriate).
  • Implement CDC and incremental ingestion patterns (e.g., log-based capture, timestamp-based deltas, idempotent replay, late-arriving data handling); see the illustrative sketch after this list.
  • Use orchestration tooling such as Airflow/MWAA, AWS Step Functions, AWS Glue, or an equivalent orchestrator to ensure reliability and traceability.
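
To give a concrete feel for the incremental-ingestion and orchestration work described above, here is a minimal, hypothetical sketch of an hourly Airflow DAG that reads only the rows changed within the scheduled interval and hands them to an idempotent upsert. The DAG name, the "contacts" entity, and the upsert key are illustrative assumptions, not OmniGTM.ai specifics.

```python
# Hypothetical sketch only: an hourly, idempotent incremental-ingestion DAG
# using Airflow's TaskFlow API. Names here are illustrative assumptions.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@hourly",
    start_date=datetime(2026, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    tags=["crm", "incremental"],
)
def crm_contacts_incremental():
    @task
    def extract(data_interval_start=None, data_interval_end=None) -> dict:
        # Timestamp-based delta: read only rows whose updated_at falls inside
        # the scheduled interval. Re-running the same interval re-reads the
        # same slice, which keeps replays idempotent.
        return {
            "since": data_interval_start.isoformat(),
            "until": data_interval_end.isoformat(),
        }

    @task
    def load(window: dict) -> None:
        # Upsert (MERGE) on the natural key so retries and late-arriving rows
        # overwrite instead of duplicating; the real write would be a Spark
        # job or database connector rather than a print statement.
        print(f"Upserting contacts changed between {window['since']} and {window['until']}")

    load(extract())


crm_contacts_incremental()
```

Because the extract window is derived from the scheduled interval, re-running a failed interval replays exactly the same slice, which is what makes retries and backfills safe.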

2) Database Change Automation (Liquibase as a Standard)

  • Establish and enforce schema-as-code practices for Aurora PostgreSQL using Liquibase: versioned changelogs, rollback strategy, and environment promotion.
  • Build CI/CD gates for database changes: lint/validate, dry-run, diff checks, migration tests, and automated deployment to Dev/Stage/Prod (a sketch of such a gate follows this list).
  • Define safe patterns for schema evolution: backward compatible changes, feature-flagged rollout, dual-write/dual-read when needed, and zero/near-zero downtime migrations.
  • Maintain a clear boundary between schema migrations (Liquibase) and data backfills or bulk migrations (pipeline jobs), while supporting controlled reference/seed data loads where appropriate.
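
As a sketch of the CI/CD gate idea above (not a prescribed implementation), the script below runs Liquibase's validate, status, and dry-run steps before any automated `update` is allowed. It assumes the Liquibase CLI is on PATH and that connection details come from a liquibase.properties file or `LIQUIBASE_COMMAND_*` environment variables; the changelog path is a placeholder.

```python
# Hypothetical CI gate for database changes. Assumes the Liquibase CLI is on
# PATH and that connection settings come from liquibase.properties or
# LIQUIBASE_COMMAND_* environment variables; the changelog path is a placeholder.
import subprocess
import sys

CHANGELOG = "db/changelog/db.changelog-master.yaml"  # illustrative path


def liquibase(*args: str) -> None:
    print(f"$ liquibase {' '.join(args)}")
    subprocess.run(["liquibase", *args], check=True)


def main() -> int:
    try:
        # Gate 1: the changelog parses and its changesets are structurally valid.
        liquibase("validate", f"--changelog-file={CHANGELOG}")
        # Gate 2: list pending changesets so reviewers see exactly what will run.
        liquibase("status", f"--changelog-file={CHANGELOG}")
        # Gate 3: dry run - generate the SQL that would be applied, to attach
        # to the pull request; the actual "update" runs only after approval.
        liquibase("update-sql", f"--changelog-file={CHANGELOG}")
    except subprocess.CalledProcessError as exc:
        print(f"Liquibase gate failed: {exc}", file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main())
```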

3) Unified Data Foundation & Data Contracts

  • Own schema design and data modeling across MongoDB Atlas and Aurora PostgreSQL, aligned to a unified GTM data model.
  • Implement data contracts and validation: required fields, allowed values, nullability policies, and compatibility rules for producers/consumers (see the sketch after this list).
  • Define a pragmatic strategy for “source of truth” across systems (CRM vs enrichment vs inferred/AI-enriched fields), including conflict resolution and lineage.
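
A data contract in this sense can be as simple as an executable rule set shared by producers and consumers. The sketch below checks required fields, nullability, and allowed values for a hypothetical "account" record; the field names and value lists are illustrative, not the platform's actual GTM model.

```python
# Illustrative data-contract check for an inbound "account" record. Field
# names, allowed values, and nullability rules are assumptions for the sketch.
from dataclasses import dataclass


@dataclass
class ContractViolation:
    field_name: str
    reason: str


REQUIRED_FIELDS = {"account_id", "domain", "lifecycle_stage"}
NULLABLE_FIELDS = {"industry", "employee_count"}
ALLOWED_LIFECYCLE_STAGES = {"lead", "mql", "sql", "opportunity", "customer"}


def validate_account(record: dict) -> list[ContractViolation]:
    violations: list[ContractViolation] = []
    for name in REQUIRED_FIELDS:
        if name not in record:
            violations.append(ContractViolation(name, "required field missing"))
    for name, value in record.items():
        if value is None and name not in NULLABLE_FIELDS:
            violations.append(ContractViolation(name, "null not allowed"))
    stage = record.get("lifecycle_stage")
    if stage is not None and stage not in ALLOWED_LIFECYCLE_STAGES:
        violations.append(ContractViolation("lifecycle_stage", f"unknown value {stage!r}"))
    return violations


# Producers run the same checks before publishing that consumers apply on read,
# which is what keeps both sides compatible as the contract evolves.
assert validate_account(
    {"account_id": "a-1", "domain": "example.com", "lifecycle_stage": "mql"}
) == []
```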

4) AI/ML Enablement (Data for Scoring, Personalization, and Analytics)

  • Build datasets and feature pipelines to support predictive scoring, segmentation, propensity models, and personalization loops.
  • Provide AI teams with high-quality, explainable, low-latency data for training, offline evaluation, and production inference.
  • Create feedback loops from activation outcomes (opens/clicks/replies/conversions) back into scoring and recommendation systems.
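
As a rough illustration of the feedback loop described in the last bullet, the PySpark job below rolls 30 days of activation events up into per-contact engagement features that a scoring model could consume. Paths, column names, and the 30-day window are assumptions for the sketch.

```python
# Hedged sketch: rolling activation outcomes (opens/clicks/replies) up into
# per-contact engagement features for downstream scoring. Paths, column names,
# and the 30-day window are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("engagement_features").getOrCreate()

events = spark.read.parquet("s3://example-lake/activation_events/")  # assumed path

features = (
    events
    .where(F.col("event_ts") >= F.date_sub(F.current_date(), 30))
    .groupBy("contact_id")
    .agg(
        F.count(F.when(F.col("event_type") == "open", 1)).alias("opens_30d"),
        F.count(F.when(F.col("event_type") == "click", 1)).alias("clicks_30d"),
        F.count(F.when(F.col("event_type") == "reply", 1)).alias("replies_30d"),
        F.max("event_ts").alias("last_touch_ts"),
    )
)

# Written back to the lake so scoring models and the activation loop read the
# same versioned feature set.
features.write.mode("overwrite").parquet("s3://example-lake/features/engagement_30d/")
```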

5) Governance, Observability, and Cost/Performance Optimization

  • Implement monitoring and alerting for pipeline health, freshness, correctness, and SLA adherence (CloudWatch/X-Ray and/or equivalents).
  • Define data quality checks (duplicates, drift, anomalies, referential integrity) and operational runbooks for incidents; an illustrative check is sketched after this list.
  • Continuously optimize compute and storage: Spark tuning, partitioning strategy, indexing, query plans, and cost controls across AWS + databases.
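
As an example of the data quality checks mentioned above, the sketch below runs two pre-publish gates on a curated table: primary-key uniqueness and freshness against a two-hour SLA. The table path, key column, and threshold are illustrative assumptions.

```python
# Illustrative pre-publish data quality gate: primary-key uniqueness and
# freshness against a two-hour SLA. The table path, key column, and threshold
# are assumptions for the sketch.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()

contacts = spark.read.parquet("s3://example-lake/curated/contacts/")  # assumed path

# Check 1: primary-key uniqueness - duplicate keys usually mean a broken upsert.
duplicate_keys = (
    contacts.groupBy("contact_id").count().where(F.col("count") > 1).count()
)

# Check 2: freshness - at least one row must have landed inside the SLA window.
fresh_rows = contacts.where(
    F.col("updated_at") >= F.expr("current_timestamp() - INTERVAL 2 HOURS")
).count()

if duplicate_keys > 0 or fresh_rows == 0:
    # In a real pipeline this fails the task and pages on-call per the runbook.
    raise RuntimeError(
        f"Data quality gate failed: duplicate_keys={duplicate_keys}, fresh_rows={fresh_rows}"
    )
```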

6) Collaboration and Technical Leadership

  • Partner with Solution Architecture, AI Engineering, and Product leadership to align data systems with business workflows and GTM lifecycle requirements.
  • Mentor engineers on modern data engineering practices: reproducibility, idempotency, schema evolution, CI/CD for data, and operational excellence.
  • Influence platform strategy and architectural decisions with clear tradeoffs and written decision records.

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field (or equivalent experience).
  • 7+ years of experience building and operating production-scale data pipelines.
  • Strong Python skills for pipeline development, automation, and services.
  • Strong SQL skills and hands-on experience with relational systems (preferably Aurora PostgreSQL).
  • Demonstra
