Senior Data Engineer (Data Foundations & Pipeline Automation)
Job Description
Senior Data Engineer (Data Foundations & Pipeline Automation) — OmniGTM.ai
Position Type: Full-time
Location: Hybrid (on-site in Richmond, BC)
About OmniGTM.ai
AEPG is building an AI-native Go-To-Market (GTM) platform designed to eliminate operational silos across marketing, sales, and customer success. OmniGTM.ai unifies CRM data, third-party enrichment, intent signals, user interactions, and activation workflows into a single, governed source of truth that enterprise teams can trust.
As a foundational engineering hire, you will directly influence the platform’s data architecture, pipeline automation standards, and database change management practices that enable reliable AI-driven workflows at scale.
Job Summary
We are hiring a Senior Data Engineer to design, implement, and operate OmniGTM.ai’s end-to-end data foundation—including automated pipeline orchestration and database change automation using Liquibase. You will own production-grade ingestion, transformation, CDC/streaming patterns, schema evolution, and observability across a polyglot stack (MongoDB Atlas + Amazon Aurora PostgreSQL + AWS services).
This role expects you to treat data pipelines and database migrations as first-class software: version-controlled, test-gated, repeatable across Dev/Stage/Prod, and resilient under real-world change.
Key Responsibilities
1) Automated Data Pipelines (Batch + Near-Real-Time)
- Design and operate high-throughput ETL/ELT pipelines processing CRM objects, enrichment provider feeds, intent signals, product usage events, and activation outcomes.
- Build distributed processing workflows using PySpark/Spark, EMR (or equivalent), and lakehouse patterns (Delta/Iceberg/Hudi as appropriate).
- Implement CDC and incremental ingestion patterns (e.g., log-based capture, timestamp-based deltas, idempotent replay, late-arriving data handling); a minimal sketch follows this list.
- Use orchestration tooling such as Airflow/MWAA, AWS Step Functions, AWS Glue, or an equivalent orchestrator to ensure reliability and traceability.
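As a concrete illustration of the incremental-ingestion expectations above, the following is a minimal sketch of a timestamp-based delta load with idempotent upserts into Aurora PostgreSQL. The table and column names (staging_contacts, crm_contacts, etl_watermarks, updated_at) are illustrative assumptions, not OmniGTM.ai's actual schema.

```python
# Hypothetical sketch: timestamp-based incremental ingestion with idempotent upserts.
# Table/column names (crm_contacts, etl_watermarks, updated_at) are illustrative only.
from datetime import datetime, timezone

import psycopg2
import psycopg2.extras


def load_watermark(cur, source: str) -> datetime:
    """Read the last successfully processed updated_at for a source."""
    cur.execute(
        "SELECT last_updated_at FROM etl_watermarks WHERE source = %s", (source,)
    )
    row = cur.fetchone()
    return row[0] if row else datetime(1970, 1, 1, tzinfo=timezone.utc)


def upsert_contacts(cur, rows) -> None:
    """Idempotent write: replaying the same batch produces the same end state."""
    psycopg2.extras.execute_values(
        cur,
        """
        INSERT INTO crm_contacts (contact_id, email, stage, updated_at)
        VALUES %s
        ON CONFLICT (contact_id) DO UPDATE
          SET email = EXCLUDED.email,
              stage = EXCLUDED.stage,
              updated_at = EXCLUDED.updated_at
        WHERE EXCLUDED.updated_at >= crm_contacts.updated_at  -- skip stale, late-arriving versions
        """,
        rows,
    )


def run_increment(source_conn, target_conn, source: str = "crm") -> None:
    with source_conn.cursor() as src, target_conn.cursor() as tgt:
        watermark = load_watermark(tgt, source)
        # Pull only records changed since the watermark (timestamp-based delta).
        src.execute(
            "SELECT contact_id, email, stage, updated_at "
            "FROM staging_contacts WHERE updated_at > %s ORDER BY updated_at",
            (watermark,),
        )
        rows = src.fetchall()
        if rows:
            upsert_contacts(tgt, rows)
            new_mark = max(r[3] for r in rows)
            tgt.execute(
                "INSERT INTO etl_watermarks (source, last_updated_at) VALUES (%s, %s) "
                "ON CONFLICT (source) DO UPDATE SET last_updated_at = EXCLUDED.last_updated_at",
                (source, new_mark),
            )
        target_conn.commit()
```

Replaying the same batch is safe here because the upsert only overwrites rows with an older updated_at, which is the property that makes retries and backfills uneventful.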
2) Database Change Automation (Liquibase as a Standard)
- Establish and enforce schema-as-code practices for Aurora PostgreSQL using Liquibase: versioned changelogs, rollback strategy, and environment promotion.
- Build CI/CD gates for database changes: lint/validate, dry-run, diff checks, migration tests, and automated deployment to Dev/Stage/Prod (see the sketch after this list).
- Define safe patterns for schema evolution: backward compatible changes, feature-flagged rollout, dual-write/dual-read when needed, and zero/near-zero downtime migrations.
- Maintain a clear boundary between schema migrations (Liquibase) and data backfills or bulk migrations (pipeline jobs), while supporting controlled reference/seed data where appropriate.
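To make the CI/CD gating concrete, here is a minimal sketch of a promotion script wrapping the Liquibase CLI. The validate, status, update-sql, and update verbs are standard Liquibase commands; the per-environment defaults-file naming and the promotion order are assumptions for illustration.

```python
# Hypothetical CI gate that promotes Liquibase changelogs through environments.
# The liquibase verbs (validate, status, update-sql, update) are standard CLI commands;
# the defaults-file naming convention and environment list are assumptions for this sketch.
import subprocess
import sys

ENVIRONMENTS = ["dev", "stage", "prod"]  # promotion order


def liquibase(env: str, *args: str) -> subprocess.CompletedProcess:
    """Run a Liquibase CLI command against one environment's connection settings."""
    cmd = ["liquibase", f"--defaults-file=liquibase.{env}.properties", *args]
    print("+", " ".join(cmd))
    return subprocess.run(cmd, capture_output=True, text=True)


def gate(env: str) -> bool:
    """Lint/validate the changelog, list pending changesets, and produce a dry-run SQL artifact."""
    checks = [
        ("validate",),            # changelog syntax and checksum validation
        ("status", "--verbose"),  # which changesets would run
        ("update-sql",),          # dry run: render SQL without applying it
    ]
    for args in checks:
        result = liquibase(env, *args)
        if result.returncode != 0:
            print(result.stderr, file=sys.stderr)
            return False
    return True


def deploy(env: str) -> bool:
    """Apply pending changesets only after the gate passes."""
    return liquibase(env, "update").returncode == 0


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "dev"
    if target not in ENVIRONMENTS:
        sys.exit(f"unknown environment: {target}")
    if not gate(target):
        sys.exit("Liquibase gate failed; aborting deployment")
    if not deploy(target):
        sys.exit("Liquibase update failed")
```

In practice the gate output (the rendered SQL from the dry run in particular) would be attached to the pull request or pipeline run so reviewers can see exactly what will hit each environment.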
3) Unified Data Foundation & Data Contracts
- Own schema design and data modeling across MongoDB Atlas and Aurora PostgreSQL, aligned to a unified GTM data model.
- Implement data contracts and validation: required fields, allowed values, nullability policies, and compatibility rules for producers/consumers (a validation sketch follows this list).
- Define a pragmatic strategy for “source of truth” across systems (CRM vs enrichment vs inferred/AI-enriched fields), including conflict resolution and lineage.
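As one way to express the contract rules above, the sketch below validates a single record in plain Python; the field names, allowed lifecycle stages, and nullability sets are hypothetical and stand in for the real GTM data model.

```python
# Hypothetical data-contract check for a unified "account" record.
# Field names, allowed values, and nullability rules are illustrative, not OmniGTM.ai's actual model.
from dataclasses import dataclass
from typing import Any

ALLOWED_LIFECYCLE_STAGES = {"lead", "mql", "sql", "opportunity", "customer", "churned"}
REQUIRED_FIELDS = {"account_id", "domain", "lifecycle_stage"}
NULLABLE_FIELDS = {"intent_score", "enrichment_source"}  # optional producer fields


@dataclass
class ContractViolation:
    field: str
    reason: str


def validate_account(record: dict[str, Any]) -> list[ContractViolation]:
    """Return every way this record breaks the contract (an empty list means it passes)."""
    violations: list[ContractViolation] = []

    # Required fields must be present and non-null.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            violations.append(ContractViolation(field, "required field is missing or null"))

    # Allowed values: lifecycle_stage must come from the agreed vocabulary.
    stage = record.get("lifecycle_stage")
    if stage is not None and stage not in ALLOWED_LIFECYCLE_STAGES:
        violations.append(ContractViolation("lifecycle_stage", f"unexpected value {stage!r}"))

    # Compatibility: fields outside the required/nullable sets are not covered by the contract.
    for field in record:
        if field not in REQUIRED_FIELDS and field not in NULLABLE_FIELDS:
            violations.append(ContractViolation(field, "field not covered by the contract"))

    return violations


if __name__ == "__main__":
    sample = {"account_id": "a-123", "domain": "example.com", "lifecycle_stage": "pipeline"}
    for v in validate_account(sample):
        print(f"{v.field}: {v.reason}")
```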
4) AI/ML Enablement (Data for Scoring, Personalization, and Analytics)
- Build datasets and feature pipelines to support predictive scoring, segmentation, propensity models, and personalization loops.
- Provide AI teams with high-quality, explainable, low-latency data for training, offline evaluation, and production inference.
- Create feedback loops from activation outcomes (opens/clicks/replies/conversions) back into scoring and recommendation systems; a sketch of such a feature job follows this list.
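The feedback-loop idea is easiest to see in code: the following hypothetical PySpark job rolls activation outcomes up into per-account engagement features that scoring models can consume. Paths, table layout, and feature definitions are assumptions for this sketch.

```python
# Hypothetical PySpark feature job: roll activation outcomes (opens/clicks/replies/conversions)
# up to per-account engagement features for downstream scoring models.
# Paths, column names, and feature definitions are assumptions for this sketch.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("activation-feedback-features").getOrCreate()

# Activation outcome events: one row per (account_id, event_type, event_ts).
events = (
    spark.read.parquet("s3://example-bucket/activation_outcomes/")
    .where(F.col("event_ts") >= F.date_sub(F.current_date(), 90))  # trailing 90-day window
)

features = (
    events
    .groupBy("account_id")
    .agg(
        F.count(F.when(F.col("event_type") == "open", True)).alias("opens_90d"),
        F.count(F.when(F.col("event_type") == "click", True)).alias("clicks_90d"),
        F.count(F.when(F.col("event_type") == "reply", True)).alias("replies_90d"),
        F.count(F.when(F.col("event_type") == "conversion", True)).alias("conversions_90d"),
        F.max("event_ts").alias("last_engagement_ts"),
    )
    # Simple derived ratio feature; guard against divide-by-zero.
    .withColumn(
        "click_through_rate",
        F.when(F.col("opens_90d") > 0, F.col("clicks_90d") / F.col("opens_90d")).otherwise(F.lit(0.0)),
    )
)

# Publish a versioned, partitioned snapshot that scoring/recommendation jobs can read back.
features.write.mode("overwrite").parquet("s3://example-bucket/features/account_engagement/")
```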
5) Governance, Observability, and Cost/Performance Optimization
- Implement monitoring and alerting for pipeline health, freshness, correctness, and SLA adherence (CloudWatch/X-Ray and/or equivalents).
- Define data quality checks (duplicates, drift, anomalies, referential integrity) and operational runbooks for incidents; a minimal check sketch follows this list.
- Continuously optimize compute and storage: Spark tuning, partitioning strategy, indexing, query plans, and cost controls across AWS + databases.
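A minimal sketch of the kind of data-quality checks meant here, assuming psycopg2 against Aurora PostgreSQL and boto3 for CloudWatch metrics; the check queries, table names, and metric namespace are illustrative.

```python
# Hypothetical data-quality monitor: run a few SQL checks against Aurora PostgreSQL and
# publish the results as CloudWatch metrics so alarms can page on duplicate/freshness regressions.
# The check queries, table names, and metric namespace are illustrative assumptions.
import boto3
import psycopg2

CHECKS = {
    # Duplicates: anything above zero suggests a broken idempotency/merge path.
    "duplicate_contact_ids": """
        SELECT COUNT(*) FROM (
            SELECT contact_id FROM crm_contacts GROUP BY contact_id HAVING COUNT(*) > 1
        ) d
    """,
    # Freshness: minutes since the newest record landed.
    "contacts_staleness_minutes": """
        SELECT EXTRACT(EPOCH FROM (NOW() - MAX(updated_at))) / 60 FROM crm_contacts
    """,
    # Referential integrity: contacts pointing at accounts that do not exist.
    "orphaned_contacts": """
        SELECT COUNT(*) FROM crm_contacts c
        LEFT JOIN accounts a ON a.account_id = c.account_id
        WHERE a.account_id IS NULL
    """,
}


def run_checks(dsn: str, namespace: str = "OmniGTM/DataQuality") -> dict[str, float]:
    """Execute each check query and publish its value as a CloudWatch custom metric."""
    results: dict[str, float] = {}
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        for name, query in CHECKS.items():
            cur.execute(query)
            results[name] = float(cur.fetchone()[0] or 0)

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[{"MetricName": name, "Value": value} for name, value in results.items()],
    )
    return results
```

CloudWatch alarms on these metrics would then drive the alerting and runbook flow described above.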
6) Collaboration and Technical Leadership
- Partner with Solution Architecture, AI Engineering, and Product leadership to align data systems with business workflows and GTM lifecycle requirements.
- Mentor engineers on modern data engineering practices: reproducibility, idempotency, schema evolution, CI/CD for data, and operational excellence.
- Influence platform strategy and architectural decisions with clear tradeoffs and written decision records.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related field (or equivalent experience).
- 7+ years of experience building and operating production-scale data pipelines.
- Strong Python skills for pipeline development, automation, and services.
- Strong SQL skills and hands-on experience with relational systems (preferably Aurora PostgreSQL).
- Demonstra