Senior Data Engineer

Cialfo
Full Time, Senior
Delhi, India
Posted 2 days ago

Job Description

<h2><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">About Manifest Global</span></h2> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Manifest Global is building the infrastructure for global human capital mobility - connecting students, schools, universities, and employers across 50+ countries. Our portfolio spans <strong>Cialfo</strong> (AI-powered college counseling, 2,000+ schools), <strong>BridgeU</strong> (university guidance for international schools globally), <strong>Kaaiser</strong> (trusted study abroad counseling across India and Southeast Asia), and <strong>Explore</strong> (AI-powered university outreach, 1,000+ university partners). Together, we move talent across borders at scale. $80M raised. Still early.</span></p> <h2><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">About This Role</span></h2> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Manifest Global operates four brands across 50+ countries, generating data across thousands of schools, hundreds of thousands of students, and 1,000+ university partners. Counselor behaviour, student application journeys, university conversion rates, placement outcomes, attribution revenue - it's all there. The data exists. The question is whether the infrastructure around it is good enough to make it useful.</span></p> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Right now, the data platform works. Pipelines run, the warehouse holds data, the BI layer surfaces reports. But Manifest is growing - new brands, new markets, new activation use cases - and the infrastructure needs to scale with it. There are pipelines that need to be more reliable. Transformation logic that needs to be cleaner. Warehouse design that needs to handle more volume without degrading performance. 
And an activation layer - reverse ETL, operational analytics, data flowing into the tools the business actually uses - that is still being built.</span></p> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">As a <strong>Senior Data Engineer</strong>, you will own significant parts of the data platform end to end - ingestion, transformation, warehouse, activation - and you will be one of the people who determines whether Manifest's data infrastructure is a genuine competitive advantage or a persistent constraint. You will work closely with Principal Engineers, Product, and business stakeholders across all four brands, and you will be expected to operate with the ownership and judgment of someone who has built production-grade data systems before.</span></p> <blockquote> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>What makes this role different:</strong> Manifest has real data - cross-brand, multi-geography, commercially significant data. The stack is modern: Snowflake, dbt, Hevo, Airtable, Metabase. The problems are real. And when the data infrastructure surfaces the right insight, it changes a decision that affects real students and real institutions.</span></p> </blockquote> <blockquote> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>AI is central to how we build:</strong> This isn't just a data engineering role - it is a role where you will actively design and build AI infrastructure that accelerates the team's own development velocity. We use <strong>Snowflake Cortex AI with Claude</strong> in our daily engineering workflow - for debugging, RCA, query optimisation, and pipeline analysis. We have already cut root cause analysis time. 
The next step is embedding AI deeper: automated ticket handling, intelligent monitoring, and AI-assisted development tooling that lets the team move faster without sacrificing reliability.</span></p> </blockquote> <h2><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">What You Will Own</span></h2> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">1. AI Infrastructure for Data Engineering</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Design and build AI-assisted development tooling</strong> - LLM-powered code generation for dbt models, SQL transformations, and pipeline scaffolding that dramatically reduces time-to-production for new data assets</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Build intelligent data quality and anomaly detection systems</strong> - AI-driven monitoring that learns normal patterns across pipelines and surfaces anomalies before they propagate downstream, replacing manual threshold-based alerting</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Implement AI-augmented data cataloguing and lineage</strong> - automated documentation generation, schema understanding, and semantic tagging so engineers spend less time writing docs and more time building</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Develop AI-powered pipeline debugging and root cause analysis</strong> - tooling that diagnoses failures, traces impact through the DAG, and proposes fixes rather than requiring engineers to trace failures manually</span></li> 
<li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Build and maintain the infrastructure that supports AI features</strong> - vector stores, embedding pipelines, retrieval layers, and model serving infrastructure that powers AI capabilities across Cialfo, BridgeU, and Explore</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Evaluate and adopt emerging AI developer tools</strong> - stay ahead of how AI tooling (Claude, Cortex AI, GitHub Copilot, LLM APIs) can be embedded into the team's workflow to shorten feedback loops and accelerate feature delivery</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">2. Data Warehouse Design, Cost &amp; Maintenance</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Own significant portions of the Snowflake data warehouse - schema design, performance optimisation, and the integrity of the data models that the rest of the stack depends on</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Apply strong data warehousing methodologies: dimensional modelling, layered transformation logic, clear separation between raw, staged, and served layers</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Design and build cross-brand data primitives</strong> - shared, canonical data layers for K12, Student, University, and Application data that work consistently across Cialfo, BridgeU, and Kaaiser. 
This is active work and a critical foundation for the multi-brand data platform</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Own Snowflake cost optimisation</strong> - monitor warehouse spend, identify high-cost queries and sync jobs, right-size warehouse configurations, and drive measurable reductions in monthly compute spend.</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Ensure the warehouse handles increasing data volumes from across all four brands without degrading query performance or downstream reliability</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">3. ETL/ELT Pipelines, Scheduling &amp; Transformation Logic</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Design, build, and maintain production-grade data pipelines - ingestion via Hevo or similar, transformation via dbt, SQL-based logic that is clean, documented, and maintainable</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Design and manage Snowflake Task DAGs</strong> - build and maintain dependency-chained task graphs for ingestion, LLM processing, and sync workflows. 
Understand how to structure root tasks, child tasks, CRON scheduling, warehouse assignment, and failure isolation so pipelines don't cascade-fail</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Own Airtable as an operational data layer</strong> - manage Snowflake-to-Airtable and Airtable-to-Snowflake sync workflows, including the <code>SNOWFLAKE_TO_AIRTABLE_TABLELIST</code> sync config, upsert logic, incremental filters, and sync cost optimisation. Airtable is both a key data source and a reporting destination across brands</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Build and own <strong>reverse ETL workflows</strong> that activate warehouse data into operational tools - getting the right data into the hands of the teams that need it, not just into dashboards</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Take full ownership of pipeline failures: root cause identification, fix, downstream impact analysis, and prevention - not just resolution</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">4. 
Data Quality &amp; Reliability</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Define and enforce data quality standards across datasets you own - automated validations, delta checks, row counts, time-based monitoring</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Build monitoring and alerting that surfaces problems before they reach the business</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Document data lineage, transformation assumptions, and technical decisions so the platform is understandable and maintainable as the team grows</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">5. BI Platform &amp; Reporting</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Maintain and enhance the existing BI layer (Metabase), and build reporting interfaces that non-technical stakeholders can actually use</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Collaborate with Product and Analytics teams to translate business needs into reliable technical solutions</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Communicate clearly across all four brands - be the person who can explain a data problem in business terms and a business problem in data terms</span></li> </ul> <h2><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">What Success Looks 
Like</span></h2> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">You'll start by building a complete picture of the current state - which pipelines are fragile, where data quality is inconsistent, what the highest-impact improvements look like, and where activation use cases aren't yet built. You will have a point of view on where to move first.</span></p> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">From there, the infrastructure will be measurably more reliable. Pipelines that were breaking will run consistently. Data quality issues will be caught early or prevented entirely. The BI layer will be getting used - not just maintained.</span></p> <p><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Over time, the data platform will be something the business genuinely relies on - fast enough to support the pace of growth, reliable enough that data consumers trust what they are looking at, and built in a way that the next engineer who joins can understand and extend without starting from scratch.</span></p> <h2><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">About You</span></h2> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Qualifications</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Bachelor's degree in Computer Science, Engineering, or a related field - or equivalent practical experience</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>5+ years</strong> building and maintaining production-grade data pipelines and data warehouses - not prototypes, but systems that real business decisions 
depend on</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Strong experience architecting and implementing large-scale business intelligence solutions</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Strong experience in data warehouse design and advanced SQL</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Hands-on experience with the <strong>modern data stack</strong>: Snowflake (or equivalent cloud warehouse), dbt, Hevo or Airbyte or similar</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Advanced SQL proficiency: CTEs, window functions, <code>QUALIFY</code>, query optimisation, and understanding the difference between SQL that works and SQL that scales</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience with <strong>Snowflake-native features</strong>: Task DAGs, Dynamic Tables, Snowflake Cortex AI, <code>AI_COMPLETE</code>, warehouse sizing, and query profiling</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience with <strong>Airtable as an operational data layer</strong> - syncing data between Snowflake and Airtable, managing upsert logic, and keeping sync costs low</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience owning <strong>Snowflake cost monitoring and optimisation</strong> - 
identifying expensive queries, bloated sync jobs, and warehouse over-provisioning</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience with user-facing BI tools: Metabase, Looker, or similar</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Experience working in agile, ticket-based workflows (Jira)</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>Experience building or integrating AI/LLM tooling</strong> into data engineering workflows - specifically Snowflake Cortex AI, or LLM APIs (OpenAI, Anthropic, etc.) for structured data tasks, fuzzy matching, or pipeline automation</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Familiarity with <strong>vector databases, embedding pipelines, or retrieval-augmented generation (RAG)</strong> infrastructure is a strong plus</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Exposure to <strong>MLOps or AI infrastructure patterns</strong> - model serving, feature stores, or AI monitoring - is an advantage</span></li> </ul> <h3><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Skills &amp; Qualities</span></h3> <ul> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">You take ownership of reliability. 
When a pipeline fails, you don't just fix the immediate problem - you understand why it happened, what it affected downstream, and what needs to change so it doesn't happen again</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">You communicate clearly with non-technical stakeholders. You've sat in a requirements conversation with a commercial or product team, understood what they were actually asking for, and built something that answered the real question</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">You document your work well enough that someone else could maintain it - data lineage, transformation assumptions, technical decisions</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Comfortable working in a multi-brand, multi-stakeholder environment where data problems span different systems, teams, and geographies</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Strong problem-solving skills and attention to detail</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;">Ability to hold multiple priorities simultaneously and make good judgments about what to work on first</span></li> <li style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"><strong>You're genuinely excited about AI as a force multiplier for engineering teams</strong> - you actively use AI tools in your own workflow and have a point of view on how they can make data teams faster 
without sacrificing reliability</span></li> </ul> <p><span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><strong>Why Manifest</strong></span></p> <p><span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;">We're building the infrastructure for global human capital mobility - the rails that move students, schools, universities, and employers across 50+ countries. Cialfo is in 2,000+ schools. Explore is trusted by 1,000+ universities. BridgeU runs across the UK, Europe, and the Middle East. Kaaiser has guided students across India and Southeast Asia since 1997.</span></p> <p><span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;">The opportunity is real. $700B flows annually in remittances from migrant workers. 85M workers will be missing from developed economies by 2030. We're building the operating system that changes that.</span></p> <p><span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"><strong>$80M raised</strong> from Tiger Global, SIG, and Square Peg. Still early.</span></p> <p><span style="font-family: arial, helvetica, sans-serif; font-size: 10pt;">The team has already built the infrastructure for AI-native engineering - shared conventions, a live skills library, AI-assisted workflows across engineering, QE, product, and design. Saige is in production. Explore's AI capabilities are in production. This isn't an aspiration we're hiring you to bring to life. It's an operating system we're hiring you to extend, scale, and make permanent.</span></p> <p>&nbsp;</p>

About Cialfo

Cialfo

cialfo.co

Data Engineering, On-site