<p><strong>About Us</strong></p>
<p>HG Insights is the global leader in technology intelligence, delivering actionable, AI-driven insights through advanced data science and scalable big data solutions. Our platform informs go-to-market decisions, influencing how businesses spend millions in marketing and sales budgets.</p>
<p><strong>What You’ll Do</strong></p>
<ul>
<li>Design, build, and optimize large-scale distributed data pipelines that process billions of events to and from multiple data sources.</li>
<li>Architect and scale enterprise-grade big-data systems, including data lakes, ETL/ELT workflows, and syndication platforms for customer-facing products.</li>
<li>Orchestrate pipelines and workflows with Airflow.</li>
<li>Write optimized data into analytical databases such as ClickHouse, DuckDB, and Redshift.</li>
<li>Ensure data quality, consistency, and reliability across the pipeline.</li>
<li>Monitor pipeline performance and troubleshoot data issues.</li>
<li>Collaborate with product teams to develop features across databases and backend services.</li>
<li>Implement cutting-edge solutions for data ingestion, transformation, and analytics.</li>
<li>Drive system reliability through automation, CI/CD pipelines (Docker, Kubernetes, Terraform), and infrastructure-as-code practices.</li>
</ul>
<p><strong>What You’ll Be Responsible For</strong></p>
<ul>
<li>Developing the data side of our platform, ensuring scalability, performance, and cost-efficiency across distributed systems.</li>
<li>Collaborating in agile workflows (daily stand-ups, sprint planning) to deliver features rapidly while maintaining system stability.</li>
<li>Ensuring security and compliance across data workflows, including access controls, encryption, and governance policies.</li>
</ul>
<p><strong>What You’ll Need</strong></p>
<ul>
<li>BS/MS/PhD in Computer Science or a related field, with 7+ years of experience building production-grade big data systems.</li>
<li>Strong SQL skills and solid data modeling fundamentals, with hands-on experience building ETL/ELT pipelines.</li>
<li>Proficiency in Python for data processing and integrations.</li>
<li>Familiarity with dbt, Airflow, Databricks, and modern analytics engineering practices.</li>
</ul>
<p><strong>Nice-to-Haves</strong></p>
<ul>
<li>Knowledge of data governance frameworks and compliance standards (GDPR, CCPA).</li>
<li>Contributions to open-source big data projects or published technical blogs/papers.</li>
<li>DevOps proficiency with monitoring tools (Prometheus, Grafana) and serverless architectures.</li>
</ul>