Role Overview
Collabera is hiring a Senior Data Engineer (Python - PySpark and AWS) for a contract role in Toronto, as part of its Data Engineering hiring. Full responsibilities, required qualifications, and the apply link are listed in the description below.
Job Description
Title: Senior Data Engineer
Client: Investments Industry
# of Openings: 1
Type: 6-Month Contract (High likelihood of extension)
Location: Toronto, ON
Work Model: 4 days/week onsite, Friday WFH
Pay Rate: $80-100/hr
Role Overview
- We are seeking a Senior Data Engineer (8-10+ years of experience) to support a large-scale data platform transformation within the Total Fund Management (TFM) team.
- This role will focus on migrating and modernizing existing Databricks-based pipelines to AWS (EMR Spark), with an initial lift-and-shift phase, followed by optimization and redesign into scalable, consumable data products.
- This is a highly autonomous, hands-on role requiring strong PySpark expertise, deep experience with distributed data systems, and the ability to navigate complex, multi-source datasets (including market and reference data vendors).
Day-to-Day Responsibilities
- Migrate existing Databricks-based Spark pipelines to AWS EMR (Spark)
- Perform a lift-and-shift of 50+ datasets, some of them highly complex and drawing on multiple data sources
- Refactor and optimize data pipelines for performance, scalability, and reliability
- Structure and store data using Parquet and Iceberg formats
- Improve and clean up legacy data pipelines built over several years
- Design data with a consumption-first mindset (e.g., partitioning strategies, access patterns, data usability)
- Collaborate with stakeholders to understand data requirements and translate them into scalable solutions
- Ensure production readiness including monitoring, orchestration, and deployment
- Work independently to drive delivery from design through implementation
Key Responsibilities
- Develop and optimize large-scale PySpark data pipelines
- Rebuild and enhance Spark workloads in AWS (EMR)
- Leverage tools such as Airflow, AWS Glue, and Lake Formation
- Handle parallel/distributed data processing workloads
- Improve system performance and data quality across pipelines
- Engage with business and technical stakeholders to align on data needs
- Own delivery with minimal oversight in a fast-paced environment
Must-Haves
- 8-10+ years of Data Engineering experience (senior-level profiles only)
- Strong hands-on expertise in Python and PySpark
- Deep experience with Apache Spark in distributed environments
- Proven experience working with large-scale, complex data pipelines
- Experience with Databricks (existing environment)
- Strong knowledge of Parquet and Iceberg data formats
- Experience with the AWS data ecosystem (EMR preferred)
- Familiarity with Airflow, Glue, and Lake Formation
- Strong understanding of parallel/distributed data processing
- Ability to work independently with strong problem-solving skills
- Experience in ambiguous environments with evolving requirements
Nice-to-Haves
- Prior experience in capital markets or investment management
- Experience working with market data / reference data vendors
- Experience designing data products and consumption layers
- Exposure to large-scale data platform migrations or transformations
We may use AI-enabled and/or automated tools to support parts of our recruitment process, including application screening, interview scheduling, and candidate communications. These tools are used to enhance consistency and efficiency. All hiring decisions involve human review and are not based solely on automated processing.
The Company offers a total rewards package in accordance with all applicable federal, provincial, and local laws and requirements. Benefit eligibility and offerings vary based on role, employment status, and work location. For contractor positions, benefits are limited to those entitlements and protections required by applicable law, which may include (as applicable) vacation pay, public holidays, leaves of absence, and other legally mandated benefits or payments.
Frequently Asked Questions
How do I apply for the Senior Data Engineer (Python - PySpark and AWS) position at Collabera?
Use the Apply button above to submit your application directly to Collabera. Most applications take less than 5 minutes if your resume and contact details are ready, and you'll be routed to the employer's official application system to finish.
Where is the Senior Data Engineer (Python - PySpark and AWS) position at Collabera located?
This position is based in Toronto, ON. The work model is 4 days per week onsite, with Fridays worked from home.
What does a Senior Data Engineer (Python - PySpark and AWS) at Collabera earn?
The posting lists $80-100/hr for this 6-month contract role, which has a high likelihood of extension. You can confirm the details during a recruiter screen.
When was the Senior Data Engineer (Python - PySpark and AWS) role at Collabera posted?
This role was posted on May 11, 2026 (36 days ago). It's still listed as actively hiring; we re-confirm openings against the source system multiple times per day and remove closed roles.
How much experience does the Senior Data Engineer (Python - PySpark and AWS) role at Collabera require?
This is a senior-level position. The posting requires 8-10+ years of Data Engineering experience (senior-level profiles only). Review the must-have qualifications in the description above closely before applying.