Data Engineer to help manage data pipelines across multiple structured and unstructured sources

FreelanceJobs
Full Time · Mid-level
CA · Posted February 26, 2026

Job Description

About Us

We are building advanced property intelligence products powered by AI, geospatial datasets and real-world asset data.

Our platform ingests multiple public and proprietary data sources and transforms them into structured, standardised, usable intelligence for insurers, lenders and homeowners.

We are now looking for a highly capable Data Engineer to take ownership of sourcing, normalising, maintaining and automating our expanding data infrastructure.

Role Overview

You will be responsible for managing data pipelines across multiple structured and unstructured sources, ensuring data consistency, quality and usability across our system.

This role combines:

Data engineering

API integrations

Web crawling and scraping

Data normalisation and transformation

AI-assisted coding (Claude Code essential)

Ongoing data maintenance and optimisation

You will play a critical role in ensuring that the data powering our products is accurate, scalable and production-ready.

Key Responsibilities

  • Data Ingestion & Integration

Identify, access and integrate data from APIs, open datasets and third-party providers

Build and maintain automated data ingestion pipelines

Design scalable workflows for ongoing updates
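
To give a concrete flavour of the work, here is a minimal sketch of an automated ingestion pipeline of the kind described above, assuming a paginated JSON API and a local MongoDB instance; the endpoint, database and field names are hypothetical:

```python
import requests
from pymongo import MongoClient

# Hypothetical source endpoint and database names -- placeholders only.
SOURCE_URL = "https://api.example.com/v1/properties"
client = MongoClient("mongodb://localhost:27017")
collection = client["property_intel"]["raw_properties"]

def ingest(page_size: int = 100) -> int:
    """Pull every record from the source API, upserting by source ID."""
    ingested, page = 0, 1
    while True:
        resp = requests.get(SOURCE_URL,
                            params={"page": page, "per_page": page_size},
                            timeout=30)
        resp.raise_for_status()
        records = resp.json().get("results", [])
        if not records:
            break
        for record in records:
            # Upsert keyed on the source's own identifier, so scheduled
            # reruns refresh existing documents instead of duplicating them.
            collection.update_one({"source_id": record["id"]},
                                  {"$set": record}, upsert=True)
            ingested += 1
        page += 1
    return ingested

if __name__ == "__main__":
    print(f"Ingested {ingest()} records")
```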

  • Data Normalisation & Structuring

Clean, transform and standardise disparate datasets into unified schemas

Resolve inconsistencies in formatting, units, classifications and identifiers

Create mapping logic between different source taxonomies

Ensure compatibility with internal data models
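
As an illustration of the mapping logic between a source taxonomy and an internal schema, here is a small sketch; the categories, field names and units are assumed for the example:

```python
# Hypothetical mapping from one source's property-type labels onto an
# internal taxonomy; in practice there would be one map per source.
PROPERTY_TYPE_MAP = {
    "detached_house": "detached",
    "semi": "semi_detached",
    "flat/maisonette": "flat",
    "terraced house": "terraced",
}

SQFT_PER_SQM = 10.7639

def normalise(raw: dict) -> dict:
    """Map one raw source record onto the unified internal schema."""
    prop_type = raw.get("propertyType", "").strip().lower()
    area_sqft = raw.get("floorAreaSqFt")
    return {
        "property_type": PROPERTY_TYPE_MAP.get(prop_type, "unknown"),
        # Standardise on metric units internally.
        "floor_area_sqm": round(area_sqft / SQFT_PER_SQM, 2) if area_sqft else None,
        # Strip formatting noise from identifiers such as postcodes.
        "postcode": raw.get("postCode", "").replace(" ", "").upper() or None,
    }

assert normalise({"propertyType": "Semi", "floorAreaSqFt": 1076.39,
                  "postCode": "sw1a 1aa"}) == {
    "property_type": "semi_detached",
    "floor_area_sqm": 100.0,
    "postcode": "SW1A1AA",
}
```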

  • Web Crawlers & Scrapers

Design, build and maintain robust crawlers

Handle pagination and rate limiting

Monitor and update crawlers as source websites evolve

Implement error handling and logging
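
A minimal sketch of a polite crawler covering the points above (pagination, rate limiting, error handling and logging); the target URL and delay are placeholders:

```python
import logging
import time

import requests

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("crawler")

BASE_URL = "https://listings.example.com/search"  # hypothetical target
REQUEST_DELAY = 1.0  # seconds between requests: a crude rate limit

def crawl(max_pages: int = 50):
    """Yield (page_number, html) pairs, logging failures instead of crashing."""
    session = requests.Session()
    session.headers["User-Agent"] = "data-pipeline-bot/0.1"
    for page in range(1, max_pages + 1):
        try:
            resp = session.get(BASE_URL, params={"page": page}, timeout=30)
            resp.raise_for_status()
        except requests.RequestException as exc:
            # Source sites evolve and fail; record it and move on.
            logger.warning("page %d failed: %s", page, exc)
            continue
        yield page, resp.text
        time.sleep(REQUEST_DELAY)  # stay polite between requests

for page, html in crawl(max_pages=3):
    logger.info("fetched page %d (%d bytes)", page, len(html))
```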

  • Database Management

Design and maintain database schemas

Optimise queries for performance

Ensure data integrity and version control

Work with MongoDB databases
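
One way to back the integrity and performance points above with MongoDB, sketched with pymongo; the database name, fields and validation rules are illustrative:

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["property_intel"]  # hypothetical database name

# A unique index on the source identifier enforces integrity for upserts.
db["properties"].create_index([("source_id", ASCENDING)], unique=True)

# A compound index supporting an assumed common lookup pattern.
db["properties"].create_index([("postcode", ASCENDING),
                               ("property_type", ASCENDING)])

# Optional JSON-schema validation rejects malformed documents at write time.
db.command("collMod", "properties", validator={
    "$jsonSchema": {
        "required": ["source_id", "postcode"],
        "properties": {
            "source_id": {"bsonType": "string"},
            "floor_area_sqm": {"bsonType": ["double", "null"]},
        },
    }
})
```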

  • API Development & Maintenance

Integrate third-party APIs

Maintain and monitor API reliability

Develop internal APIs where required

Manage authentication, rate limits and error handling
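
A sketch of the authentication, rate-limit and error-handling side of a third-party API integration; the base URL, environment variable and bearer scheme are assumptions:

```python
import os
import time

import requests

API_BASE = "https://api.provider.example.com"  # hypothetical provider
API_KEY = os.environ.get("PROVIDER_API_KEY", "")

def get_with_retry(path: str, params: dict | None = None,
                   max_retries: int = 5) -> dict:
    """GET with bearer auth, honouring 429 responses with backoff."""
    headers = {"Authorization": f"Bearer {API_KEY}"}
    url = f"{API_BASE}{path}"
    for attempt in range(max_retries):
        resp = requests.get(url, params=params, headers=headers, timeout=30)
        if resp.status_code == 429:
            # Respect the server's Retry-After hint, else back off exponentially.
            time.sleep(float(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError(f"gave up on {url} after {max_retries} rate-limit retries")
```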

  • AI-Assisted Development (Essential)

Use Claude Code as part of development workflow

Write, refine and maintain prompts for code generation

Implement AI-assisted debugging and optimisation

Maintain structured prompt libraries for repeatable engineering tasks
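
One plausible shape for a structured prompt library kept in version control; the task names and template text are invented for illustration:

```python
# Named, parameterised templates for repeatable engineering tasks, so the
# same prompt is refined in one place rather than retyped per session.
PROMPTS = {
    "write_crawler": (
        "Write a Python crawler for {url}. Handle pagination via {pagination}, "
        "wait {delay}s between requests, and log failures instead of raising."
    ),
    "normalise_schema": (
        "Given these sample records:\n{sample}\nwrite a function mapping them "
        "onto this internal schema:\n{schema}\nFlag any fields you cannot map."
    ),
}

def render(task: str, **kwargs: str) -> str:
    """Fill a named template; a KeyError surfaces missing parameters early."""
    return PROMPTS[task].format(**kwargs)

print(render("write_crawler", url="https://listings.example.com",
             pagination="a ?page=N query parameter", delay="1"))
```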

  • Data Quality & Monitoring

Build validation scripts

Create automated anomaly detection checks

Implement logging and monitoring systems

Maintain documentation of data sources and transformation logic
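
A minimal sketch of the validation and anomaly-detection layer; the required fields, plausibility range and z-score threshold are assumptions:

```python
import logging
from statistics import mean, stdev

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("quality")

REQUIRED_FIELDS = ("source_id", "postcode")  # assumed minimal schema

def validate(record: dict) -> list[str]:
    """Return a list of problems with one record (empty means valid)."""
    problems = [f"missing {f}" for f in REQUIRED_FIELDS if not record.get(f)]
    area = record.get("floor_area_sqm")
    if area is not None and not 1 <= area <= 100_000:
        problems.append("floor_area_sqm outside plausible range")
    return problems

def volume_anomaly(history: list[int], today: int, z: float = 3.0) -> bool:
    """Flag today's ingest volume if it sits more than z sigmas from history."""
    if len(history) < 5 or stdev(history) == 0:
        return False  # not enough signal to judge
    return abs(today - mean(history)) > z * stdev(history)

if volume_anomaly([980, 1010, 995, 1003, 990, 1001], today=120):
    logger.warning("ingest volume anomaly: got 120 rows, well below history")
```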

Required Skills & Experience

Technical

Strong Python experience (essential)

Experience building and maintaining web crawlers

Deep understanding of APIs (REST, authentication, rate limiting)

Experience with NoSQL databases

Experience with data cleaning and normalisation techniques

Familiarity with Git and version control

Experience with Docker or containerised workflows

Essential

Hands-on experience using Claude Code in a production workflow

Ability to structure and optimise AI-assisted development processes

Strong problem-solving and debugging capability

Desirable

Experience working with geospatial data

Experience with large UK public datasets (e.g. ONS, Land Registry, EPC)

Experience with cloud environments (AWS)

We anticipate this role being more intensive at the outset, requiring a concentrated effort over the first few days to establish and structure the data pipelines.

Following this initial phase, the role would transition to a smaller, ongoing monthly allocation of hours to maintain, monitor and update the data.

Please do not apply for this role using AI-generated content alone. Applications that appear to be entirely AI-produced, without genuine personal input or relevant detail, will be automatically discounted.

Contract duration of less than 1 month, with 30 hours per week.

Mandatory skills:

Python, Data Scraping, API Development, Oracle NoSQL Database, Git, Claude Code, Web Crawling, Docker, Data Transformation, Data Preprocessing
