Python Developer: CMS-Agnostic Web Crawler

FreelanceJobs
CA. Posted March 7, 2026

Job Description

The core of this project is to build an automated pipeline that can analyze how digital assets are structured and verified across various e-commerce platforms.

Key Responsibilities:

CMS-Agnostic Crawling:

Build/configure a crawler (using tools like Crawl4AI, Playwright, or self-hosted headless browsers) that extracts high-res media assets from any URL regardless of platform (Shopify, Magento, Custom, etc.).

Binary Metadata Inspection:

Integrate open-source libraries (Python/Rust) to inspect image binaries for C2PA/JUMBF headers and cryptographic signatures.

Multimodal AI Logic:

Implement an AI reasoning layer using Gemini 1.5 Flash to perform visual analysis on extracted assets based on specific prompts.

Automated Reporting:

Develop backend logic that aggregates findings and generates a professional PDF report (using PDFKit or similar).
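To give applicants a sense of the Binary Metadata Inspection step, here is a minimal sketch, not a reference implementation. It assumes the asset is a JPEG and that a C2PA manifest, if present, is carried as JUMBF data in APP11 marker segments. The function names `find_jumbf_segments` and `looks_like_c2pa` are hypothetical, and the check is only a presence heuristic: it does not parse the manifest or verify any cryptographic signature, which would be left to a dedicated C2PA library.

```python
import struct

APP11 = 0xEB  # JPEG marker used for JUMBF packets
SOS = 0xDA    # start of scan: compressed image data follows
EOI = 0xD9    # end of image


def find_jumbf_segments(data: bytes) -> list[bytes]:
    """Return payloads of APP11 segments that look like JUMBF packets.

    Heuristic (assumption): a payload counts as JUMBF if it starts with
    the "JP" identifier and contains the "jumb" box type.
    """
    if data[:2] != b"\xff\xd8":  # missing SOI marker, so not a JPEG
        return []
    payloads = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync with the marker structure
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in (SOS, EOI):
            break  # metadata segments precede the scan data
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            break  # malformed segment length
        # The length field counts its own two bytes but not the marker.
        payload = data[pos + 4:pos + 2 + length]
        if marker == APP11 and payload[:2] == b"JP" and b"jumb" in payload:
            payloads.append(payload)
        pos += 2 + length
    return payloads


def looks_like_c2pa(data: bytes) -> bool:
    """True if any JUMBF-looking segment mentions the "c2pa" label."""
    return any(b"c2pa" in p for p in find_jumbf_segments(data))
```

In a pipeline, this kind of check would run on the downloaded bytes of each extracted image to decide which assets are worth passing to full manifest validation.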

Technical Requirements:

Expertise in Python (asyncio, HTTPX).

Experience with advanced web scraping (bypassing bot detection, handling JS-heavy sites).

Knowledge of Digital Asset Metadata (EXIF, XMP, and specifically JUMBF/C2PA).

Experience with Large Language Model APIs (specifically multimodal vision capabilities).

Familiarity with n8n or a similar workflow orchestration tool is a huge plus.
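For the asyncio requirement, the pattern I have in mind is bounded concurrent fetching. The sketch below is illustrative only and uses the standard library alone. `fetch_all` and its `fetch` parameter are hypothetical names; in the real pipeline `fetch` would presumably wrap an HTTPX async client call, which is an assumption and not shown here.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_all(
    urls: list[str],
    fetch: Callable[[str], Awaitable[bytes]],
    max_concurrency: int = 5,
) -> dict[str, object]:
    """Fetch every URL with at most `max_concurrency` requests in flight.

    Returns a dict mapping each URL to its bytes, or to the exception
    raised for it, so one failing asset does not abort the whole crawl.
    """
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str):
        async with sem:  # wait here while the limit is reached
            try:
                return url, await fetch(url)
            except Exception as exc:  # keep the error for reporting
                return url, exc

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)
```

Capping concurrency this way keeps the crawler polite toward the target site, and returning errors as values lets the reporting step list assets that could not be retrieved.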

How to Apply

Please start your proposal with the word "Provenance" so I know you've read the requirements. Briefly describe your experience with digital asset metadata or building CMS-agnostic crawlers.

Contract duration of 1 to 3 months.

Mandatory skills:

Python, JavaScript, API, Make Build Script, C2PA
