Platform Engineer (Remote, Canada)

Hookdeck Inc.
Full Time, mid-level
CA. Posted February 23, 2026

Job Description

We're looking for an experienced Platform Engineer to work on the core capabilities of Hookdeck. This role spans backend development, data engineering, and DevOps—you'll be responsible for building out our platform, adapting and optimizing for growth, and maintaining uptime and resilience.

You'll work on systems that process billions of events per month, ensuring reliability, scalability, and performance at every level. This is a hands‑on engineering role where you'll design, build, and operate critical infrastructure that thousands of companies depend on.

We are looking for experienced distributed systems developers who have worked on highly concurrent, event‑driven systems. You should be comfortable with the challenges of building and operating systems at scale, from low‑level performance optimization to high‑level system architecture.

About Hookdeck

Hookdeck provides event infrastructure that ensures reliable, scalable, and observable event ingestion and routing for modern applications. Acting as a gateway between webhook providers and your backend, Hookdeck prevents dropped events, mitigates system overload, and simplifies event management with queueing, throughput control, and monitoring. Our flagship product, the Event Gateway, serves thousands of companies and developers and processes billions of events per month.

We are a decentralized, developer‑centric team that values ownership, autonomy, and quality in every aspect of our work.

About You

You're an experienced engineer who has built and operated distributed systems at scale. You understand the complexities of highly concurrent, event‑driven architectures and have deep experience with the tools and patterns that make them reliable. You're comfortable working across the stack—from backend services to data pipelines to infrastructure—and you're excited about the challenge of building systems that process billions of events reliably.

You're a strong problem‑solver who can debug complex issues across distributed systems, optimize for performance and cost, and design systems that scale. You value reliability, observability, and maintainability, and you're always thinking about how to make systems more resilient.

Responsibilities

  • Backend Development — Build and maintain core services that handle event ingestion, routing, and delivery
  • Data Engineering — Design and maintain data pipelines, storage systems, and analytics infrastructure
  • DevOps & Infrastructure — Manage and optimize our cloud infrastructure, deployment pipelines, and monitoring systems
  • Platform Development — Build out new platform capabilities to support product growth and new features
  • Performance Optimization — Identify and optimize bottlenecks in our systems to improve throughput and reduce latency
  • Reliability & Resilience — Design and implement systems that maintain high uptime and gracefully handle failures
  • Observability — Build and maintain monitoring, alerting, and debugging tools to ensure system health
  • Scalability — Design systems that can scale horizontally to handle growth in traffic and data volume

Example Projects

  • Design and implement a new event routing system that can handle 10x our current throughput
  • Optimize our ClickHouse data pipeline to reduce query latency and storage costs
  • Implement a new caching layer using Redis to reduce database load
  • Optimize service memory usage by reducing duplication of payload data in memory
  • Build observability tools to help debug issues across our distributed system
  • Design and implement disaster recovery procedures for critical systems

What We're Looking For

  • 5+ years of experience building and operating distributed systems at scale
  • Deep experience with highly concurrent, event‑driven systems
  • Strong experience with one or more of: Kafka, GCP, Kubernetes, Node.js, Go, ClickHouse, Redis
  • Experience with distributed systems patterns (queues, pub/sub, event sourcing, etc.)
  • Strong understanding of system reliability, observability, and performance optimization
  • Comfortable with infrastructure as code and modern DevOps practices
  • Experience with data engineering and building
