
Backend Engineer

PolarGrid
Full Time · Mid-level
Manitoba, CA · Posted 10 weeks ago


Job Description

Role Overview

We're seeking a Backend Engineer to build and scale our edge inference infrastructure. You'll architect distributed compute systems handling GPU-accelerated AI workloads across edge nodes with sub-10ms latency requirements.

Core Responsibilities

Infrastructure Engineering

  • Design and implement Kubernetes-native distributed compute platforms
  • Build GPU resource management and allocation systems
  • Develop edge deployment pipelines with automated testing
  • Create high-performance inference serving infrastructure

Backend Systems

  • Architect microservices for distributed model serving
  • Implement API gateways with OpenAI- and Hugging Face-compatible endpoints
  • Build dynamic resource allocation and load balancing
  • Design multi-backend systems with mutual exclusivity enforcement

Performance & Optimization

  • Optimize GPU memory utilization and inference latency
  • Implement streaming inference with TensorRT acceleration
  • Build comprehensive monitoring and observability systems
  • Design automatic scaling based on workload patterns
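For candidates unfamiliar with workload-driven autoscaling, the idea behind the last bullet can be sketched in a few lines. This is an illustrative Python sketch only, not part of the posting; the function name and parameters are assumptions, loosely mirroring the proportional rule the Kubernetes HorizontalPodAutoscaler applies (desired replicas scale with observed load over a per-replica target):

```python
import math

def desired_replicas(observed_rps: float,
                     target_rps_per_replica: float,
                     min_replicas: int = 1,
                     max_replicas: int = 64) -> int:
    """Pick a replica count so each replica handles roughly
    target_rps_per_replica requests/sec, clamped to [min, max]."""
    if observed_rps <= 0:
        return min_replicas
    raw = math.ceil(observed_rps / target_rps_per_replica)
    return max(min_replicas, min(max_replicas, raw))
```

In practice this decision would be driven by metrics such as GPU utilization or queue depth rather than raw request rate, but the proportional shape of the rule is the same.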

Required Technical Skills

Core Infrastructure

Kubernetes: Production experience with cluster management, resource allocation, networking

Containerization: Docker, container security, multi-stage builds, optimization

Distributed Systems: Service mesh, load balancing, distributed consensus, fault tolerance

Cloud: GitOps, infrastructure as code, AWS, AWS CDK

Backend Development

Languages: TypeScript, Go, Python, or Rust

APIs: RESTful services, gRPC, WebSocket streaming, rate limiting

Databases: Distributed databases, caching systems, data consistency

Message Queues: Kafka, Redis, SQS, distributed event systems

AI Inference Infrastructure

GPU Computing: NVIDIA CUDA, TensorRT, GPU memory management

AI/ML Serving: Triton Inference Server, model optimization, batch processing

Performance: Latency optimization, throughput tuning, resource profiling

Preferred Experience

Infrastructure Platforms

  • Edge computing deployments
  • Multi-region distributed systems
  • Hardware acceleration (GPUs)
  • Container security (Kata, gVisor)

Monitoring & Operations

  • Prometheus, Grafana, distributed tracing
  • SRE practices, incident response
  • Capacity planning, cost optimization
  • Automated testing and deployment

What You'll Build

Edge Inference Platform

  • Multi-tenant GPU inference clusters serving 10,000+ concurrent requests
  • Sub-10ms latency requirements with geographic distribution
  • Automatic model loading and resource optimization
  • Comprehensive health monitoring and alerting

Backend Architecture

  • Microservices handling model lifecycle management
  • API gateway with authentication and rate limiting
  • Dynamic backend switching (Python/TensorRT-LLM)
  • Streaming inference with WebSocket support
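As context for the rate-limiting work mentioned above, a token bucket is the classic gateway-side mechanism. The sketch below is illustrative only (class and parameter names are assumptions, not PolarGrid's implementation): each client gets a bucket of `capacity` burst tokens refilled at `rate` tokens per second, and a request is admitted only if a token is available.

```python
import time

class TokenBucket:
    """Minimal per-client token-bucket rate limiter (illustrative sketch).

    `now` is injectable so the refill clock can be faked in tests.
    """
    def __init__(self, rate: float, capacity: float, now=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full
        self.now = now
        self.last = now()

    def allow(self) -> bool:
        """Spend one token if available; otherwise reject the request."""
        t = self.now()
        self.tokens = min(self.capacity,
                          self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A production gateway would typically keep these counters in a shared store such as Redis (listed in the stack above) so limits hold across gateway replicas.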

DevOps Infrastructure

  • Kubernetes operators for inference workload management
  • Automated testing covering performance and reliability
  • GitOps deployment with rollback capabilities
  • Cloud and edge resource monitoring and cost optimization

About PolarGrid


Backend · On-site
