DevOps & Backend Engineer — Mid/Senior

Open Insurance
Remote | Posted February 26, 2026

Job Description

InsurTech Platform | Engineering / Product Development

Location: Remote or Hybrid (if located in the US)

Employment Type: Contract — Full-Time

Department: Engineering / Product Development

Experience Level: Mid/Senior (4–7+ years)

Reports To: Director of Engineering

Role Overview

We are looking for a DevOps & Backend Engineer who can bridge the gap between platform infrastructure and application development. In this role, you will design and operate the cloud-native infrastructure that powers our InsurTech product suite while contributing directly to backend services built with TypeScript and Nest.js.

A critical dimension of this position is enabling and supporting our internal LLM and AI platform. You will build the infrastructure foundations that allow our AI team to train, serve, and scale custom large language models and AI-powered services, including GPU-accelerated workloads, model inference endpoints, high-throughput data pipelines, and the CI/CD automation that brings AI capabilities reliably into production across all products.

Key Responsibilities

Cloud Infrastructure & DevOps

Design, build, and maintain production-grade CI/CD pipelines (GitHub Actions, GitLab CI) with automated testing, security scanning, and progressive deployment strategies (blue-green, canary, feature flags).

Manage and optimize AWS infrastructure including EKS, EC2, RDS, ECR, S3, Lambda, CloudFront, Route 53, and IAM, with a focus on cost optimization, high availability, and disaster recovery.

Build and maintain Kubernetes clusters (EKS) with Helm charts, custom operators, autoscaling policies, and multi-environment management (dev, staging, production).

Automate infrastructure provisioning and configuration using Terraform (primary), Ansible, and CloudFormation with GitOps workflows and drift detection.

Implement comprehensive observability using Prometheus, Grafana, Datadog, ELK/OpenSearch, and distributed tracing (Jaeger/OpenTelemetry) for full-stack visibility.

Design and maintain networking architecture including VPCs, security groups, load balancers, service meshes (Istio/Linkerd), and DNS management.

AI/LLM Infrastructure Support

Provision and manage GPU-accelerated compute environments (AWS P4/P5 instances, Inferentia, SageMaker) for LLM training, fine-tuning, and inference workloads.

Build containerized model-serving infrastructure supporting vLLM, TGI (Text Generation Inference), NVIDIA Triton, and custom inference endpoints with autoscaling based on request load and latency targets.

Design and operate data pipelines and storage architectures (S3, EFS, FSx for Lustre) optimized for large-scale model training datasets and artifact management.

Implement CI/CD automation specifically for ML/AI workflows: model versioning, automated evaluation gates, staged rollouts of model updates, and A/B inference routing.

Collaborate with the AI team to optimize GPU utilization, manage spot instance strategies, and implement cost-aware scheduling for training jobs.

Set up monitoring dashboards for model inference latency, throughput, token usage, GPU utilization, and cost tracking.

Backend Development

Contribute to and extend backend services built with Nest.js and TypeScript, focusing on scalability, reliability, and clean architecture.

Develop the internal TypeScript framework.

Build and maintain scalable microservices and RESTful/GraphQL APIs that integrate with AI inference endpoints and the LLM Composer platform.

Design event-driven architectures using Kafka, SQS/SNS, and WebSockets for real-time data processing and AI-powered features.

Ensure all deployments are production-ready, horizontally scalable, and follow 12-factor app principles with proper health checks, graceful shutdowns, and circuit breakers.

Collaborate with backend and AI teams on system architecture, API contracts, database schema design, and reliability improvements.

Implement database management best practices including migration strategies, read replicas, connection pooling, and query optimization for PostgreSQL and Redis.

Required Skills & Qualifications

4–7+ years of professional experience in DevOps, Cloud Engineering, or Platform Engineering, with meaningful backend development experience.

Hands-on Kubernetes experience (EKS strongly preferred), including cluster administration, Helm chart development, autoscaling, and troubleshooting.

Strong proficiency with TypeScript and Nest.js (or a comparable Node.js backend framework such as Express or Fastify).

Deep AWS expertise across compute, storage, networking, IAM, and managed services, with experience optimizing for cost and performance.

Strong Infrastructure-as-Code skills with Terraform; experience with modular, reusable configurations and state management.

Solid understanding of microservices architecture, distributed systems patterns, and container orchestration.

Experience with Docker, container registries, and container security best practices.

Proficiency with CI/CD pipeline design including automated testing, security scanning, and deployment strategies.

Familiarity with GitOps workflows and version-controlled infrastructure management.

Strong Linux systems administration and shell scripting skills.

Preferred Qualifications (Nice to Have)

Experience provisioning and managing GPU workloads for ML/AI model training and inference in cloud environments.

Familiarity with ML model serving frameworks (vLLM, TGI, Triton, BentoML, SageMaker Endpoints).

Experience with Kafka, event-driven architectures, and real-time streaming systems.

Familiarity with service mesh technologies (Istio, Linkerd) and API gateway management.

Experience with HIPAA, SOC 2, or other healthcare/financial compliance frameworks in cloud environments.

Knowledge of database technologies beyond PostgreSQL, such as vector databases (Pinecone, pgvector), graph databases, or time-series databases.

Experience with chaos engineering, load testing, and reliability engineering practices (SRE).

AWS certifications (Solutions Architect, DevOps Engineer, or equivalent).

Technology Stack & Tools

Languages: TypeScript, JavaScript, Python, Bash, SQL, HCL (Terraform)

Backend: Nest.js, Node.js, Express, Fastify, GraphQL

Cloud (AWS): EKS, EC2, RDS, S3, Lambda, ECR, CloudFront, SageMaker, IAM, KMS

Containers & Orchestration: Docker, Kubernetes, Helm, Kustomize, ArgoCD

IaC & Config: Terraform, Ansible, CloudFormation, Pulumi

CI/CD: GitHub Actions, GitLab CI, CodePipeline, semantic-release

AI/ML Infra: vLLM, TGI, Triton, SageMaker, GPU instances (P4/P5/Inferentia)

Monitoring: Prometheus, Grafana, Datadog, ELK/OpenSearch, OpenTelemetry, Jaeger

Data & Messaging: PostgreSQL, Redis, Kafka, SQS/SNS, S3, DynamoDB

Security: Vault, SOPS, OPA, Trivy, Snyk, AWS Security Hub

What We Offer

A high-impact role at the intersection of infrastructure, backend development, and cutting-edge AI platform engineering.

Opportunity to build the infrastructure backbone powering enterprise AI and LLM capabilities.

Direct collaboration with AI, backend, and product teams across multiple verticals (telemedicine, InsurTech, and analytics).

Competitive contract compensation commensurate with experience.

Access to modern cloud infrastructure, GPU resources, and industry-leading tooling.

Job Type: Contract

Pay: From $4,000.00 per month

Work Location: Remote
