Site Reliability Engineer (SRE)

Epsilon Solutions LTD
Full Time, mid level
CA, posted March 6, 2026

Job Description

Site Reliability Engineer (SRE) – Azure, Kubernetes, CI/CD & Application Migration Support

6-12 Months

Video Interview

Remote

Role Overview

We are looking for a Site Reliability Engineer (SRE) with strong expertise in Azure, Kubernetes, CI/CD, and application migration support to ensure the reliability, scalability, and operational readiness of production applications.

This role supports a multi-cloud environment, with Azure as the primary cloud and AWS/GCP as secondary or optional, and focuses on the SRE Applications layer.

Key Responsibilities

  • Design, implement, and support CI/CD pipelines for containerized applications on Azure Kubernetes platforms.
  • Embed reliability, security, and governance controls into CI/CD workflows.
  • Improve deployment reliability and enable safe, repeatable releases.
  • Provide application migration support for workloads moving to Azure-based Kubernetes environments.
  • Support migration runbooks, validation, production readiness reviews (PRRs), and post-migration stabilization.
  • Operate and support Kubernetes workloads on Azure (AKS) ensuring high availability, scalability, and resilience.
  • Define, implement, and track SLOs and SLAs; monitor service health using SLIs.
  • Perform capacity forecasting and quota management, and proactively identify scaling risks.
  • Execute application benchmarking and drive continuous optimization across performance, reliability, and cost.
  • Participate in incident response, root cause analysis, and continuous improvement.
  • Follow and contribute to Operations SOPs (Application, Platform, Network, Security).

Required Skills & Experience

  • Experience in SRE, DevOps, or Production Operations roles.
  • Strong understanding of SLO, SLA, and reliability engineering principles.
  • Strong experience with Microsoft Azure and Azure Kubernetes Service (AKS).
  • Hands-on experience operating Kubernetes in production environments.
  • Strong experience designing and supporting CI/CD pipelines for containerized applications.
  • Experience with observability tools such as Dynatrace, Prometheus, and Grafana.
  • Familiarity with artifact repositories (e.g., Nexus).
  • Exposure to security and vulnerability tools (e.g., Qualys, Orca).
  • Working knowledge of AWS and/or GCP is a plus.

Thanks & Regards

Aman Sharma

Direct Number: 408-459-7174

Email: aman.s@epsilonsolutions.ca

Mississauga, Canada
