Site Reliability Engineer (SRE) – Observability

Astra North Infoteck Inc.
Full Time, Junior, Hybrid
Toronto, Ontario, CA
Posted March 5, 2026

Job Description

Job Description: Site Reliability Engineer (SRE) – Observability

Toronto - Hybrid (1-2 days in office)

Role Summary

We are looking for an Observability Engineer to help implement, operate, and improve observability capabilities across our applications and platforms. This role focuses on hands-on onboarding, instrumentation, dashboarding, and alerting, working under established standards and guidance from senior engineers.

You will collaborate with application, SRE, and operations teams to ensure systems are observable, supportable, and production-ready.

Key Responsibilities

Observability Implementation

  • Implement and maintain metrics, logs, and traces for applications and infrastructure
  • Assist with onboarding applications into observability platforms (e.g., Dynatrace, ELK, Datadog)
  • Configure dashboards, alerts, and basic anomaly detection

Application Support & Instrumentation

  • Work with development teams to enable structured logging, basic distributed tracing, and core metrics
  • Validate observability requirements during Production Readiness Reviews (PRR)
  • Troubleshoot missing or low-quality telemetry

Monitoring & Alerting

  • Configure alerts based on golden signals (latency, errors, traffic, saturation)
  • Help reduce alert noise by tuning thresholds and alert logic
  • Support incident response by gathering logs, metrics, and traces

Operations & Reliability

  • Support root cause analysis using observability tools
  • Maintain dashboards and documentation used by on-call and support teams
  • Participate in on-call rotations (as applicable)

Automation & Continuous Improvement

  • Assist in automating observability onboarding and validation tasks
  • Create and maintain reusable dashboards and alert templates
  • Follow established observability standards and best practices

Required Qualifications

  • 2–4 years of experience in observability or SRE
  • Working knowledge of metrics, logs, and basic tracing concepts
  • Hands-on experience with at least one observability platform (Dynatrace, Elastic/ELK, Datadog, New Relic, etc.)
  • Basic understanding of SLIs/SLOs and service health indicators
  • Experience with cloud platforms or hybrid environments
  • Ability to write scripts (Python, Bash, PowerShell) for automation and troubleshooting

Preferred Qualifications

  • Experience with OpenTelemetry or APM agents
  • Familiarity with Kubernetes or containerized workloads
  • Experience working with incident management tools (PagerDuty, ServiceNow)
  • Exposure to Dynatrace, Kibana/ELK, or similar cloud-native monitoring tools
  • Experience in regulated or enterprise environments
