Site Reliability Engineer (SRE) – Observability
Astra North Infoteck Inc.
Job Description: Site Reliability Engineer (SRE) – Observability
Toronto - Hybrid (1-2 days in office)
Role Summary
We are looking for an Observability Engineer to help implement, operate, and improve observability capabilities across our applications and platforms. This role focuses on hands-on onboarding, instrumentation, dashboarding, and alerting, working under established standards and with guidance from senior engineers.
You will collaborate with application, SRE, and operations teams to ensure systems are observable, supportable, and production-ready.
Key Responsibilities
Observability Implementation
- Implement and maintain metrics, logs, and traces for applications and infrastructure
- Assist with onboarding applications into observability platforms (e.g., Dynatrace, ELK, Datadog)
- Configure dashboards, alerts, and basic anomaly detection
Application Support & Instrumentation
- Work with development teams to enable structured logging, basic distributed tracing, and core metrics
- Validate observability requirements during Production Readiness Reviews (PRR)
- Troubleshoot missing or low-quality telemetry
Monitoring & Alerting
- Configure alerts based on golden signals (latency, errors, traffic, saturation)
- Help reduce alert noise by tuning thresholds and alert logic
- Support incident response by gathering logs, metrics, and traces
Operations & Reliability
- Support root cause analysis using observability tools
- Maintain dashboards and documentation used by on-call and support teams
- Participate in on-call rotations (as applicable)
Automation & Continuous Improvement
- Assist in automating observability onboarding and validation tasks
- Create and maintain reusable dashboards and alert templates
- Follow established observability standards and best practices
Required Qualifications
- 2–4 years of experience in observability or SRE
- Working knowledge of metrics, logs, and basic tracing concepts
- Hands-on experience with at least one observability platform (Dynatrace, Elastic/ELK, Datadog, New Relic, etc.)
- Basic understanding of SLIs/SLOs and service health indicators
- Experience with cloud platforms or hybrid environments
- Ability to write scripts (Python, Bash, PowerShell) for automation and troubleshooting
Preferred Qualifications
- Experience with OpenTelemetry or APM agents
- Familiarity with Kubernetes or containerized workloads
- Experience working with incident management tools (PagerDuty, ServiceNow)
- Exposure to Dynatrace, Kibana/ELK, or similar cloud-native monitoring tools
- Experience in regulated or enterprise environments