Site Reliability Engineering Lead
Royal Bank of Canada
Job Description
What is the opportunity?
The Lead Site Reliability Engineer will be responsible for spearheading the development and implementation of Site Reliability Engineering solutions for all applications within City National Bank (CNB), an RBC company. To succeed in its mandate, this team must work collaboratively with teams across several lines of business and with other Technology and Operations partners. This individual will need advanced knowledge and experience working in an application development, support and/or technology operations organization. They should be able to take on a production support role and partner with the SRE team in Commercial Banking.
What will you do?
- Own reliability for critical services, including availability, latency, performance and resilience.
- Design and maintain observability solutions (metrics, logs, traces, dashboards, alerts) with a strong signal-to-noise focus.
- Build and maintain automation for operational tasks (self-healing, runbooks, remediation, deployments, and diagnostics).
- Partner with engineering teams to enhance reliability through design reviews, production readiness reviews, and failure-mode analysis.
- Define, implement and operationalize SLIs, SLOs and error budgets aligned to business expectations.
- Drive blameless postmortems, identify systemic issues, and ensure corrective actions are implemented and tracked.
- Improve change management practices to reduce deployment risk.
- Perform capacity planning, load testing, and performance analysis to prevent incidents before they occur.
- Contribute to DR and resilience strategies.
- Mentor junior engineers and help establish consistent SRE standards and best practices across teams.
What do you need to succeed?
Must-have
- Bachelor’s degree in Computer Engineering, Computer Science or equivalent practical experience.
- 6+ years of related experience in SRE, DevOps and/or production engineering roles.
- Advanced knowledge of industry standard SRE best practices.
- Hands-on experience operating highly available systems in production.
- Proficiency in at least one programming or scripting language (Python, Go, Bash, PowerShell).
- Advanced experience in a variety of environments (Linux, Windows, Databases, Cloud, and Services/APIs).
- Hands-on experience designing, operating, and troubleshooting message queue-based systems (e.g., Kafka, RabbitMQ, ActiveMQ, cloud-managed services).
- Experience supporting and operating API platforms and gateways (e.g., Apigee, MuleSoft).
- Deep experience with monitoring and alerting systems, and OpenTelemetry or unified telemetry pipelines.
- Experience with CI/CD pipelines and deployment automation.
- Solid understanding of networking, load balancing, DNS and TLS fundamentals.
- Excellent communication skills and a direct style.
Nice-to-have
- Experience with cloud platforms (AWS, Azure) and cloud-native architectures.
- Experience with containers and orchestration platforms (e.g., Kubernetes, ECS, AKS).
- Familiarity with infrastructure as code (e.g., Terraform, CloudFormation).
- Experience integrating reliability tooling with ticketing, paging, and incident management systems.
- Consumer banking experience.
What’s in it for you?
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
- A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable.
- Leaders who support your development through coaching and managing opportunities.
- Ability to make a difference and lasting impact.
- Work in a dynamic, collaborative, progressive, and high-performing team.
- A world-class training program in financial services.
#LI-POST
#TECHPJ
Job Skills
Agile Methodology, Group Problem Solving, IT Systems Integration, Organizational Leadership, Product Services, Software Development Life Cycle (SDLC), System Applications, System Integration Testing (SIT), Systems Software
Additional Job Details
Address: RBC CENTRE, 155 WELLINGTON ST W, TORONTO
City: Toronto
Country: Canada
Work hours/week: 37.5
Employment Type: Full time
Platform: TECHNOLOGY AND OPERATIONS
Job Type: Regular
Pay Type: Salaried
Posted Date: 2026-02-11
Application Deadline: 2026-03-11
Note: Applications will be accepted until 11:59 PM on the day prior to the application deadline date above.
Inclusion and Equal Opportunity Employment
At RBC, we believe an inclusive workplace that has diverse perspectives is core to our continued growth as one of the largest and most successful banks in the world. Maintaining a workplace where our employees feel supported to perform at their best, effectively collaborate, drive innovation, and grow professionally helps to br