Lead Site Reliability Engineer

Job in Toronto, Ontario, C6A, Canada
Listing for: RBC
Full Time position
Listed on 2026-02-28
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, IT Support, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100000 - 125000 CAD Yearly
Job Description & How to Apply Below

What is the opportunity?

Join our Commercial, Core Banking and Payments Technology (CCBPT) team as a Lead Site Reliability Engineer, where you'll play a key role in supporting our cloud and distributed environments for the SRE & Production Operations team. This exciting opportunity will challenge you to work with cutting‑edge technologies, including AI and emerging innovations, and collaborate closely with development teams to deliver embedded SRE solutions.

As a vital link between QE, DevOps, Development, Infrastructure and Support teams, you'll leverage your strong technical skills to solve complex problems and drive success across multiple components and technologies. If you're passionate about tackling new challenges and developing innovative solutions, we invite you to join our team and take your career to the next level.

Job Description

What will you do?
  • Manage a team of SREs
  • Automate, automate and automate – Identify, design and write automation procedures using AI, Ansible and other relevant technologies
  • Support applications running on multiple platforms including OpenShift and distributed systems
  • Design and implement Chaos Engineering experiments and Disaster Recovery procedures to test and validate system resilience and reliability
  • Establish and monitor SLOs and supporting SLIs for various applications
  • Develop and establish observability strategies for applications
  • Build and implement monitoring and alerting, anomaly detection, self‑healing and reliability testing for applications in scope
  • Provide leadership and technical support for developers and DevOps engineers
  • Support incident management and problem management for applications in scope, and take ownership of fulfilling RCA action items
  • Be an escalation point in the on‑call rotation, and support our maintenance, scheduled work, support, and release deployment requirements
What do you need to succeed?
Must‑have
  • 7+ years of experience as a Site Reliability Engineer
  • A Bachelor’s degree in Computer Science or related technical field or equivalent practical experience
  • Strong working knowledge of Kubernetes and Cloud, with experience in and understanding of CI/CD pipelines and DevOps / Agile methodology
  • Advanced knowledge of the following SRE practices and technologies:
    Shell scripting, OpenShift, Linux, Dynatrace, PagerDuty, Moog, Splunk, Elastic, Ansible, Grafana, Chaos Engineering, MQ, Kafka, Windows Servers, MS SQL Server, Mainframe technologies.
  • Ability to perform a production support role, including off‑hours support
  • Effective negotiation and stakeholder management skills
  • Excellent communication skills
Nice‑to‑have
  • Strong knowledge of AI and of building AI-based solutions
  • Knowledge of deploying and supporting distributed applications
  • In‑depth hands‑on experience with a variety of SRE tools (Ansible, Catchpoint)
  • Experience working as an SRE within the financial industry
What’s in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Work in a dynamic, collaborative, progressive, and high‑performing team
  • Opportunities to do challenging work in AI and emerging technologies
  • Opportunities to take on progressively greater accountabilities
  • Access to a variety of job opportunities across business and geographies

#TECHPJ

Job Skills

Agile Methodology, Automation, Cloud Management, Cloud Software, Dynatrace Administration, Dynatrace APM, Group Problem Solving, IT Automation, IT Systems Integration, Mainframe Technologies, Microsoft Cloud, Microsoft Windows, Organizational Leadership, Product Services, Red Hat OpenShift, Software Development Life Cycle (SDLC), SRE Observability, System Applications, System Integration Testing (SIT), Systems Software

Additional Job Details

Address: RBC…
