
Senior Site Reliability Engineer

Remote / Online - Candidates ideally in
Toronto, Ontario, M5A, Canada
Listing for: Bamboo Rose
Remote/Work from Home position
Listed on 2026-02-28
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Cloud Computing, Systems Engineer
Job Description & How to Apply Below
At Bamboo Rose, we’re building the world’s leading collaborative planning, product development, and supply chain platform for global retail. Our technology helps retailers and brands bring great products to market faster, smarter, and more sustainably. We value curiosity, innovation, and solving real problems across global supply chains.

About the Role
We’re looking for an experienced Senior SRE to help design, implement, and scale reliability practices across our systems and infrastructure. This role blends hands‑on engineering with strong ownership, collaboration, and influence, and plays a key part in establishing SRE ways of working as the organization continues to mature its reliability posture.

What You’ll Do

Design, implement, and evolve reliability practices aligned with SRE principles across the TotalPLM stack.

Monitor, analyze, and optimize system performance, availability, and reliability.

Build and improve automation to reduce toil and increase operational efficiency.

Partner with Software Development and Customer Support teams to ensure reliable delivery and operation of services.

Define, track, and analyze SLI/SLO metrics.

Participate in incident response, post‑incident reviews, and root cause analysis, driving remediation.

Contribute to defining and rolling out SRE standards, patterns, and best practices.

Mentor and support junior Site Reliability Engineers through knowledge sharing and hands‑on guidance.

Manage and drive technical projects related to reliability, automation, and infrastructure improvements.

What You Bring

6+ years of progressive experience in SRE, DevOps, Platform, or equivalent roles.

Demonstrated ownership of production systems and experience operating in on‑call environments.

Strong knowledge of DevOps practices, GitOps, CI/CD pipelines, and Infrastructure as Code (IaC) tools.

Experience with automation tools such as Ansible, Puppet, or equivalent; monitoring and observability tools like New Relic, Datadog, and Prometheus/Grafana; and incident management.

Strong project execution skills, with the ability to drive initiatives from idea through delivery.

Clear communication skills and the ability to work effectively with both technical and non‑technical stakeholders.

A mindset of accountability, continuous improvement, and learning.

Why You’ll Love Working Here

You’ll help shape and define SRE practices, not just follow them.

You’ll work on meaningful reliability challenges that directly impact the business.

You’ll have the opportunity to mentor others and grow your technical leadership skills.

You’ll collaborate with engaged engineering partners who value reliability and operational excellence.

You’ll be empowered to innovate, automate, and improve how systems are built and operated.

A chance to work on technology that makes global retail more connected, sustainable, and resilient.

Competitive compensation and benefits, with flexibility for remote work.

Position Requirements
10+ Years work experience