
Senior SRE: AI/ML HPC Infra & GPU Cluster

Job in Toronto, Ontario, C6A, Canada
Listing for: Boson AI
Full Time position
Listed on 2026-01-10
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, AI Engineer, SRE/Site Reliability
Job Description & How to Apply Below
A technology-driven AI company is seeking a Site Reliability Engineer to manage and optimize its advanced GPU cluster in Toronto. You will plan, deploy, and operate HPC infrastructure while working closely with engineering teams. Ideal candidates have a strong foundation in Linux systems and Kubernetes, significant experience in SRE or HPC operations, and scripting skills in Python and Bash.

This full-time role offers a dynamic work environment in AI technologies.
Position Requirements
10+ years of work experience