Location: Southwestern Ontario
Platform / Site Reliability Engineer (SRE)
Our client is transforming industries through cutting-edge technology. Their platform leverages AI, automation, and scalable systems to solve complex real-world problems.
As a Platform / Site Reliability Engineer (SRE), you will play a key role in establishing and enhancing the engineering platform. You’ll help ensure the reliability, scalability, and efficiency of our systems while developing tools that improve engineering productivity.
You will help define and shape the platform strategy, set best practices, and drive initiatives that enhance developer experience, system performance, and operational efficiency.
What You’ll Be Doing
- DevOps & Infrastructure: Design, implement, and maintain scalable infrastructure to support engineering needs.
- CI/CD Optimization: Improve continuous integration and deployment pipelines using AWS CDK, including requirements for deployment and database migration tooling.
- Release Tracking & Deployment: Establish visibility into release cycles, implement automation to streamline deployments, and ensure smooth rollouts.
- Site Reliability & Observability: Implement monitoring, logging, and alerting systems to ensure high availability and performance.
- Internal Tooling: Build and maintain tools that improve developer efficiency, automate repetitive tasks, and enhance productivity.
- Security & Compliance: Ensure infrastructure and deployments align with security best practices, with attention to SOC, ISO, and GDPR standards.
- 7+ years of technical experience, with 5+ years as an SRE or similar role. Startup experience is a plus.
- Deep expertise in AWS, including Fargate and Kubernetes for container orchestration.
- Strong experience with CI/CD pipelines, particularly using AWS CDK.
- Proficiency with observability tools (Datadog, Prometheus, Grafana).
- Strong knowledge of scaling strategies and highly available architectures.
- Proficiency in scripting/automation with Python, Bash, or TypeScript.
- Familiarity with security best practices and compliance frameworks (SOC, ISO, GDPR).
- Strong collaboration skills and ability to work cross-functionally.
- Infrastructure: AWS, Fargate, Redis, PostgreSQL, SQS, CDK, GitHub, Retool
- Backend: Django REST framework, Celery
- Frontend: Next.js, Tailwind CSS
- LLM Integrations: OpenAI, Claude, AWS Bedrock
Position Requirements: 10+ years of work experience