Cloud Infrastructure Manager
Listed on 2026-01-13
IT/Tech
Cloud Computing, Systems Engineer, IT Project Manager, SRE/Site Reliability
Lead and scale our cloud platforms, boost reliability, drive DevOps excellence, and guide a talented team powering engineering at
Changing the world of pet food for good. We’re a dog food subscription company creating truly tailored food for every dog we serve. We ask a few simple questions about each dog and use that data to craft a unique recipe that delivers optimal nutrition and taste to the door every month.
Team
The Cloud Infrastructure Team is made up of DevOps engineers who keep our AWS environment in tip‑top shape, support the core website, and maintain the factory’s cloud infrastructure.
Role
The Cloud Infrastructure Manager is responsible for the architecture, reliability, security, and continuous improvement of our cloud platforms. You will build and lead a high‑performing team focused on cloud engineering, DevOps enablement, automation, and cyber‑resilient infrastructure.
Key Responsibilities
- Lead the strategic design, delivery, and optimisation of cloud infrastructure across IaaS, PaaS, and SaaS environments.
- Define and evolve scalable, secure, cloud‑native architectures that meet business, security, and engineering requirements.
- Oversee core platform domains including compute, networking, storage, identity, container platforms, serverless services, and observability tooling.
- Champion infrastructure‑as‑code, automation, and GitOps practices to improve reliability, consistency, and deployment velocity.
- Plan and lead architectural evolution, upgrades, and platform improvements to ensure cloud services remain secure, scalable, and aligned with future business needs.
- Develop clear, forward‑looking platform roadmaps that enhance availability, performance, scalability, and long‑term architectural maturity.
- Drive modern DevOps practices through robust CI/CD pipelines, automation‑first approaches, and efficient deployment workflows.
- Embed reliability engineering standards through effective monitoring, logging, alerting, and well‑defined SLOs/SLIs.
- Lead structured post‑incident reviews and drive improvements that strengthen system resilience and reduce repeat issues.
- Enhance cloud security posture through security‑by‑design principles, automated guardrails, and close collaboration with Cyber Security.
- Ensure compliance with internal policies, regulatory obligations, and industry security frameworks.
- Own and continually improve backup, disaster recovery, and wider business resilience capabilities.
- Plan, forecast, and optimise cloud spend using FinOps‑aligned practices to ensure efficient and transparent use of cloud resources.
- Provide clear insights into cloud consumption, cost trends, and optimisation initiatives to support strategic decision‑making.
- Lead, mentor, and develop a high‑performing team across Cloud Infrastructure, DevOps, and Platform Security.
- Foster a culture grounded in engineering quality, innovation, continuous learning, and shared ownership.
- Ensure the team remains current with emerging technologies, cloud trends, and modern engineering practices.
- Bachelor’s degree in Computer Science, Information Technology, or equivalent professional experience.
- 2+ years’ experience in a DevOps leadership role.
- Strong hands‑on AWS experience with multi‑cloud exposure desirable.
- Expertise with infrastructure‑as‑code tooling such as Terraform, Pulumi, or similar.
- Deep understanding of cloud networking, container platforms, and modern platform engineering tooling.
- Strong knowledge of cyber security principles, cloud security controls, and compliance frameworks.
- Proven experience building and managing CI/CD pipelines and automation workflows.
- Solid understanding of SRE and reliability engineering practices.
- Cloud certifications such as AWS Solutions Architect, Azure Architect, or GCP Professional Cloud Engineer.
- DevOps or SRE‑related certifications.
- Experience with service mesh technologies, distributed systems, or advanced observability tooling.
- Familiarity with FinOps practices and related tooling.
- Strong leadership skills with the ability to inspire, support, and develop technical teams.
- Analytical and pragmatic problem‑solver with a strong…