The Role
Senior ML Platform Engineer (MLOps)
Rakuten Kobo Inc. is seeking a visionary and highly skilled Senior ML Platform Engineer to architect, build, and lead the evolution of our internal Machine Learning Platform and MLOps capabilities. In this pivotal role, you will define the strategic roadmap and hands-on implementation for a state-of-the-art, fully automated ML framework on the Google Cloud Platform (GCP).
You will be instrumental in designing and developing the core infrastructure, tools, and services that empower our Data Scientists and ML Engineers to efficiently develop, deploy, monitor, and manage their Machine Learning models throughout their lifecycle. Collaborating closely with Data Scientists, Data Engineers, Platform Engineers, and business stakeholders, you will transform manual ML production processes into a seamless, scalable, and reproducible ML Platform.
This groundbreaking position is dedicated to streamlining the entire ML project lifecycle by providing a robust, self-service platform, ensuring the continuous delivery of significant business value through innovative Machine Learning solutions. Success in this role demands not only profound ML engineering and platform-building expertise but also a strategic, forward-thinking mindset for seamlessly integrating ML/AI into the core of our engineering practices at scale.
Experience and Background:
in ML Engineering or related fields, with a significant portion dedicated to ML Platform development.
or significant MLOps infrastructure for an organization. This is the most crucial must-have.
, including:
Orchestration: Kubeflow, Airflow, Argo Workflows, Step Functions, Vertex AI Pipelines.
Experiment Tracking & Model Registry: MLflow, DVC, Vertex AI ML Metadata, SageMaker Experiments/Model Registry.
Model Monitoring & Observability: Prometheus, Grafana, Arize, SageMaker Model Monitor, Vertex AI Model Monitoring.
Data/Model Versioning: DVC, Git LFS, internal systems.
Feature Stores: Feast, Hopsworks, or custom-built.
CI/CD for ML: Jenkins, GitHub Actions, GitLab CI, Buildkite, Argo CD (GitOps).
Containerization & Orchestration: Docker, Kubernetes, Helm.
at scale, particularly in the context of ML development and deployment.
(predictive modeling, deep learning, GenAI/LLMs are a plus but secondary to platform expertise).
The Skillset:
Strong hands-on experience with GCP tools such as:
MLOps framework and Automation:
Software Engineering and DevOps:
Strong background in infrastructure as code (Terraform, Deployment Manager)