ML Engineer Models
Listed on 2026-01-13
Software Development
AI Engineer, Machine Learning/ML Engineer, Data Engineer
Location: Greater London
Requirements
- Significant hands‑on experience optimizing deep learning models
- Proven ability to profile and debug performance bottlenecks
- Experience with distributed or large‑scale training and inference
- Familiarity with techniques such as mixed precision, quantization, distillation, pruning, caching, and batching
- Experience with large models (e.g., transformers)
- Practical CUDA development experience
- Deep understanding of at least one major deep learning framework (ideally PyTorch)
- Experience building and operating ML systems on cloud platforms (AWS, Azure, or GCP)
- Comfort working with experiment tracking, monitoring, and evaluation pipelines
Optimize and own the performance of the company's AI/ML foundation model, design GPU components, reduce latency, and work with the founders on optimization goals. Requires CUDA, Python, and deep learning expertise.
Responsibilities
- Own the performance, scalability, and reliability of the company's foundation model in both training and inference.
- Profile and optimize the end-to-end ML stack: data pipelines, training loops, inference serving, and deployment.
- Design and implement GPU‑accelerated components, including custom CUDA kernels where off‑the‑shelf libraries are insufficient.
- Reduce latency and cost per inference token while maximizing throughput and hardware utilization.
- Work closely with the founders to translate product requirements into concrete optimization goals and technical roadmaps.
- Build internal tooling, benchmarks, and evaluation harnesses to help the team experiment, debug, and ship safely.
- Contribute to model architecture and system design where it impacts performance and robustness.