Senior Principal Machine Learning Engineer, Foundational Models
Listed on 2026-01-14
Software Development
AI Engineer, Machine Learning/ML Engineer, Data Engineer
Job Requisition # 26WD94805
Position Overview
Autodesk is transforming the Architecture, Engineering, and Construction (AEC) industry by embedding advanced AI and foundation models into cloud-native platforms such as AutoCAD, Revit, Construction Cloud, and Forma.
As a Senior Principal Machine Learning Engineer, you will act as a technical leader and delivery owner for complex, high-impact ML initiatives spanning foundation models, reinforcement learning, data systems, and large-scale ML platforms. You will operate at the intersection of applied research, engineering, and product—setting technical direction while remaining hands‑on in the areas of highest complexity and risk.
This role is designed for a senior ML tech lead with a proven track record of owning and delivering ML systems at scale, including training and operating models in large, distributed environments.
Reporting: ML Development Manager, AEC Solutions
Location:
US or Canada (Remote or Hybrid)
Technical Strategy & Leadership:
Define the long‑term technical vision for Generative AI and Foundation Model infrastructure within the AEC Solutions team. Influence architectural decisions across the broader organization.
End‑to‑End Delivery:
Lead the design, development, and delivery of complex ML systems. Own the full lifecycle from model architecture selection and data strategy to distributed training and production deployment.
Foundation Model Engineering:
Drive the development of large‑scale training pipelines. Collaborate with Research Scientists to translate experimental ideas (custom architectures, novel loss functions) into scalable, performant code.
Scalability & Infrastructure:
Architect solutions for distributed training (e.g., FSDP, Megatron‑LM, DeepSpeed) on massive compute clusters. Identify and resolve bottlenecks in data processing and model parallelism to maximize training throughput.
Mentorship & Influence:
Mentor Principal and Senior engineers, fostering a culture of technical ownership, rigorous experimentation, and best practices. Act as a technical partner to Product Management and Engineering leadership.
Cross‑Functional Collaboration:
Partner effectively with Data Engineering, Platform, and Research teams to integrate large‑scale multimodal AEC data (3D geometry, images, text) into model development workflows.
Operational Excellence:
Establish standards for model evaluation, versioning, monitoring, and MLOps best practices to ensure reproducibility and reliability in a high‑stakes production environment.
Master’s or PhD in a field related to AI/ML such as Computer Science, Mathematics, Statistics, Physics, Computational Linguistics, or related disciplines.
10+ years of experience in machine learning, AI, or related fields, with a proven track record of technical leadership and hands‑on implementation.
Demonstrated experience mentoring engineers and leading technical projects in cross‑functional environments.
Proven history of leading the delivery of large‑scale ML systems from conception to production.
Expert‑level understanding of deep learning architectures (Transformers, Diffusion models) and modern frameworks (PyTorch is required).
Hands‑on experience with distributed training frameworks and techniques (e.g., PyTorch Distributed, Ray, DeepSpeed, Megatron, CUDA optimization) in HPC or cloud environments (AWS/Azure).
Strong proficiency in Python, with an emphasis on performance profiling, debugging, and writing robust, maintainable production code.
Excellent ability to translate complex technical concepts into clear insights for executive leadership and cross‑functional partners.
Experience with large foundation model training in distributed compute environments.
Experience designing data pipelines for multimodal datasets at the terabyte/petabyte scale (using Spark, Iceberg, etc.).
Experience constructing internal developer platforms for ML, utilizing tools like Kubernetes, Slurm, or Metaflow.
A portfolio demonstrating the successful translation of academic research papers into tangible product features.
Bac…