Model-as-a-Service Tech Lead Job, Santa Clara area, California, USA, IT/Tech

Position: Model-as-a-Service Tech Lead

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world.

Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking a highly qualified and hands-on Tech Lead to drive the technical vision, architecture, and implementation for our scalable, web-based platform. This critical platform enables users to configure autonomous driving scenarios and generate synthetic data at scale for model training. This is not a management role; the successful candidate will spend a large part of their time actively writing code and leading code quality initiatives.

A key responsibility is to design the entire platform for maximum portability and broad adoption across the NVIDIA ecosystem. This requires ensuring all services and deployment pipelines are not tied to proprietary internal infrastructure. The Tech Lead will champion the use of web and batch processing patterns to efficiently expose our premier foundation models, driving their real-world deployment and utility.

The ideal candidate will be a full-stack expert who leads by example, actively coding and mentoring engineers. They take full responsibility for building the system, for its performance, and for its deployment in a cloud-centric environment.

What you'll be doing:

Serve as the primary, high-impact contributor on complex features. Dedicate significant time to producing production code across the full stack, including UI, APIs, services, and infrastructure.
Code Review Leadership & Quality Assurance:
Lead the code review process, setting and enforcing thorough coding standards, performance benchmarks, and architectural integrity checks to ensure all merged code is high-quality, maintainable, and robust.
Architectural Ownership & Portability:
Define and own the long-term technical roadmap, architecture, and design. This includes ensuring that the deployment pipelines and services are platform-agnostic and easily deployable across the broader NVIDIA ecosystem, deliberately avoiding internal infrastructure dependencies.
Foundation Model Deployment Strategy:
Lead the strategic implementation of web services and efficient batch processing queues to seamlessly integrate and operationalize our world foundation models into the customer-facing platform.
System Performance & Reliability:
Implement and enforce standards for production-grade performance, monitoring, and fault tolerance across all services. Proactively identify and resolve systemic technical debt and scalability bottlenecks.
Deployment & Operational Excellence:
Take ultimate ownership of the CI/CD pipelines, container orchestration strategy (Kubernetes/Helm), and operational readiness, ensuring seamless scalability and reliability in production.
Team Mentorship & Guidance:
Mentor and guide the engineering team on advanced practices in full-stack development, distributed systems design, performance optimization, and clean, portable code architecture.
Multi-functional Partnership:
Act as the key technical liaison, translating complex requirements from Product Managers, ML Engineers, and Data Scientists into robust, portable, and implementable designs.

What we need to see:

This role requires a proven track record of significant experience and technical mastery:

12+ years of hands-on experience developing and deploying scalable full-stack web services in a cloud environment.
Proven Tech Lead or equivalent Senior/Staff level experience with demonstrated ability to define system architecture, mentor engineers, and take end-to-end technical ownership of a major platform while remaining deeply active in coding and code reviews.
Expert-level proficiency in designing and scaling distributed microservices architectures using gRPC and REST APIs.
Deep expertise in modern frontend frameworks and building highly responsive, data-intensive UIs capable of managing high-frequency data flows.
Direct experience designing and deploying containerized applications that use a GPU (e.g., NVIDIA Container Toolkit).
Experience with MaaS (Model-as-a-Service) patterns and serving large machine learning models as high-throughput endpoints.
Mastery of container orchestration, including Kubernetes and Helm for sophisticated, portable, multi-service production deployments.
Proficiency in backend languages such as Python and/or Go, and TypeScript for the frontend.
Strong practical experience with Cloud Infrastructure (AWS S3) and running complex data storage/access…

