GPU & LLM Infrastructure Product Manager
Listed on 2026-02-28
IT/Tech
AI Engineer
Hi,
Our client is looking for a GPU & LLM Infrastructure Product Manager in Minneapolis, MN / Charlotte, NC / Irving, TX (Hybrid). The detailed requirements are below. Please share your updated resume if you are interested.
Role: GPU & LLM Infrastructure Product Manager
Location: Minneapolis, MN / Charlotte, NC / Irving, TX (Hybrid)
Type: Contract

About This Role: Enterprise AI Platform - GPU & LLM Infrastructure Product Manager
You will define and lead the product strategy for an enterprise-scale LLM/SLM inference GPU platform. In this role, you will partner closely with GPU hardware and platform engineering teams to translate customer needs and business objectives into a clear, prioritized roadmap with measurable outcomes.
You will own capabilities across high-performance model inferencing, GPU orchestration, and platform services, including vLLM, NVIDIA Run:ai, and Red Hat OpenShift AI. The role also encompasses API productization, observability and evaluation, reliability and SLOs, and compliant end-to-end lifecycle management to enable secure, scalable, and enterprise-ready AI solutions.
- Lead a team to identify, strategize, and execute highly complex Artificial Intelligence initiatives that span a line of business
- Recommend business strategy and deliver Artificial Intelligence enabling solutions to solve business challenges
- Define and prioritize use cases, obtain the required resources, and ensure the solutions deliver the intended benefits
- Leverage Artificial Intelligence expertise to evaluate technological readiness and resources required to execute the proposed solutions
- Make decisions to drive the implementation of Artificial Intelligence initiatives and programs while serving multiple stakeholders
- Resolve issues which may arise during development or implementation
- Collaborate and consult with peers, colleagues, and managers to resolve issues and achieve goals
- 5+ years of Artificial Intelligence Solutions experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 2+ years of hands-on experience with cloud platforms such as GCP or Azure, and container orchestration technologies including Docker and Kubernetes/OpenShift
- 2+ years of experience working on platform or ML/AI infrastructure products within regulated environments
- 2+ years of proven success owning an API or platform with accountability for SLAs/SLOs, including versioning and deprecation strategies, change management, and reliability outcomes
- Strong communication skills, with the ability to influence senior stakeholders and clearly explain complex technical concepts to diverse audiences
- Working knowledge of LLM/SLM inference stacks, including vLLM, Triton, and TensorRT-LLM, as well as batching strategies, KV cache management, quantization techniques (e.g., FP8, INT4), and evaluation frameworks, sufficient to make informed product trade-offs with engineering teams
- Familiarity with GPU and platform fundamentals, such as modern GPU architectures (e.g., H100/H200), MIG and NCCL, GPU orchestration tools (NVIDIA Run:ai), and Kubernetes/OpenShift AI administration and admission control patterns
- Experience building developer-centric platforms, including APIs, SDKs, and structured release and governance processes
- Hands-on experience with observability and evaluation for GenAI systems, including dashboards, tracing, alerting, and safety and quality metrics
- Demonstrated strength in stakeholder management, partnering effectively across Risk, Security, Architecture, and line-of-business application teams