Senior Software SDET Test Development Engineer
Listed on 2026-01-13
Software Development
Software Engineer, DevOps, Senior Developer, AI Engineer
NVIDIA is the world leader in GPU Computing. We are passionate about markets including gaming, automotive, vision, HPC, datacenters and networking in addition to our traditional OEM business. NVIDIA is also well positioned as the ‘AI Computing Company’, and NVIDIA GPUs are the brains powering deep‑learning software frameworks, analytics, data centers, and autonomous vehicles. We have some of the most experienced and dedicated people in the world working for us.
If you are a dedicated, forward‑thinking, and hard‑working technical professional comfortable working across countries, this job is for you. NVIDIA is looking for an outstanding individual who thrives in a diverse work environment, has outstanding interpersonal skills, and possesses a strong sense of engagement and continuous process improvement. To join our platform SWQA team, the candidate must have experience in enterprise server integration, Linux, reliability testing with various telemetries, scale‑out clusters, test plan development, DevOps, and CI/CD, along with a track record of developing AI tools and NLP.
You’ll Be Doing
- Develop and execute NVIDIA HGX/DGX/MGX platform test plans covering servers, OS, firmware, and the CUDA software stack, based on design documents.
- Install and test various operating systems, server firmware, and software stacks.
- Drive root‑cause analysis of reliability and validation test failures to identify root causes and achieve mitigation.
- Build, develop, and debug server‑ and OS‑level automation frameworks (front end and back end) and tests.
- Review partner and supplier test results and prescribe additional reliability testing on components, servers, and packaging as needed.
- Work in an agile software development team with very high production quality standards.
- Manage the bug lifecycle and collaborate across groups to drive solutions.
What We Need To See
- Bachelor’s Degree (or equivalent experience) in a STEM (Science, Technology, Engineering, Math or Physics) field.
- 5+ years of proven experience, or a master’s degree.
- Proven experience in OS‑ and server‑level automation, CI/CD processes, and DevOps using Python, shell, Ansible, Jenkins, C/C++, Java, and JavaScript.
- Strong server and Linux troubleshooting and debugging experience in bare‑metal and KVM/VMware/Hyper‑V environments.
- Good knowledge of and hands‑on experience in model testing, AI tools/frameworks (TensorFlow, PyTorch, etc.), NLP, and LLM benchmarking.
- Experience in using AI development tools for test‑plan creation, test‑case development and test‑case automation.
- Strong experience with firmware, BMC/OpenBMC, network protocols, internal/external enterprise storage devices, PCIe buses and devices, IO sub‑devices, CPU and memory, ACPI, the UEFI spec, and Redfish is a huge plus.
- Proven experience with GitHub/GitLab/Gerrit, PXE, SLURM, and Stack/Kubernetes/Docker is a huge plus.
- Experience with AI‑related tools, LLMs, and NLP.
- Experience working with NVIDIA GPU hardware is a strong plus.
- A solid understanding of virtualization in Linux (KVM, Docker orchestrated with Kubernetes) is good to have.
- Background in parallel programming, ideally CUDA/OpenCL, is a plus.
The base salary range is $140,000‑$224,250 for Level 3 and $168,000‑$270,250 for Level 4. You will also be eligible for equity and benefits. Applications will be accepted until January 13, 2026.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal‑opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.