NPU Research Engineer
Posted on 2026-01-22
IT/Information Technology
Artificial Intelligence Engineer, Machine Learning, Data Scientist, Systems Engineer
A leading global technology company driving innovation in AI acceleration and next-generation computing architectures. The organization develops high-performance hardware-software solutions focused on neural processing units (NPUs) and advanced compiler systems. Their products power large-scale AI systems across industries such as cloud computing, edge AI, and high-performance research infrastructure.
Mission
As an NPU Compiler & Framework Engineer, you will lead the co-design of AI frameworks and compiler tool chains tailored for NPU acceleration. Your work will directly impact how large AI models are executed with maximum efficiency, leveraging low-level optimizations, intelligent scheduling, and hardware-software synergy. This role is at the intersection of systems design, compiler architecture, and AI infrastructure, with strong visibility in both academic and open-source communities.
Responsibilities
NPU-Centric Framework and Runtime Design
- Design and implement cross-layer optimizations for AI frameworks and runtimes (e.g., PyTorch, vLLM) targeting NPU workloads.
- Automate model transformation, quantization, and adaptive deployment for optimized execution on custom NPU hardware.
- Extend and optimize compiler stacks (LLVM, TVM, GCC) to translate high-level AI models into high-performance NPU code.
- Focus on scheduling strategies, memory management, and parallel execution tailored to NPU microarchitectures.
- Collaborate with hardware teams to define ISA extensions, performance counters, and architectural features that enable better software-level optimization.
- Participate in shaping the future of AI accelerators through feedback-driven development.
- Publish results in top-tier systems and machine learning conferences (e.g., ISCA, ASPLOS, MLSys).
- Support the developer ecosystem with documentation, tooling, and contributions to open-source AI compiler projects.
Requirements
- Master’s degree in Computer Science, Computer Engineering, or related field, plus 3 years of experience in systems software; or a recent PhD in a relevant domain.
- Solid knowledge of compiler internals (LLVM, GCC, TVM, XLA) and modern NPU/GPU architecture.
- Strong programming skills in Python and C/C++ for system-level development.
- Excellent communication skills in English, both written and spoken.
Preferred Qualifications
- Hands‑on experience with AI frameworks such as PyTorch or TensorFlow.
- Familiarity with model optimization techniques, quantization, and graph transformation.
- Experience working with DSP/xPU tool chains or specialized accelerators.
- Open‑source contributions or publications in relevant conferences.