Job: NPU Research Engineer, Berlin, Germany (IT/Information Technology)

The hiring organization is a leading global technology company driving innovation in AI acceleration and next-generation computing architectures. It develops high-performance hardware-software solutions focused on neural processing units (NPUs) and advanced compiler systems. Its products power large-scale AI systems across industries such as cloud computing, edge AI, and high-performance research infrastructure.

Mission

As an NPU Compiler & Framework Engineer, you will lead the co-design of AI frameworks and compiler toolchains tailored for NPU acceleration. Your work will directly affect how efficiently large AI models execute, through low-level optimizations, intelligent scheduling, and close hardware-software integration. This role sits at the intersection of systems design, compiler architecture, and AI infrastructure, with strong visibility in both academic and open-source communities.

Responsibilities

NPU-Centric Framework and Runtime Design

Design and implement cross-layer optimizations for AI frameworks and runtimes (e.g., PyTorch, vLLM) targeting NPU workloads.
Automate model transformation, quantization, and adaptive deployment for optimized execution on custom NPU hardware.

Compiler and Toolchain Development

Extend and optimize compiler stacks (LLVM, TVM, GCC) to translate high-level AI models into high-performance NPU code.
Focus on scheduling strategies, memory management, and parallel execution tailored to NPU microarchitectures.

Hardware-Software Co-Design

Collaborate with hardware teams to define ISA extensions, performance counters, and architectural features that enable better software-level optimization.
Participate in shaping the future of AI accelerators through feedback-driven development.

Research & Ecosystem Contribution

Publish results in top-tier systems and machine learning conferences (e.g., ISCA, ASPLOS, MLSys).
Support the developer ecosystem with documentation, tooling, and contributions to open-source AI compiler projects.

Required Qualifications

Master’s degree in Computer Science, Computer Engineering, or related field, plus 3 years of experience in systems software; or a recent PhD in a relevant domain.
Solid knowledge of compiler internals (LLVM, GCC, TVM, XLA) and modern NPU/GPU architecture.
Strong programming skills in Python and C/C++ for system-level development.
Excellent communication skills in English, both written and spoken.

Preferred Experience

Hands-on experience with AI frameworks such as PyTorch or TensorFlow.
Familiarity with model optimization techniques, quantization, and graph transformation.
Experience working with DSP/xPU toolchains or specialized accelerators.
Open-source contributions or publications in relevant conferences.
