AI Researcher
NIO
San Jose, Santa Clara County, California, 95111, USA
Full-time position, listed on 2026-02-17
Job specializations:
- IT/Tech: AI Engineer, Data Scientist
JOB DESCRIPTION
About NIO
NIO is a pioneer and a leading company in the premium smart electric vehicle market. Founded in November 2014, NIO's mission is to shape a joyful lifestyle. NIO aims to build a community starting with smart electric vehicles to share joy and grow together with users.
NIO designs, develops, jointly manufactures and sells premium smart electric vehicles, driving innovations in next-generation technologies in autonomous driving, digital technologies, electric powertrains and batteries. NIO differentiates itself through its continuous technological breakthroughs and innovations, such as its industry-leading battery swapping technologies, Battery as a Service, or BaaS, as well as its proprietary autonomous driving technologies and Autonomous Driving as a Service, or ADaaS.
NIO's product portfolio consists of the ES8, a six-seater smart electric flagship SUV, the ES7 (or the EL7), a mid-large five-seater smart electric SUV, the ES6, a five-seater all-round smart electric SUV, the EC7, a five-seater smart electric flagship coupe SUV, the EC6, a five-seater smart electric coupe SUV, the ET7, a smart electric flagship sedan, and the ET5, a mid-size smart electric sedan.
About The Position
We are seeking exceptional AI researchers to join our team at the forefront of Large Language Model (LLM) and Vision-Language Model (VLM) research, and to turn that research into production-grade, deployable systems. This role is ideal for graduating Ph.D. students or recent Ph.D. graduates with a strong research background and hands-on experience in LLM/VLM design, training, and inference optimization. You will work on cutting-edge technologies that accelerate the performance, efficiency, and scalability of next-generation foundation models, especially in the context of real-world deployment on advanced computing platforms in electric vehicles (EVs).
As part of our AIOS team, you will explore and invent new methods to improve LLM inference efficiency, ranging from architectural and system-level innovations, LLM compute optimization, and distributed/parallelized execution strategies to low-level system and kernel optimizations. You will have the unique opportunity to conduct high-impact research while collaborating closely with engineering teams to bring your innovations into production systems powering SkyOS across multiple ECU domains.
This role is highly interdisciplinary, bridging AI/ML research, systems design, and hardware-aware optimization, and offers the chance to shape the future of intelligent automotive systems.
Roles and Responsibilities:
* Conduct original applied research on LLM/VLM model architectures, inference acceleration methods, and system-level optimizations.
* Architect and prototype cutting-edge techniques for model architecture and inference acceleration, parallelization, custom kernels, and hardware-software co-design.
* Lead proof-of-concept implementations to evaluate tradeoffs in functionality, latency, throughput, and reliability.
* Collaborate with system and hardware engineers to translate research insights into scalable, high-performance production systems.
* Define the research and technology roadmap for AIOS solutions in the context of model deployment across EV domains.
* Track industry and academic advances in LLM/VLM, and contribute to the organization's research leadership through publications, patents, and participation in the research community.
Required Qualifications:
* Master's or Ph.D. in Computer Science, Electrical/Computer Engineering, Artificial Intelligence, or a related field.
* Strong research and/or hands-on experience in LLMs/VLMs, including Transformer-based architectures and their optimizations.
* Expertise in LLM inference acceleration techniques, such as kernel optimization, quantization, parallelism, and system-level optimization.
* Solid understanding of GPU/NPU architectures, compiler stacks, and AI training/inference frameworks (e.g., PyTorch, TensorFlow, JAX).
* Proficiency in programming with Python and/or C/C++, with experience in developing high-performance and scalable software systems.
* Demonstrated ability to conduct independent research, publish in top venues, and work across interdisciplinary teams.
* Excellent communication skills, with the ability to convey complex technical ideas clearly.
Preferred Qualifications:
* Ph.D. in Computer Science, Electrical/Computer Engineering, Artificial Intelligence, or a related field (or Ph.D. candidate expecting to graduate within 6-12 months).
* Strong track record of publications in AI/ML conferences or journals (e.g., NeurIPS, ICML, ICLR, CVPR, MLSys, ASPLOS).
* Experience with hardware-aware AI model optimization, such as custom CUDA kernels, NPU programming, or compiler/toolchain development.
* Background in distributed training/inference, large-scale system optimization, or embedded/edge AI.
* Familiarity with operating system concepts, computer architecture concepts, and software-hardware co-design.
* Passion for applying AI…