Deep Learning Performance Architect

NVIDIA
May 14, 2026
Full-time
On-site
Shanghai, China
SoC Architecture Jobs, Level - Senior

Job Title

Deep Learning Performance Architect

Role Summary

Join the inference architecture team to model, analyze, and optimize deep learning inference performance on current and next-generation NVIDIA GPUs. The role emphasizes performance prototyping, kernel development, and providing data-driven guidance to hardware and software teams.

Work with architecture, software, and product teams to influence design and implementation for inference products.

Experience Level

Senior: 5+ years of relevant industry experience preferred.

Responsibilities

Principal responsibilities include:

  • Analyze new deep learning networks (including LLMs) to identify performance opportunities and prototype optimizations.
  • Develop high-performance kernel prototypes for current and future GPU architectures.
  • Define and execute measurement setups to evaluate performance, power consumption, and accuracy on chips under test.
  • Collaborate across architecture, software, and product teams to influence next-generation deep learning hardware and software direction.

Requirements

The items below are must-have technical skills and experience unless marked as nice-to-have.

  • Must-have: 5+ years of professional experience in relevant roles.
  • Excellent C/C++ programming and software build skills.
  • Experience with kernel development and performance tuning on GPUs or other accelerators.
  • Familiarity with deep learning frameworks such as PyTorch, JAX, TensorFlow, or TensorRT and with common AI models (e.g., LLMs, AIGC).
  • Experience with hardware frameworks for deep learning applications.
  • Nice-to-have: Experience optimizing DL workloads; experience with MLIR or AI compiler development.

Education Requirements

BS, MS, or PhD in Computer Science, Electrical Engineering, Mathematics, or a related technical field, or equivalent practical experience.


About the Company

Company: NVIDIA

Headquarters: Santa Clara, California, USA

NVIDIA is a global leader in accelerated computing, renowned for its innovative solutions in AI and digital twins that transform diverse industries. The company specializes in networking technologies, providing end-to-end InfiniBand and Ethernet solutions for servers and storage that optimize performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, constantly reinventing its products and services to stay ahead in the market.

Date Posted: 2026-05-14