Deep Learning Performance Architect
Join the inference architecture team to model, analyze, and optimize deep learning inference performance on current and next-generation NVIDIA GPUs. The role emphasizes performance prototyping, kernel development, and providing data-driven guidance to hardware and software teams.
Work with architecture, software, and product teams to influence design and implementation for inference products.
Senior-level role: 5+ years of relevant industry experience preferred.
Principal responsibilities include performance modeling, analysis, and optimization of deep learning inference on current and next-generation NVIDIA GPUs.
Required technical skills and experience are distinguished from nice-to-have qualifications, which are listed separately.
BS, MS, or PhD in Computer Science, Electrical Engineering, Mathematics, or a related technical field, or equivalent practical experience.
Company: NVIDIA
Headquarters: Santa Clara, California, USA
NVIDIA is a global leader in accelerated computing, known for innovative solutions in AI and digital twins that transform diverse industries. The company also provides end-to-end InfiniBand and Ethernet networking solutions for servers and storage, optimizing performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, continually reinventing its products and services to stay ahead in the market.
