Deep Learning Performance Architect
Join NVIDIA's Deep Learning Architecture team to design and evaluate high-performance hardware and software for AI workloads. The role focuses on architecture, benchmarking, and tooling to improve parallel compute performance and energy efficiency.
Senior. The posting specifies 1+ years of experience with C, C++, and Python; the role is an architect-level position.
Primary responsibilities include architecture development, workload analysis, and tooling for performance engineering.
Key qualifications and preferred skills.
B.Tech. or M.Tech. in Computer Science, Electrical Engineering, Mathematics, or a related discipline (explicitly listed in the posting).
Company: NVIDIA
Headquarters: Santa Clara, California, USA
NVIDIA is a global leader in accelerated computing, renowned for its innovative solutions in AI and digital twins that transform diverse industries. The company specializes in networking technologies, providing end-to-end InfiniBand and Ethernet solutions for servers and storage that optimize performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, constantly reinventing its products and services to stay ahead in the market.
