Job Title
Systems Quality and Reliability Engineer - LPU
Role Summary
Join the LPU team to manage RMA and failure analysis for NVIDIA AI/ML products. Lead root-cause investigations, improve field quality metrics, and coordinate FA operations across engineering and contract-manufacturer partners.
Experience Level
Mid-level - requires 5+ years of hands-on systems test, validation, or quality engineering experience.
Responsibilities
Primary responsibilities include leading FA and RMA debug, analyzing field data, and managing FA operations at partners.
- Lead debug and root-cause analysis of field RMAs; coordinate with systems, hardware, software, and operations engineers.
- Scale and manage failure analysis (FA) capabilities and processes across the organization.
- Create FA reports aligned with 8D or similar problem-resolution frameworks.
- Analyze RMA, FA, and repair data to identify trends; raise quality alerts and drive containment and mitigation plans.
- Monitor hardware quality metrics (RMA rates, MTBF, reliability ratio) and drive improvements.
- Manage FA performance at contract manufacturers, ensuring they meet KPIs such as cycle time, fault duplication rate, and fault isolation rate.
- Oversee setup of new products into Failure Analysis operations.
Requirements
Must-have technical skills and experience; nice-to-have items listed separately.
- 5+ years of hands-on systems test, validation, or reliability engineering experience.
- Proven practical experience in systems quality and reliability engineering.
- Experience using lab equipment such as oscilloscopes, logic analyzers, and power analyzers.
- Experience with reliability tests (e.g., HTOL) and quality tests (e.g., burn-in).
- Strong fault isolation skills and techniques (OBIRCH, DLS/LADA, LVP, LVI).
- Proficiency with high-speed interfaces such as SerDes, PCIe, DDR.
- Proficiency in scripting/programming (Python, Perl, C++, or similar) on UNIX/Linux.
- Knowledge of PCB card and system-level test and debug, and ability to manage factory/CM partners for RMA/FA activities.
Nice-to-have:
- Working knowledge of FA techniques and tools (FIB, SEM, TDR, VNA, CSAM).
Education Requirements
BS or MS in Electrical Engineering, Physics, or a related technical field, or equivalent practical experience.
About the Company
Company: NVIDIA
Headquarters: Santa Clara, California, USA
NVIDIA is a global leader in accelerated computing, renowned for its innovative solutions in AI and digital twins that transform diverse industries. The company specializes in networking technologies, providing end-to-end InfiniBand and Ethernet solutions for servers and storage that optimize performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, constantly reinventing its products and services to stay ahead in the market.

Date Posted: 2026-05-27