Infrastructure Systems Engineer

NVIDIA
June 28, 2026
Full-time
Remote friendly (Santa Clara, California, United States)
Worldwide
$124,000 - $195,500 USD yearly
Test Engineering Jobs, Level - Mid-Career

Role Summary

Join NVIDIA's Kernel Infrastructure team to manage the readiness, configuration, and long-term health of next-generation GPU platforms. You will own the lifecycle phase where early production hardware meets software, and you will ensure systems are stable, tuned, and maintained for engineering teams.

The position is based in Santa Clara, CA and works closely with firmware, hardware design, and platform engineering groups.

Experience Level

Mid-level: typically requires 3+ years of relevant experience in systems engineering, infrastructure operations, or hardware validation environments.

Responsibilities

Key duties focus on bringing up early-stage platforms, diagnosing environment-level issues, and maintaining fleet health.

  • Drive early production bring-up and tuning: firmware/VBIOS flashing, core clock and power-state configuration, and system performance tuning.
  • Triage complex system and environment issues; coordinate with firmware, hardware, and platform teams to resolve blockers.
  • Monitor and maintain fleet health: implement health checks, diagnose degradation, and perform manual recoveries when needed.
  • Define and maintain golden system baselines (drivers, firmware, configurations) and manage hardware inventory and allocations to engineering teams.

Requirements

Must-have technical skills and experience required for day-to-day success.

  • Must-have: 3+ years in systems engineering, infrastructure operations, or hardware validation, working with early-stage platforms.
  • Must-have: Deep Linux and Windows system administration and strong hardware-to-software debugging skills.
  • Must-have: Proficiency in scripting and automation (Shell, Python, Ansible, or similar).
  • Must-have: Hands-on experience with cluster/queue managers such as Slurm, Kubernetes, or equivalent.
  • Strong written and verbal communication skills and the ability to explain technical issues to non-technical audiences.
  • Strong problem-solving skills; self-motivated and collaborative team player.

Nice-to-have qualifications that differentiate candidates:

  • Experience managing HPC clusters at scale.
  • Track record configuring and maintaining bring-up systems and early hardware prototypes.
  • Mechanical aptitude and comfort with hands-on physical work and tools.
  • Demonstrated technical curiosity and drive to innovate.

Education Requirements

Degree in Computer Engineering, Electrical Engineering, Computer Science, or equivalent practical experience.


About the Company

Company: NVIDIA

Headquarters: Santa Clara, California, USA

NVIDIA is a global leader in accelerated computing, renowned for its innovative solutions in AI and digital twins that transform diverse industries. The company specializes in networking technologies, providing end-to-end InfiniBand and Ethernet solutions for servers and storage that optimize performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, constantly reinventing its products and services to stay ahead in the market.

Date Posted: 2026-06-26