Job Title
Machine Learning Applications and Compiler Engineer, LPX (New College Grad 2026)
Role Summary
Develop compiler and runtime algorithms and optimizations for NVIDIA's LPX inference and compiler stack. The role sits at the intersection of systems, compilers, and deep learning, mapping neural network workloads onto NVIDIA inference platforms.
The team focuses on end-to-end inference optimization, tooling integration, and performance benchmarking, and collaborates with hardware architects to co-design features that improve performance and efficiency.
Experience Level
Entry-level. Targeted at new college graduates (2026) and early-career engineers; suitable for candidates who have recently completed an MS or PhD, or who have equivalent practical experience.
Responsibilities
Primary responsibilities include implementing and improving compiler and runtime components for inference workloads and collaborating across software and hardware teams.
- Build, maintain, and optimize high-performance runtime and compiler components focused on inference.
- Define and implement mappings of large-scale inference workloads onto NVIDIA systems and hardware.
- Integrate with software ecosystem components (libraries, tooling, interfaces) to enable model deployment across platforms.
- Benchmark, profile, and monitor performance and efficiency metrics to validate compiler-generated mappings.
- Collaborate with hardware architects to provide software-driven feedback and co-design performance features.
- Prototype and evaluate compilation/runtime techniques: graph transformations, scheduling, memory/layout optimizations for spatial processors.
- Communicate technical results clearly; present work at internal reviews and publish or present at relevant conferences.
Requirements
Must-have technical skills and experience required for the role.
- Strong software engineering and systems-level programming experience in C/C++ and/or Rust; solid CS fundamentals (data structures, algorithms, concurrency).
- Hands-on experience with compiler or runtime development (IR design, optimization passes, code generation).
- Experience with LLVM and/or MLIR, including building custom passes, dialects, or integrations.
- Familiarity with deep learning frameworks (TensorFlow, PyTorch) and portable graph formats such as ONNX.
- Understanding of parallel and heterogeneous compute architectures (GPUs, spatial accelerators, domain-specific processors).
- Proven debugging and performance-analysis skills using profiling, tracing, and benchmarking tools.
- Strong communication and collaboration skills across hardware, systems, and software teams.
Nice-to-have:
- Experience with MLIR-based compilers or multilevel IR stacks for graph-based deep learning workloads.
- Prior work on spatial or dataflow architectures, static scheduling, pipeline or tensor parallelism at scale.
- Contributions to open-source ML frameworks, compilers, or runtime systems; research publications in relevant venues.
- Experience with large-scale distributed inference or training systems and performance modeling.
Education Requirements
Pursuing or recently completed an MS or PhD in Computer Science, Electrical/Computer Engineering, or a related technical field, or equivalent practical experience. (New college graduates and recent degree completers are explicitly targeted.)
About the Company
Company: NVIDIA
Headquarters: Santa Clara, California, USA
NVIDIA is a global leader in accelerated computing, known for AI and digital-twin solutions that transform diverse industries. The company also provides end-to-end InfiniBand and Ethernet networking for servers and storage, optimizing performance and scalability, and serves sectors such as high-performance computing, enterprise data centers, and cloud computing.

Date Posted: 2026-05-05