Senior LLM Agents Architect

NVIDIA
May 21, 2026
Full-time
On-site
Yokne'am Illit, Israel
EDA Jobs, Level - Senior

Job Title

Senior LLM Agents Architect

Role Summary

Hands-on architect and builder of agentic LLM systems that generate, analyze, and optimize GPU compute kernels and support hardware/software co-design. Work closely with GPU architects, verification and performance engineers, and software teams to create end-to-end agent flows for kernel optimization, architectural exploration, and automated performance forensics.

Deliver production-grade agentic workflows integrated with internal services, evaluation backbones, and observability to enable rapid iteration and safe deployment.

Experience Level

Senior: requires 8+ years in applied ML/AI or large-scale systems, with 2+ years building agentic or LLM-powered applications in production environments.

Responsibilities

Design, implement, and productize agentic systems that improve GPU kernel performance and support architectural studies.

  • Design and build agent workflows that generate, analyze, and optimize GPU kernels for peak performance on NVIDIA hardware.
  • Encode domain expertise (memory hierarchy trade-offs, occupancy tuning, instruction-level reasoning) into agent orchestration and decision logic.
  • Develop automated performance forensics agents to ingest simulation traces and profiler data (e.g., Nsight) to find bottlenecks and recommend mitigations.
  • Partner with hardware architects to enable rapid what-if analyses across micro-architecture configurations (cache sizing, memory controller, compute unit scaling).
  • Prototype and productize solutions; integrate with internal services, optimize pipelines, and remove system bottlenecks.
  • Establish evaluation infrastructure using offline golden sets and online telemetry; implement guardrails, cost control, and rollback plans.
  • Mentor teams on agent orchestration, prompting, retrieval-augmented generation (RAG), observability, and operational playbooks.

Requirements

Must-have technical skills, production experience, and collaboration abilities.

  • Solid grounding in computer architecture: memory hierarchies, parallelism, pipelining, cache behavior; familiarity with NVIDIA GPU concepts (streaming multiprocessors, warp scheduling, shared/global memory model, occupancy reasoning).
  • Hands-on CUDA programming: writing, profiling, and optimizing GPU kernels; experience with profiler workflows such as Nsight Compute or Nsight Systems.
  • Proven ownership of at least one end-to-end agentic system or LLM application in production (requirements, architecture, implementation, evaluation, hardening).
  • Strong software engineering skills in Python and one systems language (C++ preferred).
  • Proficiency with tool orchestration, RAG pipelines, model adaptation techniques, and building agentic systems.
  • Experience building observability for AI systems: dataset/version management, offline test suites, online telemetry, safety checks, and rollback plans.
  • Excellent communication and facilitation skills; able to align diverse collaborators and document decisions and assumptions.

Nice-to-have:

  • Experience with PyTorch compilation/lowering (torch.compile, TorchDynamo, TorchInductor), Triton, PTX, kernel fusion, or auto-tuning frameworks.
  • Background in performance engineering for HPC or GPU workloads, performance modeling, or hardware simulators.
  • Familiarity with distributed multi-GPU workloads and networking (NVLink, InfiniBand).
  • Experience building domain-specific coding agents or using frontier agentic tools and lower-level agent frameworks (e.g., LangChain).

Education Requirements

A B.Sc. in Computer Science or Electrical Engineering is required.


About the Company

Company: NVIDIA

Headquarters: Santa Clara, California, USA

NVIDIA is a global leader in accelerated computing, renowned for its innovative solutions in AI and digital twins that transform diverse industries. The company specializes in networking technologies, providing end-to-end InfiniBand and Ethernet solutions for servers and storage that optimize performance and scalability. NVIDIA serves sectors such as high-performance computing, enterprise data centers, and cloud computing, constantly reinventing its products and services to stay ahead in the market.

Date Posted: 2026-05-21