Job Title
Senior System Debug Engineer
Role Summary
Join Intel's Data Center Group (DCG) AI team as a Senior System Debug Engineer responsible for driving end-to-end debug, root-cause analysis, and resolution of complex hardware-software platform issues for AI GPU systems.
The role coordinates across silicon, firmware, software, and customer teams to close bugs, define SLAs for bug lifecycles, improve bring-up and validation processes, and deliver scalable, high-performance AI solutions.
Experience Level
Senior. See Education Requirements for specific years-of-experience guidance related to degree level.
Responsibilities
Primary responsibilities include ownership of platform-level debug and program execution for AI/GPU systems.
- Lead root-cause analysis, issue isolation, and disposition for multidisciplinary platform issues.
- Own the end-to-end bug lifecycle: tracking, accountability, communication, and timely closure.
- Define and enforce SLAs for bug resolution with ingredient owners; monitor and address deviations.
- Coordinate across design, validation, firmware, software, and customer-facing teams to resolve system issues.
- Manage customer escalations: provide timely analysis, clear updates, and technically sound solutions under pressure.
- Lead and mentor debug taskforces; drive critical technical decisions and maintain debug quality and process discipline.
- Identify and implement automation opportunities to improve program execution and validation efficiency.
Requirements
Must-have:
- Extensive experience diagnosing and resolving Linux kernel and system-level issues; strong familiarity with Linux internals and user-/kernel-space debugging.
- Proven ability to isolate and resolve complex hardware-software interaction issues across multiple components.
- Deep knowledge of RAS, power management, PCIe, performance, security, Ethernet, HBM, and GPU subsystems; proficient with logs, traces, instrumentation, and debug tools used in system bring-up and validation.
- Strong programming skills in Python, C, and C++.
- Solid understanding of GPU architecture, memory hierarchy, performance bottlenecks, and GPU debug methodologies.
- Excellent written and verbal communication skills and ability to collaborate cross-functionally.
Nice-to-have:
- Familiarity with machine learning frameworks such as PyTorch or TensorFlow.
- Experience deploying, debugging, and troubleshooting AI/ML models.
- Knowledge of Intel and ARM platform architectures and associated debug tools.
Education Requirements
A Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field is required. Minimum industry experience by degree level: 8+ years with a Bachelor's degree, 7+ years with a Master's degree, or 6+ years with a PhD.
About the Company
Company: Intel Corporation
Headquarters: Santa Clara, California, USA
Intel Corporation is a multinational technology company known for its semiconductor products, including microprocessors, artificial intelligence accelerators, and memory products. The company focuses on advancing semiconductor manufacturing to meet global demand, and it emphasizes professional development and a collaborative working environment.

Date Posted: 2026-04-30