Job Title
High Performance Compute (HPC) Software Engineer
Role Summary
Develop and optimize HPC software and system-level tooling for large-scale Linux clusters used in semiconductor manufacturing. The role collaborates with algorithms, hardware, systems, and operations teams to deliver performance- and power-efficient compute solutions deployed at rack and datacenter scale.
Experience Level
Mid-level (typical experience range: 3–5+ years in systems or HPC software; senior candidates also considered depending on degree and experience).
Responsibilities
Primary responsibilities focus on software performance, cluster tooling, and cross-disciplinary integration between software and hardware.
- Design, develop, and optimize distributed and parallel HPC applications on large Linux clusters (MPI, multithreading, GPU-accelerated pipelines, containerized workloads).
- Profile and optimize application performance and power utilization across CPU, memory, storage, and networking layers; address throughput, latency, and scaling behavior.
- Develop and maintain system-level tooling for cluster bring-up, diagnostics, monitoring, power measurement, and health checks.
- Collaborate with hardware and systems teams to define node, storage, and interconnect requirements and influence CPU/GPU selection, memory sizing, PCIe/NUMA layout, and network topology.
- Participate in HW/SW co-debugging, performance bottleneck analysis, stability investigations, and failure analysis at software/OS/hardware boundaries.
- Contribute to rack- and datacenter-level engineering considerations including power, cooling, cabling, and serviceability; participate in platform design reviews and refresh cycles.
- Produce clear technical documentation describing architecture, deployment flows, and performance assumptions.
Requirements
Must-have technical skills and experience required for day-one effectiveness; preferred items are listed separately under Nice-to-have.
- Proven experience developing HPC or systems software on Linux.
- Proficiency in C++, Java, or another system-level/performance-oriented language.
- Hands-on experience with parallel computing models and tooling (MPI, OpenMP, multithreading).
- Practical experience operating or developing for clusters, servers, or rack-scale systems in lab or production environments.
- Solid understanding of HPC hardware fundamentals: CPUs, memory hierarchies, storage, and networking (Ethernet/InfiniBand).
- Strong cross-domain debugging skills spanning application, OS, and hardware layers.
Nice-to-have:
- GPU computing experience (CUDA, ROCm) and performance tuning for accelerators.
- Experience with containerized HPC environments (Docker, Singularity/Apptainer, Kubernetes in HPC contexts) and performance benchmarking.
- Familiarity with high-speed interconnects, storage architectures, and rack integration (cabling, power distribution, cooling).
- Experience in semiconductor, manufacturing, or other high-reliability systems environments; ability to reason about reliability and failure modes.
Education Requirements
Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, or equivalent practical experience. Doctoral-level candidates are also considered. Expected years of experience vary by degree (for example: PhD with 0 years, Master's with about 3 years, Bachelor's with about 5 years).
About the Company
Company: KLA
Headquarters: Chennai, India
KLA is a global leader in diversified electronics for the semiconductor manufacturing industry. The company enables the production of electronic devices by inventing systems and solutions for manufacturing integrated circuits, wafers, and displays. With over 40 years of experience, KLA invests heavily in innovation and R&D to support advanced chip design and manufacturing process optimization, collaborating with top technology providers to deliver future electronic devices.

Date Posted: 2026-05-12