Senior Performance Architect

Quadric
June 23, 2026
Full-time
Remote friendly (Burlingame, California, United States)
Worldwide
$110,000 - $270,000 USD yearly
SoC Architecture Jobs, Level - Senior

Job Title

Senior Performance Architect

Role Summary

Lead performance analysis and optimization across Quadric's hardware/software stack, identifying bottlenecks from high-level C++/Python down to generated assembly and hardware execution. Prototype fixes and coordinate with compiler and hardware teams to validate and drive product improvements.

This is a hybrid role based in the Burlingame, CA office with a regular onsite requirement (minimum 2–3 days per week); candidates must be able to commute to the office.

Experience Level

Senior: 5+ years of performance analysis experience required.

Responsibilities

Primary responsibilities include hands-on analysis, prototyping, and cross-team collaboration:

  • Analyze application performance across the full stack: C++/Python source, compiler output, assembly, and hardware execution.
  • Identify and localize performance bottlenecks to code regions, assembly sequences, or architectural limitations.
  • Implement proof-of-concept fixes and optimizations to validate solutions prior to product handoff.
  • Develop and maintain profiling infrastructure, benchmarks, and performance regression tests.
  • Collaborate with compiler engineers to improve code generation and optimization passes.
  • Work with hardware architects to identify microarchitectural improvements and validate performance models.
  • Create performance models that predict workload behavior and guide optimization priorities.
  • Document findings and communicate performance insights to both technical and non-technical stakeholders.
  • Support customer engagements by analyzing customer workloads and recommending optimizations.

Requirements

Must-have technical skills and experience:

  • 5+ years of performance analysis experience.
  • Strong proficiency in C++ and Python; ability to read, reason about, and write optimized code at the assembly level.
  • Hands-on mentality: comfortable implementing prototypes, modifying compiler passes, or building proof-of-concept implementations.
  • Deep understanding of computer architecture: pipelines, caches, memory hierarchies, SIMD/vector execution.
  • Experience with profiling tools (perf, VTune, custom trace analysis) and performance debugging methodologies.
  • Ability to trace performance issues from application behavior down to microarchitectural root causes.
  • Strong analytical and problem-solving skills and the ability to explain complex issues clearly to diverse audiences.
  • Experience working cross-functionally with compiler, runtime, and hardware teams.
  • Able to commute to Burlingame, CA and work onsite a minimum of 2–3 days per week.

Nice-to-have:

  • Experience with ML/AI workloads and frameworks (PyTorch, TensorFlow, ONNX).
  • Background in compiler development or code generation.
  • Experience with GPU, DSP, or custom accelerator architectures.
  • Familiarity with cycle-accurate simulation and performance modeling tools.

Education Requirements

Bachelor's or Master's degree in Computer Science, Computer Engineering, or Electrical Engineering, plus 5+ years of relevant performance analysis experience. The posting lists no certification requirements and no "equivalent experience" alternative to the degree.


About the Company

Company: Quadric

Headquarters: Burlingame, California, United States

Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Founded in 2016, the company empowers developers across industries with innovative general-purpose neural processing unit (GPNPU) architecture for neural network workloads. Co-founded by technologists from MIT and Carnegie Mellon, Quadric aims to enable groundbreaking technology development.

Date Posted: 2026-06-22