Job Title
Lead RTL Design Engineer
Role Summary
Lead the microarchitecture definition and RTL implementation for a power-efficient processor/accelerator SoC. The role covers dataflow execution fabric, memory subsystem, on-chip interconnect/NoC, low-power RTL, and peripheral IP integration from architecture spec through synthesis-ready RTL.
Work cross-functionally with architects, microarchitects, DV, physical design, and firmware teams to enable tape-out and silicon bring-up.
Experience Level
Senior: requires substantial hands-on RTL experience; posting specifies 8+ years of RTL design experience and tape-out ownership.
Responsibilities
Own microarchitecture and RTL implementation for assigned blocks and drive system-level integration.
- Define processor and compute-unit microarchitecture: dataflow pipelines, execution units, interfaces, and target PPA.
- Design on-chip interconnects and drive data movement decisions with physical design constraints in mind.
- Specify memory subsystem interfaces, data movement, ordering, and synchronization semantics.
- Architect configuration, scheduling, and execution model for multi-kernel workloads and host interaction.
- Drive low-power strategy: power domains, clocking, gating, UPF-driven flows, and retention strategies.
- Collaborate with compiler/software teams to ensure efficient HW/SW mapping of workloads.
- Author microarchitecture specifications and lead design reviews across stakeholders.
- Mentor RTL engineers, enforce coding/lint standards, and review RTL for microarchitecture risks.
- Participate in PPA analysis: synthesize blocks, review area/timing/power reports, and iterate tradeoffs.
- Coordinate with DV on verification plans and provide directed tests for corner cases and power transitions.
- Support silicon bring-up: DFT/ATPG guidance and RTL-level debug during lab validation.
Requirements
Must-have technical skills and experience.
- 8+ years of RTL design experience with tape-out ownership on processor or accelerator SoC elements (dataflow engines, NoC, memory subsystems, or peripheral integration).
- Deep proficiency in SystemVerilog for synthesis-clean, lint-clean, timing-aware RTL; able to design complex state machines, arbiters, token controllers, and datapaths.
- Solid understanding of parallel execution models (dataflow, SIMD, systolic arrays) and producer-consumer synchronization.
- Hands-on experience with on-chip memory design: SRAM wrappers, scratchpad/TCM, banking, and memory-mapped registers.
- Experience with low-power RTL techniques: UPF flows, clock gating, power domains, retention registers, AON logic.
- Familiarity with standard on-chip bus protocols at the RTL level (AXI, AHB, APB, TileLink, or NoC equivalents).
- Experience taking RTL through synthesis and timing closure; able to read and act on SDC, STA reports, and synthesis QoR summaries.
- Strong written communication skills; able to produce uArch specs and design review materials independently.
- Experience with memory compiler toolchains.
Nice-to-have (preferred):
- Prior ownership of dataflow engines, NPUs, or streaming DSPs with producer-consumer token management.
- Experience co-designing with compiler/graph-optimization teams and familiarity with ONNX/TFLite mapping.
- Familiarity with NVM controller RTL, IoT-class power budgets, formal verification of flow-control logic, and functional safety standards.
- Tape-out credits on edge-AI/IoT/wearable SoCs at advanced nodes (≤12nm).
Education Requirements
Not specified.
Compensation & Benefits
Competitive salary range stated: $160,000 to $250,000 (final offer based on experience and location). Company also offers equity and standard benefits such as 401(k) match, company-paid benefits, and paid parental leave.
About the Company
Company: Lumiere Systems
Engineering services firm specializing in semiconductor and ASIC design and verification for ARM-based SoCs. Engages in the full verification lifecycle (UVM/SystemVerilog, formal, gate-level simulation) and collaborates with global verification teams and client partners.

Date Posted: 2026-05-24