For the past 60 years, the semiconductor industry has advanced by shrinking transistors (Moore’s Law), making chips denser and cheaper with each generation.
But now this path is no longer viable:
- Yields for processes below 7nm have plummeted.
- Photolithography machines are extremely expensive.
- The design cost for a single chip using advanced processes exceeds $1 billion.
- The cost per transistor has increased rather than decreased.
Huawei's semiconductor team has validated a new direction through six years and 381 mass-produced chips:
Stop competing on size—start competing on time.
The team proposes the τ scaling theory:
Treat time as the core optimization metric and compress the characteristic time τ across the entire chain, from transistor switching (picoseconds) to data center tasks (seconds), a span of 12 orders of magnitude.
Simply put:
Previously, it was about who could be smaller; now, it’s about who can be faster, with lower latency and higher efficiency.
I. What exactly is τ scaling?
τ is the characteristic delay (time constant) of each layer of the stack. There are four layers:
- Transistor: switching speed
- Circuit: signal transmission delay
- Chip: computation and memory access latency
- System: end-to-end communication and time synchronization
The goal is to optimize τ jointly across the full stack (process, circuit, architecture, and system), with every layer measured against the same metric rather than tuned in isolation.
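As a rough illustration of the range involved, the sketch below assigns one characteristic time to each of the four layers and computes the span between the fastest and slowest. Only the two endpoints (picoseconds for transistor switching, seconds for data center tasks) come from the text; the intermediate values and all names are assumptions chosen for illustration.

```python
import math

# Illustrative characteristic times per layer, in seconds.
# Only the first and last values come from the text; the two
# intermediate values are assumed placeholders.
TAU_SECONDS = {
    "transistor": 1e-12,  # switching, picoseconds (from the text)
    "circuit": 1e-10,     # signal transmission delay (assumed)
    "chip": 1e-7,         # compute and memory access latency (assumed)
    "system": 1.0,        # data center task, seconds (from the text)
}


def orders_of_magnitude(fast: float, slow: float) -> float:
    """Return how many powers of ten separate two time scales."""
    return math.log10(slow / fast)


span = orders_of_magnitude(TAU_SECONDS["transistor"], TAU_SECONDS["system"])
print(f"picoseconds to seconds spans about {span:.0f} orders of magnitude")
```

Keeping every layer in the same unit (seconds) is what lets a single metric be compared across the stack.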
II. Mobile Implementation: LogicFolding
LogicFolding stacks chips vertically without moving to a newer manufacturing process and uses ultra-precise hybrid bonding to distribute critical paths across multiple layers, in effect adding “floors” to the chip.
- Transistor density: up 55% per generation, from 155 million to 238 million transistors per square millimeter
- Energy efficiency: up 41%, with clock speed up nearly 13%
- SRAM frequency: up more than 40%
- Kirin reaches 3.1 GHz in 2026, with a target of 4 GHz by 2029.
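The density and clock figures above can be checked with simple arithmetic. The sketch below is only a sanity check under the assumption that the clock grows at a steady compound rate between 2026 and 2029; the function names are hypothetical.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, in percent, over a number of years."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Density in millions of transistors per square millimeter (from the text).
density_gain = percent_change(155, 238)

# Clock in GHz; steady compound growth is an assumption, not a source claim.
clock_cagr = annual_growth(3.1, 4.0, 2029 - 2026)

print(f"density gain: {density_gain:.1f}%")
print(f"implied clock growth: {clock_cagr:.1f}% per year")
```

The raw density numbers work out to about 53.5%, slightly below the quoted 55%; the source does not explain the difference. Reaching 4 GHz from 3.1 GHz in three years implies roughly 8.9% growth per year under the steady-growth assumption.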
III. AI Data Center Deployment: End-to-End Latency Optimization
80% of the energy consumption and 70% of the cost in AI clusters come from data movement; the core focus is reducing communication time.
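If data movement accounts for 80% of cluster energy, the total saving from cheaper data movement is bounded in the same way Amdahl's law bounds speedup. The sketch below is an illustration only: the function name and the reduction factors are assumptions, not figures from the source.

```python
def remaining_energy_fraction(movement_share: float, reduction: float) -> float:
    """Fraction of the original energy left after data-movement energy
    is cut by a factor of `reduction` (an Amdahl-style bound)."""
    if not 0.0 <= movement_share <= 1.0:
        raise ValueError("movement_share must be between 0 and 1")
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    return (1.0 - movement_share) + movement_share / reduction


MOVEMENT_SHARE = 0.80  # share of energy spent on data movement (from the text)

for factor in (2, 10, 100):  # hypothetical reduction factors
    left = remaining_energy_fraction(MOVEMENT_SHARE, factor)
    print(f"{factor:>3}x cheaper data movement -> {left:.1%} of original energy")
```

Even an unlimited reduction leaves the 20% that is not data movement, so under this simple model total energy can fall by at most a factor of five.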
1. Unified Bus
Eliminating multiple protocol layers reduces remote access latency from tens of microseconds to approximately 100 nanoseconds, about 500 times faster.
2. Hi-ONE Optical Interconnection
A single module delivers 8 Tb/s. Moving from copper to fiber extends reach from 1 meter to 100 meters, enough to support clusters of ten thousand GPUs.
3. 3D Folding
3D Folding addresses a limitation of 2.5D packaging, where area grows rapidly but interfaces lag, by moving memory, power delivery, and optical ports into the vertical dimension so that they scale in step with compute capacity.
- Prediction: AI hardware integration will increase by more than 100 times by 2035.
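The Unified Bus latency figure in item 1 above can be checked the same way. "Tens of microseconds" is a range rather than a single value, so the starting latencies in this sketch are assumed sample points; only the 100 ns target comes from the text.

```python
AFTER_NS = 100  # approximate remote access latency with Unified Bus (from the text)


def speedup(before_ns: int, after_ns: int) -> float:
    """Ratio of the old latency to the new latency."""
    if after_ns <= 0:
        raise ValueError("after_ns must be positive")
    return before_ns / after_ns


# Assumed sample points within "tens of microseconds".
for before_us in (10, 50, 90):
    before_ns = before_us * 1_000  # microseconds to nanoseconds
    ratio = speedup(before_ns, AFTER_NS)
    print(f"{before_us} us -> {AFTER_NS} ns: {ratio:.0f}x faster")
```

The quoted 500x corresponds to a starting latency of about 50 microseconds; other starting points in the "tens of microseconds" range give between 100x and 900x.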
IV. Logic and Memory Re-fusion
In the early days, CPUs and memory developed separately. In the AI era, data movement matters more than computation, which calls for tight 3D integration of memory and logic and shifts industry influence toward memory and packaging.
V. Remaining Challenges
- EDA tools must support 3D-stacked designs.
- Wafer-to-wafer process variation and vertical interconnect losses must be reduced.
- New energy-efficiency metrics and benchmark standards are needed.
Conclusion
The era of Moore's Law scaling has ended; the era of time scaling has begun.
You don’t need to obsess over the most advanced lithography machines—performance and efficiency can still be continuously improved through 3D stacking, system architecture, and interconnect optimization.
This will be the core roadmap for semiconductors over the next decade.
