ZCube Network Architecture Reduces Costs and Improves Performance in Large Model Inference

The ZCube networking architecture targets network congestion in Prefill-Decode (PD) separated large model inference. Developed jointly by Zhipu, Yuxun Network, and Tsinghua University, it is now deployed in the GLM-5.1 production environment. In production benchmarks it cut switch and optical module costs by 33%, raised average GPU inference throughput by 15%, and lowered P99 first-token latency (TTFT) by 40.6%.

AIMPACT News, May 21 (UTC+8): According to monitoring by Beating, Zhipu, Yuxun Network, and Tsinghua University have jointly developed the ZCube networking architecture and deployed it in the thousand-GPU GLM-5.1 coding production environment. The work responds to the growing structural network congestion seen in PD (Prefill-Decode) separated deployments of large models.

As long-context and PD-separated inference become mainstream, cross-node KV Cache transfers make inference traffic severely asymmetric, which leaves traditional ROFT (Rail-Optimized Fat-Tree) architectures highly prone to local hotspots and link contention. ZCube removes the Spine-layer switches and adopts a fully flattened topology with a network diameter of 2 hops. Combined with a hybrid single-rail/multi-rail access mechanism, this balances end-to-end traffic load across all switches in the network at the architectural level.

In benchmark tests on production clusters, with GPUs, software stack, and applications unchanged, ZCube reduced switch and optical module hardware costs by 33% compared with the traditional architecture, increased average GPU inference throughput by 15%, and reduced P99 first-token latency (TTFT) by 40.6%. (Source: BlockBeats)
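To give a rough sense of why cross-node KV Cache transfers weigh so heavily on the network, the sketch below estimates the KV Cache size of a single long-context request and the time needed to move it from a prefill node to a decode node. It uses the standard size formula for multi-head or grouped-query attention. All model dimensions and the link speed are hypothetical values chosen for illustration; they are not GLM-5.1 parameters or ZCube measurements, and the report does not publish either.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Estimate the KV Cache size of one request, in bytes.

    Assumes standard multi-head / grouped-query attention: each layer
    stores one key and one value vector per KV head per token.
    bytes_per_elem=2 corresponds to FP16/BF16 storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


def transfer_seconds(num_bytes, link_gbps):
    """Idealised transfer time over one link, ignoring protocol overhead."""
    return num_bytes * 8 / (link_gbps * 1e9)


if __name__ == "__main__":
    # Hypothetical model shape and context length (illustrative only).
    size = kv_cache_bytes(num_layers=60, num_kv_heads=8, head_dim=128, seq_len=128_000)
    print(f"KV Cache per request: {size / 1e9:.2f} GB")
    # Hypothetical 400 Gb/s link between the prefill and decode nodes.
    print(f"Transfer time at 400 Gb/s: {transfer_seconds(size, 400):.3f} s")
```

Under these assumed numbers a single request moves tens of gigabytes in one direction, from prefill to decode, with nothing comparable flowing back. That one-way bulk flow is the kind of asymmetry the report says produces hotspots on rail-optimized fat-tree networks.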
