SGLang and AMD collaborate to optimize DeepSeek-R1 inference on the MI355X GPU.

KuCoinFlash

Release Time: 05/28/2026 21:23:15

Summary

SGLang and AMD have optimized DeepSeek-R1 inference on the MI355X GPU, reaching a total cost of $0.169 per million tokens at an interactivity level of 129 tokens/s/user. That is 5% lower than NVIDIA B200 (Dynamo TRT-LLM) and 40% lower than B200 (SGLang). On 24 MI355X GPUs, throughput reached 2,436 tokens/s/GPU, 1.25x higher per GPU than B200 SGLang on 48 GPUs. Key improvements include MoRI mixed FP4/FP8 quantization for all-to-all communication, the MoRI-IO KV Cache backend, batch overlap with SDMA, Specv2 MTP on ROCm, and CPU streaming optimizations.

ME AI message: Through a series of full-stack optimizations, SGLang and the AMD team have achieved a highly competitive total cost of ownership for DeepSeek-R1 inference on AMD Instinct™ MI355X GPUs. At an interactivity level of 129 tok/s/user, the cost is $0.169 per million tokens, 5% lower than the NVIDIA B200 (Dynamo TRT-LLM) solution and 40% lower than the B200 (SGLang) solution. In terms of throughput, 24 MI355X GPUs achieve 2,436 tok/s/GPU, 1.25 times higher per GPU than the B200 SGLang solution running on 48 GPUs. Key optimizations include MoRI mixed FP4/FP8 quantization for all-to-all communication, the MoRI-IO KV Cache backend, batch overlapping with SDMA, Specv2 MTP on ROCm, and CPU streaming optimizations. (Source: AiHot)
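The report gives only the MI355X figures and the relative gaps, so the competitor baselines have to be backed out. The short Python sketch below does that arithmetic under two assumptions that the source does not state explicitly: that "X% lower" is measured relative to the competitor's cost, and that "1.25 times higher" means a per-GPU throughput ratio of 1.25. The helper names are illustrative and the derived numbers are implied values, not reported measurements.

```python
def baseline_from_saving(cost: float, saving_fraction: float) -> float:
    """Return the baseline cost implied when `cost` is `saving_fraction` lower than it.

    Assumes cost = baseline * (1 - saving_fraction).
    """
    return cost / (1.0 - saving_fraction)


def aggregate_throughput(per_gpu_tok_s: float, num_gpus: int) -> float:
    """Total tokens per second across all GPUs in a deployment."""
    return per_gpu_tok_s * num_gpus


# Figures quoted in the report.
MI355X_COST_PER_M_TOKENS = 0.169   # USD per million tokens at 129 tok/s/user
MI355X_TOK_S_PER_GPU = 2436        # tokens/s/GPU on 24 GPUs
MI355X_NUM_GPUS = 24

# Implied competitor costs (assumption: savings are relative to the competitor).
b200_dynamo_cost = baseline_from_saving(MI355X_COST_PER_M_TOKENS, 0.05)
b200_sglang_cost = baseline_from_saving(MI355X_COST_PER_M_TOKENS, 0.40)

# Implied throughput figures (assumption: 1.25 is a plain per-GPU ratio).
mi355x_total_tok_s = aggregate_throughput(MI355X_TOK_S_PER_GPU, MI355X_NUM_GPUS)
b200_sglang_tok_s_per_gpu = MI355X_TOK_S_PER_GPU / 1.25

print(f"Implied B200 (Dynamo TRT-LLM) cost: ${b200_dynamo_cost:.3f} per million tokens")
print(f"Implied B200 (SGLang) cost:         ${b200_sglang_cost:.3f} per million tokens")
print(f"MI355X aggregate throughput:        {mi355x_total_tok_s:,.0f} tok/s")
print(f"Implied B200 (SGLang) throughput:   {b200_sglang_tok_s_per_gpu:,.1f} tok/s/GPU")
```

Under these assumptions the B200 baselines come out at roughly $0.178 and $0.282 per million tokens. Note that the cost and throughput figures may refer to different operating points, so the sketch does not combine them into a single price model.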

