BlockBeats news, June 5: Overnight, the prominent investment research firm SemiAnalysis released a report indicating that NVIDIA's next-generation AI server cluster, Rubin NVL72, will ship with a significantly reduced memory configuration, with capacity per rack cut from the originally planned 55 TB to 28 TB. In addition, most Rubin systems will use 96 GB SOCAMM modules instead of the originally planned 192 GB modules. The report triggered sharp moves in memory stocks: Micron closed down 7.7%, and SK Hynix fell 8.32% at the open.
Commentators are generally cautiously optimistic and suggest that the market may have overreacted.
US key opinion leader (KOL) Herman Jin stated that the fundamental reason for the memory configuration cuts is insufficient supply, not reduced demand. He added that NVIDIA's switch solutions may face similar negative news, and reiterated that demand at the model level is the key indicator of whether the AI boom has ended.
Another view holds that the market should next focus on whether the memory reductions are merely transitional. In addition, as CPU-side system memory, which is used to hold large contexts (KV cache), shrinks, the bottleneck for GPU compute will shift toward SSDs and interconnects. Cloud service providers (CSPs) will need to procure more high-performance SSDs or adopt higher-performance in-rack connectivity, benefiting NAND companies such as KIOXIA and SanDisk, as well as optical connectivity companies such as Lumentum (LITE), Marvell (MRVL), and Corning.
