NVIDIA Announces Full Production of Vera Rubin, with Microsoft, Dell, and CoreWeave Among the First to Deploy

According to Beating, Jensen Huang confirmed at GTC Taipei 2026 that “Vera Rubin is in full production.” He thanked the Taiwan-based supply chain partners in attendance, noting that Vera Rubin’s supply chain is twice the scale of that of its predecessor, Grace Blackwell, and that assembly time per rack has fallen from two hours in the Grace Blackwell era to just five minutes.

Huang played a video on stage showing the entire Vera Rubin manufacturing process. The system starts from TSMC’s 3nm process and integrates seven new chips, totaling more than 6 trillion transistors at the system level, more than 18,000 components per board, and 1.3 million parts per unit (the third-generation MGX rack design). HBM4 memory is sourced from Micron, SK Hynix, and Samsung. The system uses a cable-free PCB midplane design, with all ConnectX-9 SuperNICs and BlueField-4 DPUs integrated directly onto the board for AI-factory-grade reliability. The liquid-cooled busbar can carry more than 5,000 amperes, equivalent to twenty electric vehicles accelerating at full throttle simultaneously.

He also confirmed that Microsoft, Dell, and CoreWeave have already deployed and are running Vera Rubin NVL72 engineering units. Millions of square feet of production capacity, already online to support Grace Blackwell deliveries, are now being scaled up in parallel for Vera Rubin, with volume shipments set to ramp fully in the second half of 2026.

Huang additionally showcased the Vera CPU rack, a single liquid-cooled rack housing 256 Vera CPUs for model orchestration and memory scheduling, and the new Groq 3 LPX low-latency inference rack, which holds 256 Groq 3 LPU chips with 40 PB/s of SRAM bandwidth. The two are complementary: the NVL72 is optimized for maximum token generation throughput, while the Groq 3 LPX is designed for minimum token generation latency.