AI model innovations are creating new opportunities in AI infrastructure.

Article by Alpha公社

Interest in AI network communication has risen sharply of late.

On one hand, AI network communications startups in Silicon Valley are frequently securing large rounds of funding; on the other hand, the stock prices of AI network communications companies, particularly those in optical communications, are also rising rapidly in the secondary market.

Why is interest in AI network communication rising? At its core, it is driven by demand: models are growing larger, token consumption is increasing, and computing power is becoming scarce. To extract more computing power at a lower cost from the hardware side, we must look to foundational technologies.

Accelerating communication between chips and between nodes is a path currently being validated to improve the efficiency of the entire computing infrastructure.

Upscale AI has recently raised significant funding. In September 2025, it secured a $100 million seed round led by Mayfield and Maverick Silicon, with participation from StepStone Group, Celesta Capital, Xora, Qualcomm Ventures, Cota Capital, MVP Ventures, and Stanford University.

In January 2026, it secured another $200 million in Series A funding, led by Tiger Global, Premji Invest, and Xora Innovation, with participation from Maverick Silicon, StepStone Group, Mayfield, Prosperity7 Ventures, Intel Capital, and Qualcomm Ventures.

Recently, there were further reports that it is negotiating a new funding round of $180 million to $200 million.

Large parameters, MoE, long context: model innovation drives innovation in AI computing networks

Why has a company less than a year old secured multiple large rounds of funding? This is closely tied to its founding team. Upscale AI was spun out of Auradine, itself a rising AI infrastructure company. Auradine has since been renamed Velaura AI and focuses on delivering proven, breakthrough ultra-low-power computing solutions for cloud, edge, and physical AI applications.

Barun Kar and Rajiv K, courtesy of Upscale AI

Barun Kar, co-founder and CEO of Upscale AI, previously served as COO of Auradine, while Rajiv K, co-founder and Executive Chairman, was formerly CEO of Auradine and is now also CEO of Velaura AI. Puneet Agarwal, CTO of Upscale AI, spent ten years at Broadcom and previously served as CTO of the data center division at Marvell.

Before their previous startup, Barun Kar and Rajiv K both spent years at large corporations, so this is a team with deep industry experience.

Why is AI network communication important? It starts with the underlying technology.

AI computational workloads are highly synchronized. Modern workloads such as large-scale model training, MoE architectures, and distributed inference impose significant synchronization pressure on networks.

During training, the model's parameter gradients must be transmitted across thousands of GPUs in highly synchronized waves; inference computations generate massive fan-out traffic while imposing extremely stringent latency requirements.
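To give a rough sense of the training traffic described above, here is a minimal sketch using ring all-reduce, one common way to sum gradients across GPUs. The article does not say which collective algorithm is used, so the algorithm choice, the function names, and all numbers below are assumptions for illustration. The model is bandwidth-only and ignores per-hop latency.

```python
def ring_allreduce_bytes_per_gpu(payload_bytes, num_gpus):
    """Bytes each GPU sends in a ring all-reduce of `payload_bytes`.

    The ring runs a reduce-scatter followed by an all-gather; each phase
    moves (N - 1) / N of the payload, hence the factor of 2.
    """
    return 2 * (num_gpus - 1) * payload_bytes / num_gpus


def ring_allreduce_seconds(payload_bytes, num_gpus, link_bytes_per_s):
    """Bandwidth-only estimate of all-reduce time (latency terms ignored)."""
    return ring_allreduce_bytes_per_gpu(payload_bytes, num_gpus) / link_bytes_per_s


# Hypothetical workload: 2 GB of gradients, 1,024 GPUs, 50 GB/s per link.
GRADIENT_BYTES = 2 * 10**9
NUM_GPUS = 1024
LINK_BYTES_PER_S = 50 * 10**9

sent = ring_allreduce_bytes_per_gpu(GRADIENT_BYTES, NUM_GPUS)
seconds = ring_allreduce_seconds(GRADIENT_BYTES, NUM_GPUS, LINK_BYTES_PER_S)
print(f"each GPU sends about {sent / 1e9:.2f} GB per all-reduce")
print(f"bandwidth-only time: about {seconds * 1e3:.0f} ms")
```

Under this model, each GPU sends close to twice the gradient payload on every step regardless of cluster size, which is why link bandwidth bounds the step time directly.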

When the network lags, the GPU halts and waits, causing latency to rise continuously and leading to a collapse in the efficiency of the computing cluster.
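The waiting effect can be illustrated with a toy model of one synchronous step. This is a sketch under simple assumptions rather than a description of any real cluster: every GPU computes for the same fixed time, and the collective finishes only when the slowest participant finishes. The function name and all timings are hypothetical.

```python
def step_utilization(compute_ms, comm_ms_per_gpu):
    """Fraction of a synchronous step that GPUs spend computing.

    Every GPU waits for the slowest participant in the collective, so the
    step's communication time is the maximum delay, not the average.
    """
    step_ms = compute_ms + max(comm_ms_per_gpu)
    return compute_ms / step_ms


# Hypothetical numbers: 8 GPUs, 10 ms of compute per step.
uniform = [2.0] * 8                  # every link takes 2 ms
one_straggler = [2.0] * 7 + [10.0]   # a single congested link takes 10 ms

print(f"uniform network: {step_utilization(10.0, uniform):.0%}")
print(f"one slow link:   {step_utilization(10.0, one_straggler):.0%}")
```

In this toy setup a single slow link out of eight is enough to drop utilization from about 83% to 50% for every GPU in the group, which is why tail latency matters more than average latency for synchronized workloads.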

This is an architectural mismatch, not something that can be resolved through optimization.

Traditional networks, designed for generality, now face challenges in AI scenarios where the complexity introduced by supporting diverse workloads has become a hindrance. Deterministic communication and the strong synchronization required by GPU collective communications are surpassing the design limits of traditional networks.

The network required for an AI computing cluster must support deterministic, synchronized, and high-throughput communication at scale.

AI networks must be rebuilt from the ground up, designed around the real needs of Scale-Up and Scale-Out connectivity.

Broken down further, the pressure originates with the models themselves.

Two characteristics of current models are placing significant pressure on AI compute cluster networks: the exponential growth in model parameter scale, and the ongoing advancement in long context lengths and chain-of-thought reasoning.

Take the newly released DeepSeek V4 Pro as an example: its parameter count has reached 1.6T, with a context length of 1M tokens. Even at one byte per parameter, 1.6T parameters require about 1.6 TB of memory for the weights alone, which far exceeds the capacity of a single accelerator card. The model must be distributed across numerous accelerators, and inter-chip communication quickly becomes the bottleneck.
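The memory arithmetic can be sketched as follows. Only the 1.6T parameter count comes from the article. The weight precisions and the 192 GB accelerator are assumptions chosen for illustration, and the estimate covers weights only, leaving out the KV cache, activations, and any optimizer state.

```python
def weights_footprint(num_params, bytes_per_param, hbm_bytes_per_accelerator):
    """Return (weight bytes, minimum accelerators needed to hold the weights)."""
    weight_bytes = num_params * bytes_per_param
    # Integer ceiling division: round up to whole accelerators.
    min_accelerators = -(-weight_bytes // hbm_bytes_per_accelerator)
    return weight_bytes, min_accelerators


PARAMS = 1_600_000_000_000   # 1.6T parameters, as stated in the article
HBM_BYTES = 192 * 10**9      # assumption: a hypothetical 192 GB accelerator

for label, bytes_per_param in [("8-bit weights", 1), ("16-bit weights", 2)]:
    weight_bytes, count = weights_footprint(PARAMS, bytes_per_param, HBM_BYTES)
    print(f"{label}: {weight_bytes / 1e12:.1f} TB, at least {count} accelerators")
```

Even this lower bound puts the model on roughly 9 to 17 such accelerators depending on precision, so every forward pass has to cross chip boundaries.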

The extremely long context window causes the KV cache to expand dramatically, exceeding the HBM memory capacity of a single GPU. This creates dual pressure on memory capacity and communication bandwidth.
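A back-of-the-envelope estimate shows how quickly the KV cache grows with context length. The formula below is the generic one for standard multi-head or grouped-query attention. The model shape is hypothetical and is not DeepSeek's actual architecture, and designs that compress the cache would need less memory.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """KV cache size for one sequence under standard attention.

    The leading 2 accounts for storing both a key and a value vector
    per token, per KV head, per layer.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model shape: 64 layers, 8 KV heads, head dimension 128,
# 16-bit cache entries (2 bytes each).
per_token = kv_cache_bytes(64, 8, 128, seq_len=1)
full_context = kv_cache_bytes(64, 8, 128, seq_len=1_000_000)

print(f"per token: {per_token / 1024:.0f} KiB")
print(f"1M-token context: {full_context / 1e9:.1f} GB")
```

With these assumed dimensions, a single 1M-token sequence needs about 262 GB of cache, which is more than the hypothetical accelerator can hold even before the weights are loaded.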

Not just chip-level innovation, but a full-stack transformation

To train large-parameter, long-context models and serve them smoothly at inference time, the real solution is to redefine the "computational boundary": connect more GPUs over ultra-high-speed networks with sub-microsecond latency and high-throughput collective communication, so that they can be treated as a single "super GPU." This is what gave rise to the rack-scale form factor.

Taking NVIDIA's NVL72 as an example, it no longer treats the 72 GPUs as separate devices, but instead operates them as a single coherent machine with memory semantics, featuring internal NVLink bandwidth of up to 130 TB/s.
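The 130 TB/s figure is consistent with a simple aggregate of per-GPU bandwidth. The per-GPU value of 1.8 TB/s used below is NVIDIA's published NVLink figure for this generation. It does not appear in the article and is brought in here as an outside number.

```python
GPUS_PER_RACK = 72
NVLINK_TB_PER_S_PER_GPU = 1.8   # outside figure, not stated in the article

# Aggregate bandwidth is simply the per-GPU bandwidth summed over the rack.
aggregate_tb_per_s = GPUS_PER_RACK * NVLINK_TB_PER_S_PER_GPU
print(f"aggregate NVLink bandwidth: about {aggregate_tb_per_s:.1f} TB/s")
```

The product is 129.6 TB/s, which rounds to the quoted 130 TB/s.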

Here, two connection layers of AI infrastructure are introduced: rack-level GPU interconnection (Scale-Up) and cluster-level network topology interconnection (Scale-Out).

These two levels must work together to enable thousands of GPUs to operate efficiently as a unified distributed computing engine.

For the two connection layers of AI infrastructure, Upscale AI has developed a network architecture tailored for AI. For rack-level AI interconnect (Scale-Up), it features the SkyHammer chip architecture; for cluster-level AI network structure (Scale-Out), it offers Open Ethernet.

SkyHammer is a chip architecture designed to break through the scalability bottlenecks of AI networks. It is built on open standards to achieve deterministic latency, extreme bandwidth, and predictable performance at hyperscale, enabling GPUs and XPUs to operate as a highly synchronized computing engine.

One of its defining characteristics is deterministic latency: the time data takes to travel between components within a rack is tightly bounded and highly predictable.

Image source: Upscale AI

SkyHammer is built from the ASIC layer up, with holistic co-design across the chip, system, and rack levels to ensure seamless coordination at every layer. Every component has been redesigned: from how data flows within the chip, to how the fabric adapts dynamically under load, to how the supercluster maintains predictability even under extreme synchronization pressure.

It supports emerging standards such as ESUN, UEC, and UALink, while leaving room for future innovations yet to emerge. With its flexible architecture, SkyHammer can seamlessly adapt to new standard definitions without reengineering or compromising performance, enabling interoperability in an open and diverse environment.

Products based on the SkyHammer architecture are scheduled for release in 2026.

Open Ethernet is primarily designed for scale-out AI network architectures. At the cluster level, AI systems require openness, interoperability, and massive bandwidth.

Upscale AI has developed an AI-optimized Open Ethernet network architecture built on NVIDIA Spectrum-X Ethernet switching chips and the SONiC network operating system, with end-to-end support.

By integrating ASIC-native telemetry capabilities, deterministic lossless Ethernet behavior, and industry-standardized network workflows, the system delivers predictable performance, simplified operations, and high reliability at scale.

In short, it connects thousands of GPUs into a unified high-performance network to support distributed training and large-scale inference.

For this project, Upscale AI has joined the NVIDIA Partner Network and is closely collaborating with NVIDIA and its ecosystem partners to align on reference architectures and proven designs, accelerating the deployment of large-scale AI data center networks.

As you can see, Upscale AI’s efforts go beyond simply developing a faster network chip—they achieve tight integration between the chip, system, and software. To run large AI computing clusters, it’s essential to continuously monitor congestion, synchronization behavior, and GPU utilization across the entire network.

This includes: high-performance RDMA networking, adaptive congestion management, GPU-oriented telemetry and observability, and real-time operational visibility across the entire network infrastructure. Upscale AI optimizes all these aspects to build the deterministic network foundation essential for modern AI computing clusters.

The mismatch between model requirements and AI computing infrastructure has created multiple entrepreneurial opportunities.

AI computing infrastructure still has tremendous potential for development. In fact, it may long remain in a state of alternating innovation with AI software, particularly models. When innovations occur in model architecture, leading to structural mismatches in AI computing infrastructure’s hardware or software, new opportunities emerge.

This is the current situation: the MoE architecture, ultra-large parameters, extremely long context windows, and agents' insatiable demand for tokens have collectively created a supply-demand imbalance in AI computing power, while also opening up opportunities for innovation in AI computing infrastructure.

At the level of compute chips, over the past six months we have closely followed Unconventional AI (raised $475 million) and MatX (raised $500 million); in the field of AI-enabled chip design, we have monitored Recursive (raised $300 million) and Cognichip (raised $60 million); naturally, we have also tracked AI data center network interconnectivity, such as Upscale AI (already raised $300 million and planning another $200 million), Eridu (raised $200 million), and Ethernovia (raised $90 million).

China's open-source AI models have achieved global leadership, the recently released DeepSeek V4 in particular. China still lags behind in AI infrastructure, but this also represents significant room for innovation. In China's venture capital market, a large number of innovative companies have begun to emerge, some of which have already achieved initial success.