RadixArk Secures $100M Seed Round to Build Next-Generation Open AI Infrastructure

While everyone is focused on fighting over the model layer, a team holding the open-source inference standard, backed by Silicon Valley’s most prestigious seed-round investors, has officially turned its focus toward the new era of AI infrastructure.

Article author and source: MachineHeart

On May 5, AI infrastructure startup RadixArk announced the completion of a $100 million seed round, valuing the company at $400 million post-money. In terms of amount, valuation, and investor lineup, this is the largest early-stage investment in the AI Infra sector so far in 2026.

This round was led by Accel, with Spark Capital as co-lead. Institutional investors include NVIDIA’s NVentures, AMD, MediaTek, and Databricks, along with leading firms such as Salience Capital, HOF Capital, Walden Catalyst, A&E Investment, LDVP, and WTT Fubon Family. Nearly all key players across the core hardware and systems layer, from GPUs and CPUs to edge chips and data platforms, are now on board.

Beyond the top institutional investors, several global technology leaders with backgrounds in Intel, Broadcom, OpenAI, xAI, and PyTorch also participated in this round as angel investors.

Bringing together the CEOs of three hardware giants, the founder of a top-tier model lab, and the creator of PyTorch in a single seed round is extremely rare in the history of AI infrastructure. Investors familiar with the space have put it bluntly: this is a bet on the next-generation de facto infrastructure standard.

The world's best inference engine, in their hands

The story of RadixArk must begin with an open-source project called SGLang.

Since its emergence in 2023, SGLang has rapidly become one of the de facto standards for open-source large model inference, accumulating over 27K stars on GitHub and being deployed on more than 400K GPUs. Trillions of tokens of production traffic run on SGLang daily, serving users including Google, Microsoft, NVIDIA, Oracle, AMD, LinkedIn, xAI, and Thinking Machines Lab.

Over the past two years, model architectures have undergone dramatic shifts, including MoE, long context, reasoning models, and multimodal fusion. Through every architectural transformation, SGLang maintained Day-0 compatibility, supporting new open-source models the day they were released with performance approaching the physical limits of the hardware. Investors frequently highlight that SGLang's blend of rapid iteration and engineering discipline is unmatched among open-source projects.

Behind this discipline is a founding team with deep expertise in systems and algorithms.

CEO Ying Sheng earned her bachelor’s degree from the ACM class at Shanghai Jiao Tong University and her Ph.D. from Stanford University. She is the founder of LMSYS Org and one of the primary creators of SGLang. During her doctoral studies, she conducted research as a visiting scholar at UC Berkeley’s Sky Lab and has previously worked at Databricks and xAI, where she served as head of the inference team. Her work in attention sparsification and KV cache reuse has garnered significant attention in the industry, with SGLang’s early RadixAttention mechanism being one of her notable contributions.

CTO Banghua Zhu earned his bachelor’s degree from the Department of Electronic Engineering at Tsinghua University and his Ph.D. from UC Berkeley, where he studied under machine learning pioneers Michael I. Jordan and Jiantao Jiao. During his doctoral studies, he co-founded Nexusflow, which was later acquired by NVIDIA, where he served as a Principal Research Scientist. His experience includes end-to-end development of industrial-grade training systems, as well as long-term contributions to NVIDIA’s foundational system optimizations and large-scale training infrastructure.

A technical lead at a core hardware vendor described this as the most valuable founding team in AI infrastructure entrepreneurship in 2026: on one side, a research-oriented entrepreneur who holds the de facto standard for open-source inference; on the other, a large-model algorithms expert from the core research team of a GPU manufacturer.

Holding the SGLang inference engine, which processes trillions of tokens daily, is already a dream starting point for AI infrastructure entrepreneurship—and this team has more than just this one card up their sleeve.

Taming DeepSeek-V4 reinforcement learning on Day 0

In addition to the inference engine, RadixArk has also made breakthroughs on the training side.

In November 2025, the team open-sourced Miles, a reinforcement learning framework designed to improve the stability and efficiency of large-scale RL training; it is now used by more than 20 teams for RL training of MoE models.

Between 2025 and 2026, competition in reasoning, tool use, and agentic capabilities has escalated across the board, and every step forward requires a system capable of withstanding massive-scale distributed RL. Industry observers point to a persistent, repeatedly mentioned but unsolved pain point: what today's large model teams find most agonizing is not any single-point optimization but the friction at the boundaries of the pipeline. Each stage, from training to RL to deployment inference, looks nearly optimal in isolation, yet the combined pipeline leaks efficiency at every seam.

The combination of Miles and SGLang aims to bridge the efficiency gap that current large model teams face across the complete pipeline of training, RL, and inference.

Day-0 support for a new model is a direct reflection of an infra team's engineering strength.

On April 25, DeepSeek-V4, with its complex new architecture, was released. That same day, SGLang and Miles shipped support for DeepSeek-V4 inference and RL training. This was made possible by system-level optimizations, including ShadowRadix prefix caching designed for hybrid attention, Flash Compressor, which performs on-chip compression in a single pass, and Lightning TopK, which reduces Top-K latency to 15 microseconds, while also enabling a complete RL pipeline from FP8 inference to BF16 training.
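For intuition, here is a toy version of the Top-K selection step that samplers perform over a model's output logits. This is purely conceptual: Lightning TopK, as described above, is a GPU kernel, and nothing in this Python sketch reflects its actual implementation or its 15-microsecond latency.

```python
# Toy Top-K over a logits vector: return the k (index, value) pairs with the
# largest values. Real inference engines do this on-GPU over vocabularies of
# 100K+ entries; this sketch only shows what the operation computes.
import heapq

def top_k(logits, k):
    """Return the k (index, value) pairs with the largest logit values."""
    return heapq.nlargest(k, enumerate(logits), key=lambda p: p[1])

print(top_k([0.1, 2.5, -1.0, 3.3, 0.7], 2))  # -> [(3, 3.3), (1, 2.5)]
```

`heapq.nlargest` runs in O(n log k), which is why heap-based selection is the standard CPU idiom; GPU kernels use very different parallel strategies to hit microsecond latencies.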

Full-stack endorsement: why are the giants all entering, and what are they anxious about?

NVIDIA, AMD, MediaTek, Broadcom, Intel—the most critical companies at the hardware layer—all appearing in the same seed round is nearly unimaginable in this industry. In fact, hardware manufacturers understand better than anyone that computing power remains expensive and scarce, and simply scaling up hardware is no longer sustainable. What they most urgently need is an open-source inference system that truly decouples from hardware and maximizes chip performance across heterogeneous platforms.

The involvement of Databricks, PyTorch creators, and key figures from OpenAI, Thinking Machines, and xAI signals a strong industry expectation for integrated training-inference infrastructure. Each name in this stellar lineup represents an exceptionally precise bet:

  • Lip-Bu Tan, CEO of Intel, a revered figure in the semiconductor industry with decades of experience.
  • John Schulman, co-founder of OpenAI and of Thinking Machines Lab, one of the pioneers of modern reinforcement learning.
  • Soumith Chintala, co-creator of PyTorch, steward of one of the world's dominant deep learning frameworks.
  • Igor Babuschkin, co-founder of xAI, who personally built some of the most complex training systems and hardware platforms in the industry.
  • Lilian Weng, co-founder of Thinking Machines Lab, with frontline insight into the industrial-scale deployment of AI systems.

When individuals who can independently lead a funding round anywhere choose to appear together on the same cap table, it’s a strong bet on the future.

Infrastructure for everyone: the power to build AI should no longer be monopolized by a few

The vision of RadixArk can be summed up in one sentence: to make AI infrastructure a public good as widespread, reliable, and non-monopolized as electricity. This may sound like an idealistic declaration, but in practice, they are turning this statement into reality:

  • Academic community

Three years ago, a PhD student working on LLM inference optimization typically had only two options: OpenAI’s API, billed per token with no visibility into internal structures; or outdated open-source code with a README stating “works on a single GPU,” years away from the real distributed scenarios described in papers.

SGLang breaks this trade-off: industrial-scale daily throughput with fully open-source code, making it the default baseline for systems research groups at Stanford, Berkeley, CMU, and UW. For researchers working on agents, RadixAttention's prefix cache organizes shared prefixes into a tree structure and computes identical key-value pairs only once, cutting experiments that previously took two days down to half a day. Citing SGLang in inference papers has nearly become standard practice.
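As a rough illustration of the idea, not RadixArk's actual implementation (the real RadixAttention caches GPU KV tensors in a compressed radix tree with eviction policies), a per-token prefix trie that computes each shared key-value entry only once might look like:

```python
# Minimal sketch of prefix-tree KV reuse in the spirit of RadixAttention.
# Hypothetical simplification: per-token trie nodes and placeholder "KV"
# tuples stand in for compressed radix-tree edges and real GPU tensors.

class PrefixCacheNode:
    def __init__(self):
        self.children = {}   # token id -> PrefixCacheNode
        self.kv = None       # placeholder for this token's cached KV entry

class PrefixCache:
    def __init__(self):
        self.root = PrefixCacheNode()
        self.kv_computations = 0  # counts "expensive" KV computations

    def run(self, tokens):
        """Return KV entries for `tokens`, computing only the uncached suffix."""
        node, kvs = self.root, []
        for tok in tokens:
            if tok not in node.children:
                child = PrefixCacheNode()
                child.kv = ("kv", tok)     # stands in for a real KV tensor
                self.kv_computations += 1  # only uncached tokens cost compute
                node.children[tok] = child
            node = node.children[tok]
            kvs.append(node.kv)
        return kvs

cache = PrefixCache()
cache.run([1, 2, 3, 4])        # computes KV for 4 tokens
cache.run([1, 2, 3, 5])        # reuses the shared prefix [1, 2, 3]
print(cache.kv_computations)   # -> 5, not 8
```

For agent workloads, where many requests share a long system prompt or conversation prefix, this is exactly the sharing pattern that turns two-day experiment batches into half-day ones.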

  • Startups

A group of engineers who left big tech companies, driven by a deep understanding of a specific vertical niche, started their own venture. They had no million-dollar compute budget, no dedicated infrastructure team—only a strong instinct for their product.

In the past, building production-grade inference pipelines and managing the engineering burden of cross-hardware compatibility often exceeded the capacity of seed-stage companies, consuming vast amounts of time on reinventing the wheel. Now, they can directly deploy inference services with near-state-of-the-art performance on top of SGLang and train domain-specific models with Miles—infrastructure is no longer a bottleneck, freeing up time and resources to focus entirely on what they truly want to build.

  • Tech giants

Why are giants like Google, Microsoft, and NVIDIA—each with the world’s most powerful internal infrastructure—listed as SGLang users? The answer lies in the structure of this round of investors: five core hardware companies—NVIDIA, AMD, MediaTek, Broadcom, and Intel—have all entered the scene. They understand better than anyone else what a hardware-agnostic, vendor-neutral open-source inference system means for the entire ecosystem. Using an open-source system jointly maintained by the community and supported by multiple hardware vendors is itself a higher-level infrastructure strategy.

RadixArk's official statement is unsentimental, but pointed:

The next generation of AI should not be restricted by access to private infrastructure. More teams should be able to own their own models, their own systems, and their own future.

This $100 million seed round aims to turn this vision into engineering reality: make SGLang the Day-0 production standard for any new model; turn Miles into an infrastructure-grade framework for large-scale training and RL; and build a managed platform on top of the open-source core that provides top-tier infrastructure capabilities without locking in models or trapping customers.

The vision of RadixArk has never been to replace anyone. It's about giving an academic lab, a three-person studio, a startup that just secured its seed round, and a trillion-dollar giant equal footing on the same infrastructure.

If Anthropic in 2023, Mistral in 2024, and Thinking Machines Lab in 2025 each represented a directional bet on the AI model layer, then RadixArk in 2026 is betting on something more fundamental and long-term: truly returning the power to build cutting-edge AI into the hands of enough people.

After securing funding, the team launched a community give-back initiative: participants who register on the platform and share the news on Twitter will receive free usage credits once the RadixArk hosted platform officially launches. For a team that grew out of the open-source community, this is a way to thank those who supported SGLang all the way to where it is today, with real computing power.

  • Link: platform.radixark.com