DeepSeek has begun engaging with external capital.
The Information, citing four knowledgeable sources, reported that DeepSeek is seeking at least $300 million in its first external funding round, with a valuation of no less than $10 billion.
Rewind the clock two years, and this would have seemed almost unthinkable. For most of that period, the company was the most counterintuitive presence in China's entire AI industry.
While everyone else was raising funds, expanding, talking about ecosystems, and fighting for entry points, DeepSeek was deliberately pulling back: staying silent, releasing products infrequently, avoiding the narratives of big tech, and keeping capital at arm's length.
Many investors attempted to reach out, and the answer was nearly always the same: there are no funding plans.
In a highly capital-driven industry, this stance appears to defy industrial logic. But precisely because of this, DeepSeek was once regarded as an outlier—a team attempting to validate a "low-resource path" in the age of AI.
So in this funding signal, what truly matters is not the amount or the valuation, but that it breaks a two-year-old strategic assumption: DeepSeek is no longer trying to isolate itself from the system.
I. DeepSeek's Firewall
Liang Wenfeng's rejection of external capital has deep roots.
Around 2022, the quantitative trading industry came under sustained regulatory pressure, and High-Flyer's assets under management shrank by more than half from a peak of hundreds of billions of yuan. With a large surplus of GPU capacity and cash on hand, Liang Wenfeng considered offloading that computing power through equity investments or partnerships with cloud service providers.
He specifically hired two people to handle strategic investments, reviewed a range of tech projects—low-altitude economy, smart hardware, SaaS—but ended up investing in none of them.
The internal conclusion at the time was blunt: if others could do it, so could they.
In July 2023, DeepSeek was officially founded. From day one, Liang Wenfeng set a clear boundary for the company: no external funding, no equity dilution, and no surrendering to anyone else's commercial timeline.
What he wants to build is more like a pure research institute, pursuing AGI, open-sourcing, and letting the technology speak for itself.
The confidence is genuine.
Back in 2019, he invested 200 million yuan to build the deep learning training platform Fire-Flyer I. Two years later, he poured another 1 billion yuan into Fire-Flyer II, acquiring large quantities of NVIDIA A100s and making High-Flyer one of the few companies in China with a ten-thousand-GPU cluster.
At the height of the pandemic-era chip shortage, High-Flyer had already stockpiled inventory in advance. In 2025, High-Flyer achieved an annual return of 56.6% and generated revenue exceeding 5 billion RMB.
Liang Wenfeng’s wallet is richer than those of most investors in AI startups.
Money, GPUs, people: he had all three. VC money, by contrast, was a burden, arriving with strings attached, valuation pressure, and constant questions about when to exit. He put it bluntly: VCs manage money for LPs and must turn a profit, so their goals simply don't align.
This path reached its peak in January 2025. R1 was released, with a training cost of approximately $5.6 million and performance approaching OpenAI’s top-tier system.
"Building an equivalent model at one-tenth the cost": this story made the industry realize that top-tier model capability does not have to come from extreme resource accumulation alone.
At that moment, the significance of DeepSeek was rapidly amplified. It offered not just a model, but a possibility — the ability to enter the core competitive arena even without resource advantages.
The firewall not only held strong, but became part of the story.
But the problem with the story is that it needs to be continuously rewritten.
II. Cracks Have Appeared
The crack didn't appear suddenly; it began growing after R1's release, but the signals at the time were scattered.
People were the first thing to loosen.
The top model teams share a common trait: their core members have extremely high market value and become prime targets across the market whenever the project enters a lull.
The first to be noticed was Luo Fuli, a key developer of the V3 architecture, who switched to Xiaomi at the end of 2025 to lead the MiMo large model team. Around the same time, Wang Bingxuan, the primary author of the first-generation large language model, joined Tencent; Ruan Chong, a core researcher in multimodal technology, became Chief Scientist at Yuanrong Qixing; and Wei Haoran, the lead author of the OCR series, also left around the Spring Festival this year.
Then there is Guo Daya, born in 1994, who holds a Ph.D. from Sun Yat-sen University. Although he spent only two years at DeepSeek, he was fully involved in the development of nearly all landmark models, including V3, R1, Coder, Math, and Prover. The GRPO algorithm he proposed forms the core technical foundation of R1, and his papers have been cited over 37,000 times—among his peers in China’s AI research community, there is almost no equal.
Just the other day, Guo Daya joined ByteDance, focusing on agents. (Further reading: Zhang Yiming, Xin Da Ya)
Five core R&D leads left in succession within less than a year. The significance of these departures goes beyond the mere loss of personnel; in model development, experience is highly path-dependent, and the departure of key members directly impacts the efficiency and pace of the next iteration.
Why did they leave?
In recruitment circles, it's widely reported that major tech companies are offering DeepSeek’s core technical staff two to three times their current salaries. Starting in September 2025, ByteDance’s Seed team introduced a special stock option allowance, granting monthly options worth between 90,000 and 135,000 RMB depending on level, priced below the internal repurchase rate—effectively a direct discount.
Liang Wenfeng’s management philosophy is almost unique in China’s tech industry: no overtime, no clock-in system, no KPIs. Employees leave at 6 or 7 p.m., and there’s no need to sign in in the morning.
He believes that a person can hardly exceed six to eight hours of high-quality work per day. This culture worked well when DeepSeek was still a small team—giving smart people enough freedom allowed them to naturally gravitate toward the most challenging tasks.
But when someone from outside knocks on the door with an eight-figure package, freedom alone isn't enough.
Even deadlier is the question of options. DeepSeek has never raised funding, so it has no market-based valuation anchor. Equity granted to core team members can't be converted into real cash. Big tech options come with strike prices, internal buyback mechanisms, and IPO expectations. The outside world knows DeepSeek is valuable, but no one can say exactly how much, and employees certainly can't.
Guo Daya’s departure may be more worth considering than the apparent salary difference. He went to ByteDance to work on agents, while DeepSeek still has no agent product at all—and even at the time of R1’s release, it did not support function calls.
Wanting to work on agents, but your company doesn’t pursue this direction—this misalignment can’t be fixed with any amount of money.
Along with people, the product rhythm has also loosened.
The next-generation flagship V4 was originally scheduled for release around the Spring Festival, then delayed to February, then to March, and the current official timeline is late April. From publicly available information, the delay appears to stem from at least three intertwined factors.
The most direct layer is that the technological roadmap itself has undergone a qualitative change.
V4 is no longer just a baseline model designed to beat benchmarks—it features a trillion-parameter MoE architecture, native multimodality, million-token context, and a brand-new Engram conditional memory mechanism. This is a system-level engineering endeavor, and the complexity of training and validation has surged to a whole new level.
An additional layer of pressure comes from the burden of identity.
DeepSeek's reputation rests on the story of achieving top-tier performance at one-tenth the cost. If V4 delivers only marginal performance gains while significantly increasing inference costs, the narrative supporting its valuation and reputation will begin to crack. In some ways, an underwhelming V4 would be better left unreleased.
There is another layer that is rarely discussed adequately: deep optimization for domestic chips. Multiple sources revealed in early April that V4 will run entirely on Huawei Ascend 950PR chips, potentially becoming the first flagship large model to operate fully on domestic computing power. This holds immense strategic value, but it is itself an independent, massive engineering effort that consumes a significant portion of R&D bandwidth.
By April 2026, DeepSeek had gone 15 months without a major update, while OpenAI had iterated four to five times, Anthropic had launched Claude 4.5, 4.6, and 4.7, and domestic peers such as Zhipu, Moonshot AI, and ByteDance had made rapid advances on the application layer.
Everyone is speeding up, but DeepSeek remains quiet.
Unlike many teams, DeepSeek did not rapidly expand, aggressively pursue commercialization, or release frequent updates after R1.
At the time, many interpreted this pace as resilience. But looking back today, it was more of an active choice to extend the "laboratory phase" as long as possible.
Restraint is essentially about controlling one's own pace. But when the external environment accelerates across the board, the pace is no longer entirely yours to control.
III. The Competition Has Adopted a New Logic
If you look at DeepSeek’s current situation in isolation, it’s easy to attribute it to internal company issues. But the more critical variable comes from outside: over the past 15 months, the competitive landscape across the entire industry has undergone a massive shift.
Earlier, the core of large model competition lay in architecture, training methods, and engineering optimization. After 2026, new factors began to dominate: the scale of compute pools, the density of talent, and the speed of feedback from the application layer. Together, these three determine how fast iteration can proceed.
The revenue structure of leading overseas companies has already indicated the direction.
Anthropic's annual revenue increased from $9 billion to $30 billion in four months, with nearly all of the growth coming from Claude Code, a coding agent. Cursor, a code editor, is valued at $60 billion. GitHub Copilot serves 20 million developers.
Money is flowing toward products that can directly produce code, tools, and applications.
Domestic competitors are catching up fast: ByteDance, Alibaba, and Tencent have each launched coding and agent product lines, and the APIs of Zhipu and Moonshot were overwhelmed by the surge in traffic early this year, precisely because of their bets on coding.
Clearly, single-point model capabilities remain important, but they are no longer the sole determining factor. Resources, organizational structure, and system capabilities are becoming key variables. Over the past year, China’s leading companies have taken different paths, but with a consistent direction: embedding model capabilities into larger systems.
DeepSeek has the highest popularity among global open-source communities, with 170,000 stars on GitHub, 26,000 enterprise accounts, and 5.7 billion API calls per month. However, it lacks its own IDE, coding tools, agent products, or vertical applications directly paid for by end users.
Liang Wenfeng has always maintained that models are the foundation of everything.
His most recent published papers focus on conditional memory mechanisms and hyperconnection-optimized Transformers, indicating that DeepSeek remains focused on solving foundational issues. This dedication was proven during the R1 era, where they achieved maximum foundational capability with minimal resources and personnel, leaving the rest to the open-source ecosystem.
But when competition expands from foundational model capabilities to a combination of capabilities, products, and ecosystems, having only an engine without the full vehicle won’t get you anywhere.
On the evening of March 29, DeepSeek experienced its longest service outage since launch, lasting over seven hours and affecting hundreds of millions of users. The official team did not provide a reason for the disruption. During the outage, competitor traffic surged significantly, and some enterprise clients began considering multi-platform redundancy strategies.
A single outage won’t kill a company, but it brings a hard truth to light: when user numbers grow from millions to hundreds of millions, infrastructure investment can no longer be offset by efficiency optimizations.
High-Flyer's profits are sufficient to cover this, but not comfortably.
IV. What Can $300 Million Buy?
A $300 million investment represents less than 3% dilution against a $10 billion valuation—a figure that appears extremely restrained compared to giants like Anthropic and OpenAI.
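The dilution figure checks out as a back-of-the-envelope calculation. The sketch below (illustrative only) assumes the reported $10 billion floor is a pre-money valuation; if it were post-money, the dilution would simply be 300M / 10B = 3.0% exactly.

```python
# Back-of-the-envelope dilution check (illustrative, not from the article's sources).
# Assumption: the $10B minimum valuation is pre-money.

investment = 300e6   # new capital raised, USD
pre_money = 10e9     # reported minimum valuation, USD

post_money = pre_money + investment
dilution = investment / post_money  # share of the company the new money buys

print(f"Post-money valuation: ${post_money / 1e9:.1f}B")
print(f"Dilution: {dilution:.2%}")  # ~2.91%, i.e. "less than 3%"
```

Either way the new investors end up with under 3% of the company, which is what makes the round look so restrained next to the mega-raises of OpenAI and Anthropic.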
What Liang Wenfeng truly wants to buy may not be on the balance sheet.
First, options finally have a benchmark. Once the $10 billion valuation is confirmed, the equity held by the core team becomes real money. For a team being steadily poached by big tech companies, this signal is more effective than any salary increase, and stopping the attrition is critical at this stage.
Second, this is the insurance fund for V4.
Fully adapting Huawei Ascend is itself resource-intensive, while media reports indicate that DeepSeek is also using NVIDIA’s latest Blackwell chips to train its next-generation model—chips whose availability is uncertain due to export controls. Running both hardware paths in parallel is multiplying the financial burden.
The most subtle layer is that this is a ticket to enter the second half.
The AI competition has entered a phase driven by four wheels: models, products, ecosystems, and capital. You may have the world’s best engine, but without capital backing and a product ecosystem, you’ll ultimately remain just an advanced supplier in the supply chain, watching others profit from your models.
From the development direction of V4, Liang Wenfeng likely already recognized this. Multiple sources indicate that the V4 roadmap explicitly includes significant advancements in AI search, long-term memory, and coding capabilities—core competencies of the Agent era.
DeepSeek is catching up. The funding is there to make sure it can finish the course on time.
V. Outside the Layout
It's easy for outsiders to interpret this shift as a compromise. But from another perspective, it’s closer to an evolution from an experimental phase to an industrial one.
The cost curve in the AI industry has sharply risen, and talent inflation has surpassed everyone's expectations. Relying on individual style and a single revenue stream to sustain a super unicorn is becoming increasingly unrealistic.
Liang Wenfeng’s past decisions had their logic—controlling scale, avoiding premature commercialization, and maintaining the purity of R&D. Such choices were immensely powerful at certain stages, but the industry’s pace ultimately imposes constraints on everyone.
The $300 million round is Liang Wenfeng's first public acknowledgment of all of this.
This article is from the WeChat public account "Beyond the Layout," author: Hua Hua
