DeepSeek Permanently Lowers API Prices Amid Rising AI Costs

By Luo Chao Channel

DeepSeek announces that the 75% discount on the V4-Pro API will be made permanent, effective globally.

Final pricing structure: The input price has been reduced from $1.74 per million tokens to $0.435 per million tokens, and the output price has been reduced from $3.48 per million tokens to $0.87 per million tokens. For input cache hits across the entire API product line, DeepSeek has implemented an even more significant discount: $0.003625 per million tokens—fully adopting a floor-price model akin to Pinduoduo.
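The arithmetic behind these figures can be checked with a short sketch. The prices are the ones quoted above; the workload sizes and the `cached_fraction` parameter are hypothetical values chosen only for illustration, not DeepSeek billing rules.

```python
from decimal import Decimal

# USD per million tokens, as quoted in the article for V4-Pro.
OLD_INPUT, NEW_INPUT = Decimal("1.74"), Decimal("0.435")
OLD_OUTPUT, NEW_OUTPUT = Decimal("3.48"), Decimal("0.87")
CACHE_HIT_INPUT = Decimal("0.003625")
MILLION = Decimal(1_000_000)


def discount(old: Decimal, new: Decimal) -> Decimal:
    """Fraction by which the price dropped (0.75 means 75% off)."""
    return (old - new) / old


def job_cost(input_tokens: int, output_tokens: int,
             cached_fraction: Decimal = Decimal("0")) -> Decimal:
    """USD cost of one workload at the new prices.

    cached_fraction is the share of input tokens billed at the cache-hit
    rate; it is a hypothetical knob, not a figure from the article.
    """
    cached = Decimal(input_tokens) * cached_fraction
    uncached = Decimal(input_tokens) - cached
    total = (uncached * NEW_INPUT
             + cached * CACHE_HIT_INPUT
             + Decimal(output_tokens) * NEW_OUTPUT)
    return total / MILLION


if __name__ == "__main__":
    print(discount(OLD_INPUT, NEW_INPUT))    # input price cut
    print(discount(OLD_OUTPUT, NEW_OUTPUT))  # output price cut
    # Hypothetical workload: 10M input tokens, 2M output tokens.
    print(job_cost(10_000_000, 2_000_000))
    print(job_cost(10_000_000, 2_000_000, Decimal("0.5")))
```

Both cuts work out to exactly 75%. For workloads with a high cache-hit share, output tokens make up most of the bill.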

Social media platforms, including X, immediately saw a wave of praise: Liang Wenfeng was called the Cyber Bodhisattva of AI, the God of Feng, and Saint Liang. The emotion didn’t stem from the low price alone. DeepSeek has long been dubbed the “Pinduoduo of AI,” offering free services to consumers and low-cost solutions to businesses, and the world had grown accustomed to its affordability. What made this price cut so remarkable is that AI prices everywhere else are rising.

Reports indicate that Liang Wenfeng will personally invest up to RMB 20 billion, 40% of the total, in DeepSeek’s record-breaking Series A funding round. While most companies prioritize strengthening cash flow and improving financial performance when raising capital, Liang has no intention of attracting investors with commercial promises; instead, he remains committed to open-source development and the pursuit of AGI, and this price cut shows him delivering on that commitment. The last company to declare so boldly that it did not want to make money was Pinduoduo, whose co-founder explicitly told investors on a call in 2024: “Starting in Q3, our profits will gradually decline and will not rebound in the short term. In the long term, declining profitability is inevitable.” The stock price plummeted.

Sam Altman constantly talks about democratizing AI, but OpenAI is rapidly moving in the opposite direction of its name, toward “CloseAI.” Liang Wenfeng, by contrast, is working to make AI as affordable and widely accessible as possible for every person and every business. But is Liang Wenfeng really a benevolent saint? Not at all. He is an entrepreneur, and open source and inclusivity are simply strategic business choices, ones that are rare today and will become even scarcer in the future.

Because: AI is becoming increasingly expensive.

This week, Microsoft canceled its internal Claude Code license because the token-based billing costs became unsustainable. Although Microsoft had heavily invested in OpenAI and provided Azure cloud services to Anthropic, possessing cloud resources coveted by all enterprises, the token costs still hurt deeply. Similarly, Uber’s CTO reported an embarrassing situation to management this past April: the company’s entire AI budget for 2026 had been exhausted in just four months, with 95% of engineers using AI coding tools monthly and 70% of code commits generated by AI. As he put it: “I’m back to the drawing board because the budget I thought I would need is blown away already.”

Large companies are burning through their token budgets much faster than expected. While some of this is due to employees treating tokens as if they were free, the real cause of the budget strain is that AI is becoming more expensive. Over the past year, AI software prices in the U.S. have risen by 20% to 37%. Anthropic, OpenAI, and Google have all quietly increased the actual cost of the same AI outputs over the past six months.

(Source: X)

The original popular sentiment was, “The more extensively AI is applied at scale, the higher the level of industrialization, the lower the costs, and the happier companies will be”—but it turned out to be naive.

And this trend will not reverse. Prices are determined by supply and demand, not cost, and by 2026 the supply and demand dynamics for AI have flipped completely. Previously, big companies were begging people to use AI; they needed to educate the market and promote the technology, so AI was heavily subsidized. How many cups of Qwen milk tea have you had? Now, people are increasingly adopting AI on their own initiative: “once you take that first sip, you can’t let go.” AI coding, AI documentation, AIGC, and even AI search are becoming ever more widespread. The era of AI subsidies has ended.

The more people use it, the greater the demand and the tighter the token supply, causing compute shortages to spill over from GPUs to CPUs, storage, and even bandwidth. Intel, Micron, SK Hynix, Samsung Electronics, SanDisk, and Chinese players like Longsys and the “Two Changs” (CXMT and YMTC) are all joining NVIDIA in reaping the benefits. Where did the semiconductor giants’ doubled revenues in 2026 come from? Not from the triangular investment loop between OpenAI, Oracle, and Microsoft, far from it. The pain felt by enterprises is merely the tip of the iceberg. Meanwhile, AI products like ChatGPT, Claude, Gemini, and Doubao, with their rigidly stratified free-and-paid models, will increasingly leave individual users torn.

It’s like ride-hailing: at the peak of the subsidies, you could ride in a luxury car for free, with capital covering the cost. Once user habits are established, subsidies end, prices return to normal, and you’re back to taking the subway. AI is no different. Against the backdrop of rising token prices across the industry, DeepSeek’s decision to slash prices isn’t just an act of individual benevolence; it demonstrates reverse pricing power: I can operate at such low cost, remain stable, and still deliver top-tier quality.

As long as Liang Wenfeng is willing, DeepSeek doesn’t have to be this undervalued. This has led many to worry: will DeepSeek become the Linux of the AI era, immensely influential but unable to generate substantial profits? Linux has contributed far more to the IT industry than Windows or Android (which itself is built on the Linux kernel), yet it remains open-source and has not spawned a Microsoft or Google. Currently, DeepSeek wields significant influence, but its commercial capabilities fall far short of Silicon Valley’s Big Three, and even lag behind China’s Moonshot AI (maker of Kimi), MiniMax, and Zhipu. In terms of 2025 revenue among the “Four Greats”: Zhipu (RMB 724 million) > MiniMax (approx. RMB 560 million) > Moonshot AI (approx. RMB 200 million) > DeepSeek (unknown but lower).

Liang Wenfeng made his money through AI quantitative trading and is personally putting RMB 20 billion into DeepSeek, but a story fueled solely by passion cannot last.

In open-source mode, others can also distill, deploy, and retrain the models, causing DeepSeek’s technological moat to gradually narrow. That’s why you constantly see headlines about “benchmark-chasing”: after Zhipu’s GLM-5.1 was open-sourced, it set a new global record on the SWE-bench Pro benchmark; Xiaomi’s MiMo-V2.5-Pro rose to the top of the global open-source large model rankings. A joint report by MIT and Hugging Face shows that over the past year, open-source models developed in China accounted for 17.1% of global downloads, surpassing the United States’ 15.8% to become number one worldwide.

No wonder more voices in Silicon Valley are saying: the U.S. must have its own version of DeepSeek; it cannot stand by and watch the AI industry replay the stories of Shein, Temu, or TikTok. “If the U.S. doesn’t produce an open-source champion, the world will be governed by whichever nation can deliver the strongest, most stable, cheapest, customizable, scalable open-source models and software that meet both personal and commercial needs.” Discussions about great power competition often sound grandiose, but the underlying rivalry is very real.

The rise of DeepSeek has always been framed within a narrative of domestic substitution. The support for Ascend in V4 is cause for celebration; under the drive of domestic computing power, DeepSeek’s current price competitiveness is merely an appetizer. According to its technical report, after the bulk launch of the Ascend 950 super nodes in the second half of the year, the price of V4-Pro will be significantly reduced—better days are still ahead.

There is also an advantage in advanced AI talent: AI professionals command “luxury” prices globally, but in China they are relatively affordable. Lei Jun’s hiring of Luo Fuli away from DeepSeek on a ten-million-yuan salary made the news, while at the same time Zuckerberg was offering $1 billion to recruit talent, acqui-hires included. Yet the output difference between someone paid $1 billion and someone paid ten million yuan is clearly not 700-fold. This disparity in AI talent compensation translates into a systemic price differential within the token production ecosystem.
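The 700-fold figure follows from converting the dollar offer into yuan. A minimal sketch, assuming an exchange rate of about RMB 7 per US dollar (the rate is an assumption; the article does not state one):

```python
def pay_ratio(usd_offer: float, rmb_salary: float, rmb_per_usd: float) -> float:
    """How many times larger the USD offer is than the RMB salary."""
    return usd_offer * rmb_per_usd / rmb_salary


if __name__ == "__main__":
    # $1 billion offer vs. a ten-million-yuan salary, at an assumed 7 RMB/USD.
    print(pay_ratio(1_000_000_000, 10_000_000, 7.0))
```

The ratio scales linearly with the exchange rate, so a different rate shifts the multiple proportionally.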

Greater competitiveness also comes from the energy system, which is the first layer of Jensen Huang’s AI five-layer cake.

The limit of AI is computing power; the limit of computing power is electricity. In April 2026, DeepSeek hired senior operations engineers and senior delivery managers for its data center in Ulanqab, Inner Mongolia—indicating its plan to build a "token factory" in the west, pushing cost advantages from the software layer down to the physical layer. Previously, I wrote about Ulanqab when Kuaishou established its data center there: proximity to power plants and a climate conducive to efficient cooling. Moreover, green electricity in western China costs approximately RMB 0.2–0.3 per kWh—just one-fifth to one-quarter of prices in Europe and the United States.

It’s not just western green electricity that is competitive. According to 2025 data from the International Energy Agency, China’s total installed power generation capacity has exceeded 2,300 GW, accounting for approximately 22% of the global total—the highest in the world—while the United States stands at around 1,300 GW. More importantly, China possesses the world’s most comprehensive power structure, including thermal, hydro, wind, nuclear, and solar power. Data shows that China’s industrial electricity prices have long remained between $0.06 and $0.08/kWh, whereas industrial rates in California, USA, are nearing $0.18/kWh, and in some parts of Germany, they exceed $0.25/kWh. This means that training a ten-thousand-GPU cluster naturally costs tens of percentage points less in China than in Europe or the United States.
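To make the electricity gap concrete, here is a rough sketch of a monthly power bill for a ten-thousand-GPU cluster at the rates quoted above. The per-GPU power draw and the PUE (power usage effectiveness) are hypothetical assumptions, not figures from the article, and the China rate uses the midpoint of the quoted $0.06 to $0.08 range.

```python
# Industrial electricity rates in USD per kWh, taken from the article.
RATES = {
    "China (midpoint of quoted range)": 0.07,
    "California": 0.18,
    "Germany (some regions)": 0.25,
}


def energy_kwh(gpus: int, kw_per_gpu: float, pue: float, hours: float) -> float:
    """Total facility energy: IT load scaled by PUE for cooling and overhead."""
    return gpus * kw_per_gpu * pue * hours


def electricity_cost(kwh: float, usd_per_kwh: float) -> float:
    """Electricity bill in USD for the given energy at a flat rate."""
    return kwh * usd_per_kwh


def relative_saving(cheaper: float, pricier: float) -> float:
    """Fraction of the electricity bill saved at the cheaper rate."""
    return 1.0 - cheaper / pricier


if __name__ == "__main__":
    # Hypothetical cluster: 10,000 GPUs at 1 kW each, PUE 1.2, 30 days.
    kwh = energy_kwh(gpus=10_000, kw_per_gpu=1.0, pue=1.2, hours=24 * 30)
    for region, rate in RATES.items():
        print(f"{region}: ${electricity_cost(kwh, rate):,.0f} per month")
    print(f"Saving vs. California: {relative_saving(0.07, 0.18):.0%}")
```

This compares the electricity bill only. The saving on total training cost is smaller, since it scales with electricity's share of overall spending.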

In the operational costs of large AI models, electricity accounts for as much as 60%-70% of total expenses—not only because running the models consumes power, but also due to the significant energy required for cooling. Even infrastructure giants have built data centers directly underwater, using nearby offshore wind power for electricity and circulating seawater for free cooling. Major initiatives such as "Power Transmission from West to East" and "Computing Power Transfer from East to West" demonstrate strong regional capabilities for allocating electricity and computing resources. Regions like Guizhou, Inner Mongolia, and Ningxia have long served as core nodes in the "Computing Power Transfer from East to West" strategy, and the infrastructure to relocate AI computing centers to the west has already been prepared.

Using Chinese AI is essentially using AI trained on a more competitive energy system: more economical and more accessible AI. This is one reason why Kimi, MiniMax, and others saw a surge in overseas revenue after the Spring Festival. The reason is not just that their algorithms are stronger, but also that they had an electricity cost advantage.

NVIDIA sets the price for high-end computing power, but companies like DeepSeek are gaining control over token pricing. You might say, “You get what you pay for in AI.” Indeed, quality does correlate with cost. DeepSeek V4 has only narrowed the gap between open-source and proprietary models to its smallest level yet; the company openly acknowledges the performance gap with top-tier models like GPT, and V4 is not fully multimodal: it can recognize images but cannot generate them.

But this hasn’t stopped the community from flocking to DeepSeek. The reason: most real-world business scenarios don’t require calling the world’s most powerful model every time. Tasks like consulting, customer service, summarization, translation, code completion, enterprise knowledge bases, and automated workflows don’t demand peak intelligence; they need “good enough + affordable + reliable.” When DeepSeek V4’s inference cost is only about 1% (Flash) to 11% (Pro) of GPT-5.5’s, a company can consume roughly 9 to 100 times as many tokens within the same budget, experiment with more prompt chains, and iterate through more agent workflows, ultimately producing even better results. After all, AI is inherently a “probabilistic” game: if it’s cheap enough, why not make do with a decent outcome?
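The budget arithmetic is simple to verify: at a fixed budget, the number of tokens you can buy is the inverse of the cost ratio. A minimal sketch using the 1% and 11% figures quoted above (the function names and the $1,000 budget are illustrative):

```python
# DeepSeek V4 inference cost as a fraction of GPT-5.5's, per the article.
COST_RATIO = {"Flash": 0.01, "Pro": 0.11}


def token_multiplier(cost_ratio: float) -> float:
    """Times as many tokens affordable at the same budget."""
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return 1.0 / cost_ratio


def tokens_for_budget(budget_usd: float, usd_per_million: float) -> float:
    """Tokens purchasable for a budget at a flat per-million-token price."""
    return budget_usd / usd_per_million * 1_000_000


if __name__ == "__main__":
    for tier, ratio in COST_RATIO.items():
        print(f"{tier}: about {token_multiplier(ratio):.1f}x the tokens")
    # Hypothetical $1,000 budget at the V4-Pro output price of $0.87/M.
    print(f"{tokens_for_budget(1_000, 0.87):,.0f} output tokens")
```

At 11% the multiple is about 9x and at 1% it is 100x, which brackets the range the extra experimentation has to work with.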

Therefore, the more expensive AI becomes, the more valuable DeepSeek’s affordability becomes, and the more valuable DeepSeek as a company becomes. Liang Wenfeng and his investors understand this better than anyone else.