OpenAI Considers Reducing Token Prices Amid AI Industry Cost Crisis

When the token price war truly begins, how will the AI industry make money? The valuation logic behind AI commercialization needs to be rewritten, and an era of competing on “cost-performance” and “scarcity” may have arrived. For OpenAI, “the situation has further deteriorated”; one analysis warns that “if OpenAI declines, it could drag down NVIDIA, Oracle, CoreWeave, and others.”

The commercial narrative around generative AI is undergoing its most profound self-examination in three years. From subsidies that bought users while monthly subscriptions concealed the costs, to token-based billing that triggered enterprise billing crises, the industry has passed through three stages of commercialization in three years. An impending price war could now render the entire monetization model obsolete once again.

According to The Wall Street Journal, OpenAI is considering significantly lowering its token fees for users to win enterprise customers away from competitor Anthropic. According to people familiar with the matter, this move is partly intended to "get ahead" of a similar price reduction that OpenAI expects Anthropic to soon implement. OpenAI’s CEO, Sam Altman, recently acknowledged at an event that the cost of AI usage has become "a huge problem" and stated that the company aims to "help people get more value for less spending."

The timing of this news is particularly sensitive. OpenAI confidentially filed for an IPO this week, and Anthropic is also in the final stages before going public. Meanwhile, the Bloomberg Silicon Data LLM Token Spending Index has fallen for seven consecutive trading days, its longest losing streak since January, reflecting deep market concern about the sustainability of AI-related costs. The report explicitly states that a price war will directly erode the profit margins of both companies, each of which has already lost billions of dollars on the massive computing power its AI systems require.

At the heart of this discussion is no longer just a price-cutting decision, but a more fundamental question: when the narrative of "the more token consumption, the better" reaches its limit, who will tell the next commercialization story for the AI industry, and how will they tell it?

01 The First Three Phases: From Subscription Subsidies to Token Billing

The commercialization of generative AI has undergone three distinct phases in just three years.

In the first phase, monthly and annual subscription packages set the industry standard. In February 2023, OpenAI launched ChatGPT Plus at a monthly fee of $20, pioneering direct consumer payments for large models; Baidu, Alibaba, and Tencent soon followed, making fixed monthly subscriptions a standard component of early business models.

In the second phase, a full-scale subsidy war erupted. To boost ARR (Annual Recurring Revenue), the key valuation metric for fundraising, companies turned to large-scale subsidies: Google offered students 15 months of free access to Gemini Advanced, OpenAI launched a Team plan at $1 for the first month, ByteDance’s Doubao entered the market with pricing 99.3% below industry rates, and Baidu announced its core models would be free. Subsidies essentially trade losses for growth—according to reports, Microsoft incurs an average monthly loss of over $20 per user on its GitHub Copilot subscription model, with heavy users reaching monthly losses as high as $80.

The third phase is the mandatory switch to usage-based billing. On June 1, 2026, Microsoft announced that all GitHub Copilot plans would officially transition to token-based usage billing, with the $19 monthly fee directly converted into an equivalent token allocation. This change brings to light the true costs long obscured by the subscription model—according to calculations by Reddit users, a single AI-assisted programming session can consume $30 to $40 worth of tokens, exhausting a monthly plan in just one use.

02 Billing Out of Control: When Tokens Cost More Than People

The implementation of token-based usage billing fully reveals the true nature of enterprise AI expenses.

Enterprise bills are staggering. In May 2026, Uber’s Chief Operating Officer, Andrew Macdonald, publicly stated that "there is no line yet" between the growth in token consumption and meaningful product improvements, and coined the term "tokenmaxxing" to describe employees performing valueless tasks solely to inflate usage metrics.

The harder numbers: Uber exhausted its entire annual token budget in the first four months of 2026, and Salesforce expects to pay Anthropic approximately $300 million for the full year.

Anthropic’s own developer documentation shows that developers using Claude Code spend about $13 per day on average, with 90% of users below $30 per day. At that $30 ceiling, a 10-person development team working roughly 252 days a year would spend about $75,600 annually on tokens.
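The team-level figure can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the daily costs and team size come from the paragraph above, while the 252 working days per year is an assumption (it is the value that makes the $30 ceiling match the $75,600 figure), and the function name is hypothetical.

```python
# Figures quoted in the article (USD per developer per day).
AVG_DAILY_COST = 13.0   # reported average
P90_DAILY_COST = 30.0   # 90% of users spend less than this
TEAM_SIZE = 10

# Assumption, not stated in the article: about 252 working days per year.
WORKING_DAYS = 252


def annual_team_cost(daily_cost_per_dev, team_size=TEAM_SIZE, working_days=WORKING_DAYS):
    """Annual token cost for a team, given a per-developer daily cost."""
    return daily_cost_per_dev * team_size * working_days


typical = annual_team_cost(AVG_DAILY_COST)
ceiling = annual_team_cost(P90_DAILY_COST)

print(f"typical year: ${typical:,.0f}")
print(f"ceiling year: ${ceiling:,.0f}")
```

Under these assumptions the $75,600 figure is the upper end of the range; a team spending at the reported average would land at well under half of that.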

The return on investment is equally concerning. After aggregating data from 2,444 companies, the enterprise data platform Entelligence.AI found that for every dollar spent on AI token fees, only 18 cents generated actual user-facing value; 44 cents were spent fixing bugs introduced by the AI itself, 27 cents went toward rework, and 11 cents were consumed by review friction.
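One way to read that breakdown is to ask how much a company must spend to obtain one dollar of user-facing value. The following is a rough sketch using only the four shares quoted above; the dictionary keys and function names are illustrative, not taken from the Entelligence.AI data.

```python
# Share of each token dollar, in cents, as quoted in the article.
BREAKDOWN_CENTS = {
    "user-facing value": 18,
    "fixing AI-introduced bugs": 44,
    "rework": 27,
    "review friction": 11,
}


def spend_per_dollar_of_value(breakdown):
    """Token dollars spent for each dollar of user-facing value delivered."""
    total = sum(breakdown.values())
    return total / breakdown["user-facing value"]


def overhead_share(breakdown):
    """Fraction of spend that does not reach users as value."""
    total = sum(breakdown.values())
    return (total - breakdown["user-facing value"]) / total


# Sanity check: the four quoted shares account for the whole dollar.
assert sum(BREAKDOWN_CENTS.values()) == 100

print(f"spend per $1 of value: ${spend_per_dollar_of_value(BREAKDOWN_CENTS):.2f}")
print(f"overhead share: {overhead_share(BREAKDOWN_CENTS):.0%}")
```

On these figures, roughly $5.56 of token spend stands behind each dollar of delivered value, with 82% going to overhead.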

In response to out-of-control expenses, enterprises have begun taking proactive measures. Amazon has halted its internal AI usage leaderboard and instructed employees not to use AI just for the sake of using it; Microsoft plans to phase out Claude Code subscriptions for employees in certain key product teams. Goldman Sachs noted that some companies are now spending up to 10% of their total employee labor costs on AI tokens, a proportion that could rise further in the coming quarters. This is not a disappearance of demand, but rather the end of the era of unchecked AI spending.

03 Act Four: The Price War Begins; OpenAI Considers Major Price Cuts

It was against this backdrop that the spark for the price war was ignited.

According to The Wall Street Journal, Altman’s consideration of price cuts was directly triggered by pressure from Anthropic. Anthropic’s revenue has recently surged, and its programming tool, Claude Code, has gained popularity among software engineers, helping push the five-year-old startup past OpenAI in valuation for the first time.

However, the cost of this price war will be extremely high. A significant price reduction will further squeeze the already negative profit margins of both companies, leaving very little room for maneuver in the competitive landscape.

Investors have long identified the underlying risk that OpenAI’s and Anthropic’s products are highly substitutable, allowing customers to easily switch from one to the other—meaning that price cuts may temporarily retain customers but cannot truly build a moat; they merely delay market share erosion.

The dilemma also spreads outward through the circular financing between cloud computing giants and AI labs.

According to corporate disclosure documents compiled by The Information, OpenAI and Anthropic together account for more than half of the approximately $2 trillion in future cloud service commitments from Microsoft, Oracle, Google, and Amazon. If price cuts lead to downward revisions in revenue expectations, this chain of transmission will face pressure on both sides.

U.S. cognitive scientist and AI expert Gary Marcus said: “This further exposes OpenAI’s fragility and highlights how severe its predicament is. If OpenAI declines, it could drag down companies like NVIDIA, Oracle, and CoreWeave. The situation is rapidly deteriorating.”

Bull and bear views are clashing openly on Wall Street. JPMorgan’s TMT analyst, Mark Schilsky, believes current billing concerns are merely "the smallest speed bump on the path to higher spending." His reasoning: if the average cost per million tokens falls while paid AI adoption among U.S. companies keeps rising, total token usage should grow substantially as a matter of arithmetic. Add agentic AI, which pushes token consumption per task to several times that of traditional question-and-answer use, and long-term total spending is expected to be well above current levels.

Goldman Sachs semiconductor analyst Jim Covello holds a more pessimistic view, arguing that the current industry boom has channeled nearly all value to semiconductor companies—a phenomenon that is "unprecedented in history and unsustainable." Once companies confront the true cost-per-unit pricing, the capital flows supporting GPU purchases and model training could reverse.

04 Act Five: The Next Story in Token Economics?

After the price war, the next chapter of AI commercialization has yet to be written, but its outline is beginning to emerge.

Citadel Securities' report provides a directional framework: tiered pricing and scarcity-based pricing. Its core logic is that compute-intensive frontier AI models will not disappear, but will become increasingly concentrated in the hands of a few large enterprises capable of bearing the computational costs; for broader enterprises, simpler models may remain the more productive path until physical constraints are alleviated. This implies that AI usage will become tiered—high-value, complex tasks will continue to use frontier models, while routine and batch tasks will shift toward cheaper or on-device models.
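The tiering idea can be illustrated with a toy routing rule. This is a minimal sketch under stated assumptions, not anything described in the Citadel Securities report: the tier names, the prices per million tokens, the example tasks, and the `high_value` flag are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_million_tokens: float  # hypothetical price, for illustration only


@dataclass(frozen=True)
class Task:
    description: str
    tokens: int
    high_value: bool  # hypothetical flag: does the task justify a frontier model?


FRONTIER = Tier("frontier", 10.0)
ECONOMY = Tier("economy", 1.0)


def route(task):
    """Send high-value tasks to the frontier tier, everything else to the cheap tier."""
    return FRONTIER if task.high_value else ECONOMY


def cost(task, tier):
    """Cost in USD of running a task on a given tier."""
    return task.tokens / 1_000_000 * tier.usd_per_million_tokens


tasks = [
    Task("contract risk analysis", 2_000_000, high_value=True),
    Task("bulk ticket tagging", 8_000_000, high_value=False),
]

tiered_cost = sum(cost(t, route(t)) for t in tasks)
frontier_only_cost = sum(cost(t, FRONTIER) for t in tasks)

print(f"tiered: ${tiered_cost:.2f}, frontier only: ${frontier_only_cost:.2f}")
```

With these made-up numbers, routing the batch job to the cheaper tier cuts the bill from $100 to $28, which is the kind of saving that would push routine workloads away from frontier models.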

JPMorgan’s relatively optimistic view rests on a simple calculation: even if the price per token declines, agentic AI multiplies token consumption per task. Existing data shows that once a business process is converted to agents, token consumption per task can rise by as much as 3.5 times, so overall spending is still expected to keep growing.
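The bull case reduces to spend = price per token × tokens consumed. The sketch below works through it using the 3.5x per-task figure quoted above; the 50% price cut is a hypothetical input chosen for illustration, and the function names are not from any JPMorgan model.

```python
def spend_multiplier(price_change, tokens_per_task_multiplier, task_volume_multiplier=1.0):
    """Relative change in total spend.

    price_change: fractional change in price per token (-0.5 means a 50% cut).
    tokens_per_task_multiplier: how much more each task consumes after agentization.
    task_volume_multiplier: growth in the number of tasks (1.0 means unchanged).
    """
    return (1.0 + price_change) * tokens_per_task_multiplier * task_volume_multiplier


def break_even_price_cut(tokens_per_task_multiplier, task_volume_multiplier=1.0):
    """Largest fractional price cut that still leaves total spend unchanged."""
    return 1.0 - 1.0 / (tokens_per_task_multiplier * task_volume_multiplier)


AGENTIC_MULTIPLIER = 3.5        # upper figure quoted in the article
HYPOTHETICAL_PRICE_CUT = -0.50  # assumption: prices halve

print(f"spend multiplier: {spend_multiplier(HYPOTHETICAL_PRICE_CUT, AGENTIC_MULTIPLIER):.2f}x")
print(f"break-even price cut: {break_even_price_cut(AGENTIC_MULTIPLIER):.1%}")
```

On these assumptions, spending still grows 1.75x after prices halve, and prices would need to fall by more than about 71% before total spend shrank. The conclusion depends on the 3.5x figure applying broadly, which the article presents as an upper bound.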

Nebius Chief Revenue Officer Marc Boroditsky introduced the concept of "valuemaxxing," advocating for the industry to shift from maximizing token consumption to ensuring each token truly generates value. This direction is gradually becoming an industry consensus—but real commercial implementation still requires AI labs to develop a pricing model that accurately reflects true costs while being acceptable to enterprise customers, which remains the core unresolved issue in all current debates.

However, the most overlooked variable in this price war may be Chinese models.

According to June data from the U.S. corporate spending platform Ramp, DeepSeek has topped the list of fastest-growing enterprise software subscriptions in the U.S. Ramp’s Chief Economist, Ara Kharazian, emphasized that this is not about on-premises deployment of open-source models: "Companies are directly sending and receiving data through DeepSeek"—a genuine, paid, direct connection. He admitted, "I didn’t expect U.S. companies to use DeepSeek." Third-party estimates indicate that the average API price for DeepSeek V4-Pro is approximately one-tenth that of GPT-5.5 and about one-eleventh that of Claude Opus 4.7.

The battle between OpenAI and Anthropic may ultimately benefit the player that long ago built low, broadly accessible pricing into its DNA and has no obligation to report profit margins to IPO investors. This may not be the most popular outcome of the price war, but it is becoming a reality that is increasingly hard to ignore.

This article is from the WeChat public account "Hard AI," authored by Xu Chao.