Rising AI token costs prompt companies to focus on cost controls

CoinDesk reports:

As enterprises adopt AI tools widely, a new problem has surfaced: the models are powerful enough, but the bills are rising too quickly. Several tech and internet companies have found that although per-token prices are falling, total consumption keeps climbing rapidly as AI coding tools, automation assistants, and agent-based tools spread.

Multiple companies have exhausted their budgets ahead of schedule.

TechCrunch reports that some companies have exhausted their AI budgets well before the end of the 2026 fiscal year. Uber used up its entire annual AI coding budget by April; Microsoft revoked access to Claude Code for some developers after making it available for months; and a Priceline employee said that the standard renewal quote for Cursor has risen four- to fivefold over previous rates.

The increase is tied to the release of stronger models in recent months. Since November of last year, Anthropic, OpenAI, and Google have each launched new models better suited to agent scenarios, driving continued growth in usage. One company even ran up a Claude bill as high as $500 million because it had not set usage limits for employees.

Productivity gains do not necessarily cover costs.

Alexander Embers, Head of Enterprise Business at OpenAI, said that six months ago, customers were primarily concerned with whether the model capabilities were sufficient; now, the focus has shifted to expenditure visibility, auditability, token control, and model efficiency. The question enterprises are asking about AI procurement is shifting from “what can it do?” to “how much did it cost, and was it worth it?”

The industry is also recalculating the return on investment for AI coding tools. A March survey of 20,000 developers by Faros AI found that developer output has increased, but so have bugs and rework. Research from the engineering management platform Jellyfish shows that engineers who use AI heavily are approximately twice as productive as light users, but consume about ten times as many tokens.

Heavy AI users are approximately twice as productive as light users.
Their token consumption is approximately 10 times higher.
A single developer's token consumption grew approximately 18.6-fold over nine months.
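
The ratio implied by these figures can be made explicit. The following is a minimal sketch, assuming productivity and token consumption are both measured against the same light-user baseline. The function names are illustrative, and the halved per-token price is a hypothetical input, not a figure from the reports.

```python
def tokens_per_unit_output(productivity_multiple: float, token_multiple: float) -> float:
    """Tokens consumed per unit of output, relative to the baseline user."""
    return token_multiple / productivity_multiple


def spend_multiple(token_multiple: float, price_multiple: float) -> float:
    """Change in total spend when token volume and per-token price both change."""
    return token_multiple * price_multiple


# Reported figures: heavy users are ~2x as productive and use ~10x the tokens.
heavy_vs_light = tokens_per_unit_output(productivity_multiple=2.0, token_multiple=10.0)

# Hypothetical scenario: per-token price halves (0.5x) while one developer's
# consumption grows 18.6x, as in the nine-month figure above.
bill_growth = spend_multiple(token_multiple=18.6, price_multiple=0.5)

print(f"Tokens per unit of output, heavy vs. light users: {heavy_vs_light:.1f}x")
print(f"Spend multiple under the hypothetical price cut: {bill_growth:.1f}x")
```

Under these assumptions, each unit of output from a heavy user costs about five times as many tokens, and even a halving of per-token prices would leave that developer's bill several times higher than before.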

Cost management tools are rapidly taking shape

As billing issues grow, the market for tools around AI cost management is also heating up. This week, the Linux Foundation announced the formation of the Tokenomics Foundation, aiming to establish a unified language and set of management standards for AI token expenditures, similar to FinOps in the cloud cost management space.

The organization plans to develop open standards for token usage and billing, unified metrics, and new measurements related to cost efficiency, such as “smart cost per unit” or “tokens per watt.” The official launch is expected in July, with additional members to be announced at the upcoming FinOps X conference.

Meanwhile, both startups and established vendors are accelerating their efforts. Companies like Pay-i and Paid focus on tracking, measuring, and optimizing AI costs; Jellyfish, Waydev, and Faros AI offer AI agent monitoring services; and Ramp, Datadog, and New Relic are adding AI spend management, token-level observability, and GPU monitoring to their offerings.

Model routing has become a cost-reduction direction.

Some investors and corporate executives believe that cost-control capabilities will increasingly appear at the application layer or the model routing layer. For example, the enterprise AI startup Factory this week launched a model router that automatically selects the most suitable model for each task to reduce invocation costs. Similar practices are already appearing in some corporate billing systems, where the system routes certain requests to more cost-effective models even when a high-end model is requested.
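
The general idea of cost-aware routing can be sketched in a few lines. This is an illustrative sketch only, not Factory's implementation or any vendor's billing logic. The model names, capability tiers, and prices are hypothetical, and the sketch assumes each task arrives with a known minimum capability tier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                      # hypothetical model name
    capability: int                # 1 = basic, 2 = mid-tier, 3 = frontier
    usd_per_million_tokens: float  # hypothetical blended price


# Hypothetical catalog; a real deployment would load this from configuration.
CATALOG = (
    Model("small-model", capability=1, usd_per_million_tokens=0.50),
    Model("mid-model", capability=2, usd_per_million_tokens=3.00),
    Model("frontier-model", capability=3, usd_per_million_tokens=15.00),
)


def route(required_capability: int, catalog: tuple = CATALOG) -> Model:
    """Return the cheapest model that meets the task's capability requirement."""
    eligible = [m for m in catalog if m.capability >= required_capability]
    if not eligible:
        raise ValueError(f"no model offers capability {required_capability}")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


def estimated_cost(model: Model, tokens: int) -> float:
    """Estimated spend in USD for a request of the given token count."""
    return model.usd_per_million_tokens * tokens / 1_000_000


if __name__ == "__main__":
    for tier in (1, 2, 3):
        chosen = route(tier)
        cost = estimated_cost(chosen, tokens=2_000_000)
        print(f"tier {tier}: {chosen.name}, about ${cost:.2f} per 2M tokens")
```

A production router would also have to estimate task difficulty, which is the hard part. This sketch takes that estimate as given and shows only the cost-minimizing selection step.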

Additional context: Goldman Sachs predicts that global token usage will grow 24-fold by 2030. For companies already in the high-investment phase, controlling costs while scaling AI adoption has become a practical challenge for the next stage of deployment.