Foreign media report that the cost of using AI models may continue to decline in the second half of this year, primarily because next-generation infrastructure is coming online in a concentrated wave. As Nvidia Blackwell systems enter large-scale deployment, the per-unit costs of model training and inference are falling, giving AI service providers more room to lower token prices.
Blackwell is now being rolled out at scale
The article states that the primary factor driving down costs is the large-scale deployment of Blackwell systems in AI data centers. By the second half of this year, these systems are expected to reach significant scale and to be used both for training new models and for handling growing inference workloads.
Blackwell is more complex to deploy than the previous-generation Hopper platform. The systems require new data center configurations, such as liquid cooling, which lengthens the installation cycle. Once deployed, however, the efficiency gains can be substantial.
Unit output has significantly increased

The research firm SemiAnalysis compared Nvidia’s flagship Blackwell system, the GB300 NVL72, with the previous-generation Hopper HGX 200. According to its testing, the older system generated about 90 tokens per second per GPU, while the new system generates about 6,000 tokens per second per GPU, a roughly 65-fold increase in throughput.
Measured by tokens produced per megawatt of electricity, Hopper generates about 54,000 tokens per second per megawatt, while Blackwell generates about 2.8 million, an improvement of approximately 50 times. The article suggests that this metric is particularly critical for AI data centers amid rising electricity costs.
- GB300 NVL72: approximately 6,000 tokens per second per GPU
- Hopper HGX 200: approximately 90 tokens per second per GPU
- Token output per megawatt: approximately 50x increase
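The ratios behind these figures can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the rounded numbers reported in the article; the variable and function names are illustrative and do not come from SemiAnalysis.

```python
# Rounded figures as reported in the article; names are illustrative only.
hopper_tps_per_gpu = 90           # tokens per second per GPU (Hopper)
blackwell_tps_per_gpu = 6_000     # tokens per second per GPU (Blackwell)
hopper_tps_per_mw = 54_000        # tokens per second per megawatt (Hopper)
blackwell_tps_per_mw = 2_800_000  # tokens per second per megawatt (Blackwell)


def speedup(new, old):
    """Return how many times larger `new` is than `old`."""
    return new / old


per_gpu_gain = speedup(blackwell_tps_per_gpu, hopper_tps_per_gpu)
per_mw_gain = speedup(blackwell_tps_per_mw, hopper_tps_per_mw)

print(f"Per-GPU throughput gain: {per_gpu_gain:.1f}x")
print(f"Per-megawatt throughput gain: {per_mw_gain:.1f}x")
```

Because the inputs are rounded, the computed ratios land slightly above the headline multiples quoted in the article, which is consistent with those multiples being approximations.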
Cost reduced to 12 cents per million tokens
SemiAnalysis also compared the cost of generating 1 million tokens. The tests showed a cost of approximately $4.20 on the Hopper system and approximately $0.12 on the Blackwell system, a roughly 35-fold reduction.
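To put the per-million-token prices in perspective, the sketch below applies them to a workload of one billion tokens. The workload size is a hypothetical chosen purely for illustration, and the function name is not from the article; only the two per-million costs are reported figures.

```python
# Per-million-token costs as reported in the article (USD, rounded).
HOPPER_COST_PER_MILLION = 4.20
BLACKWELL_COST_PER_MILLION = 0.12


def workload_cost(tokens, cost_per_million):
    """Return the USD cost of generating `tokens` tokens at the given rate."""
    return tokens / 1_000_000 * cost_per_million


# Hypothetical workload for illustration only: one billion generated tokens.
tokens = 1_000_000_000

hopper_cost = workload_cost(tokens, HOPPER_COST_PER_MILLION)
blackwell_cost = workload_cost(tokens, BLACKWELL_COST_PER_MILLION)
reduction = HOPPER_COST_PER_MILLION / BLACKWELL_COST_PER_MILLION

print(f"Hopper:    ${hopper_cost:,.2f}")
print(f"Blackwell: ${blackwell_cost:,.2f}")
print(f"Cost reduction: {reduction:.0f}x")
```

At these rates, the same billion tokens would cost roughly $4,200 on Hopper and about $120 on Blackwell.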
Based on this analysis, the supply of low-cost tokens will continue to increase as more new models transition to the Blackwell platform for training and inference. Model providers are lowering prices not only due to competitive pressures but also because the underlying compute costs are themselves declining.
OpenAI CEO Sam Altman recently noted that AI costs have become a significant issue, and said the company will seek more ways to deliver greater value to users at lower cost.
Signs of price reductions have begun to appear
The article also cites Sil’s Token Spending Index, which stood at approximately 2.06 in late May and dropped to 1.75 by June 10. Carmen Li, CEO of Sil, believes this may indicate that token prices for multiple AI models have begun to decline.
The article suggests that if this trend continues, market focus on "token consumption" may shift from reducing usage to expanding scale at lower costs.
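The size of the index move cited above can be expressed as a percentage. This is a minimal sketch using the two readings given in the article; it assumes the index is a plain numeric level whose relative change is meaningful, which the article does not spell out, and the function name is illustrative.

```python
def percent_change(old, new):
    """Return the relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100


# Index readings as reported in the article.
late_may = 2.06
june_10 = 1.75

change = percent_change(late_may, june_10)
print(f"Index change: {change:.1f}%")
```

Under that assumption, the drop from 2.06 to 1.75 corresponds to a decline of about 15 percent.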

