Chinese telecom operators launch token subscription services for AI models

Article | Silicon Quadrant

When users no longer agonize over whether to upgrade their data plan each month, they might start wondering how many Token services to buy each month.

Tokens will be packaged by operators as standardized services, just like data plans, broadband, and text messages.

Recently, China's three major telecom operators have successively launched Token package products: offering monthly subscription-based Token plans for individual users, and tiered computing power packages for developers and enterprise customers, announcing that dozens to hundreds of large models have been integrated into the platform, with “monthly purchase, multi-model access, and payment via phone bill.”

China Telecom has launched personal and business Token packages, starting at 9.9 yuan per month for 10 million Tokens; local operators such as Shanghai Mobile and Shanghai Telecom have introduced billing models based on quota points or generic Tokens, with Shanghai Mobile offering 400,000 Tokens for 1 yuan.

As operators begin offering Token services, the cost for users to switch between large models will drop significantly, meaning that for large model companies,“user loyalty” will be weakened, and only by “competing harder” can they retain their market share.

In the future, large model providers such as Doubao, Qwen, and DeepSeek will not only compete on “price” and “token quality per unit of energy consumption,” but also on “the capability to deliver higher-value AI application solutions.”

01 What is a Token Service?

To understand Token services, first understand what a Token is.

Computers cannot directly recognize text; they can only recognize 0s and 1s. Therefore, every word, character, voice input, and punctuation mark we enter is converted into 0s and 1s through a specific encoding mechanism.

In the context of large models, digital encodings are first identified, and the number of digits for each character's encoding varies slightly.

Token is the smallest unit of computation used by large models to process information. User inputs, contextual memory, and model outputs are all measured in tokens. The more complex the model invocation, the longer the context, and the deeper the Agent execution chain, the higher the token consumption.

Typically: In English, one token is approximately equivalent to four letters; in Chinese, due to the higher information density of characters, one Chinese character, one punctuation mark, or one phrase often corresponds to one to two tokens.

Since large models process and generate output token by token, the industry sells and bills users for model invocation costs and usage quotas in terms of "per million tokens" or "quota points."

Large model companies currently implement tiered pricing for tokens: ordinary users can use basic modes of large models like Doubao and Qwen for free, while enterprise users with heavy usage can purchase monthly API packages or pay-as-you-go services at different tiers.

Since last year, telecom operators have launched "AI compute supermarkets." Model providers act as "merchant tenants," and operators charge "platform fees + compute fees + channel fees." Users are not buying "operator models," but rather: on the telecom platform, using telecom compute resources to invoke any large model, billed by the token.

In July 2025, China Mobile launched its model service platform, MoMA (Mobile Model Access); in April, China Telecom released the Star TokenHub operations platform; in May, China Unicom launched the “LianTong Xingluo” Token service platform. These platforms have integrated major large models from Baidu, Alibaba, ByteDance, DeepSeek, and others, providing unified APIs, authentication, and billing.

The operator platform supports multiple large models internally; users can seamlessly switch between them by simply changing the model name (Model ID).

02 Why do operators sell Tokens?

The surge in token services is no accident.

First, the billing model has changed. In the traditional cloud computing era, users were accustomed to paying for “server rental time” or “fixed bandwidth” (i.e., paying for compute power at the IaaS layer), purchasing bandwidth speed and time. However, with the advancement of large models, the capabilities offered by different models and the costs required for different tasks vary significantly. For instance, stronger models cost more per token; longer contexts consume more tokens; higher inference complexity leads to greater actual costs. Billing by token aligns the “level of intelligence consumed by the user” with the “computational cost incurred by the provider.”

Second, reduce technical barriers and "cost of experimentation." Developing and deploying large models often requires investments of tens of millions or even hundreds of millions of dollars. For the vast majority of small and medium-sized enterprises and individual developers, building their own models is not feasible. Token services break down and package "Artificial General Intelligence (AGI)" capabilities, allowing developers to simply call APIs and pay token fees without needing to concern themselves with the tens of thousands of GPUs running in the background.

Finally, the urgent demand driven by the explosion at the application layer. Entering 2026, application-layer scenarios such as AI Agents, AI-assisted programming, and multimodal content generation are surging. These applications require frequent "throughput" interactions with underlying large models during daily operations. An automated AI coding tool may consume millions of tokens in a single night. This high-frequency, massive-scale interaction is compelling the market to offer more standardized, stable, and price-competitive token package services.

Over the past two decades, telecom operators' business models have undergone three core unit-of-measurement shifts.

The first stage was the voice era, where carriers sold minutes; the second stage was the mobile internet era, where they sold data in GB; and now, entering the AI era, carriers are beginning to experiment with selling tokens.

Tokens are undergoing an evolution similar to that of traffic. Initially, they were merely technical indicators; then they became units of billing; ultimately evolving into standardized commodities.

The entry of telecom operators signifies that tokens have begun to move beyond the realm of technology and into the consumer ecosystem.

Over the coming years, the way users acquire AI capabilities may undergo a fundamental shift: individual users will purchase "AI monthly plans," enterprises will procure "Token resource pools," home broadband will include AI credits, and government and enterprise dedicated lines will integrate Agent services. Tokens will become a foundational resource, much like electricity, water, or data.

But this does not mean that operators will replace large model manufacturers.

03 How should I buy tokens?

Are token services sold directly by native large model providers, or purchased from operator platforms? What are the current advantages and disadvantages of each business model?

The first is the native model provider model, billed per million tokens. Providers such as OpenAI, Anthropic, DeepSeek, and Qwen commonly use this system. Users pay separately for input tokens and output tokens. Providers like Qwen may use a model of pre-purchasing at the beginning of the month and settling at the end.

The second option is a monthly subscription plan for Token credits offered by telecom operators. For example, Shanghai Telecom offers a plan starting at 9.9 RMB for 10 million Tokens, with additional credits available for overage. They plan to integrate Token benefits into the family’s “Beautiful Home” digital space, enabling one-click payment via phone bill.

This "all-in-one pricing" or "service bundle" model allows Chinese users to purchase large model computing power just like buying data packages.

Overseas markets primarily rely on tiered pricing APIs from native large model companies, while the domestic market has pushed token services into a "package-based" era similar to mobile phone plans.

Currently, both fee models have their advantages, as the Token package user base can be divided into three main types.

First are independent developers and tech enthusiasts (Geeks). They use API interfaces provided by various vendors to build their own personalized AI applications, such as productivity tools, automated translation plugins, and personal knowledge bases.

The second category consists of small and medium-sized enterprises, startups, and B2B independent software vendors (ISVs), who constitute the core customer base for Token services. Whether purchasing Tokens for employees to use in programming, developing AI agents tailored to specific industries, or integrating AI-assisted features into existing enterprise ERP or CRM systems, SMEs need to subscribe to "team-tier Token packages" offered by cloud providers or telecom operators.

The third group consists of professionals and ordinary households with heavy reliance on AI, who need to use AI frequently at home for tasks such as copywriting, coding, or AI-assisted tutoring for their children’s studies.

From a technical economics perspective, the native large model's pure token-based pricing model is more scientifically suited for small and medium-sized enterprises and startups.

The operator’s subscription model offers two advantages: on one hand, independent developers are not locked into a single large model and can freely choose among multiple large models through the platform; on the other hand, token services may quickly become accessible to mainstream consumers, as most people understand what 100 GB of data means, but cannot easily grasp what 10 million tokens represent.

The operator uses a monthly subscription model, which essentially lowers the cognitive barrier. Users don’t need to understand tokens—they can simply start by exploring their needs with a basic plan of 9.9 yuan for 10 million tokens.

As carriers begin offering Token services, "Doubaos" are about to enter a three-tier competition.

From chasing parameters to chasing energy efficiency: For large model companies, it is no longer viable to blindly pursue larger model parameters and higher energy consumption; instead, efforts should be focused on techniques such as model distillation, quantization, and inference optimization—capabilities that deliver higher-quality tokens with lower energy usage.

Price competition will intensify further. After operators aggregate hundreds of models, the cost for users to switch decreases. If Model A raises its price, users can replace it with Model B through the platform. When model capabilities are not significantly different, price becomes the key competitive factor.

The profit center for large model companies will shift. Selling APIs alone offers limited profits; future revenue focus may shift toward agents, industry-specific applications, and enterprise solutions. The models themselves are gradually becoming infrastructure, while the application layer becomes the center of value.

Perhaps a "bilateral market" is emerging: operators control the entry points, while model providers control the capabilities.