BitTorrent Launches BTTInferGrid, a Decentralized AI Inference Computing Network

As AI agents are increasingly deployed in enterprise workflows, automated production, and other complex autonomous execution scenarios, the global AI industry has officially transitioned from a “reactive” phase to a new era of “autonomous execution.” The core of industry competition has long moved beyond mere large model parameter comparisons to a focus on real-world execution capabilities, with strong logical reasoning serving as the foundational pillar enabling this transformation.

The paradigm shift in application scenarios has also fundamentally changed demand for upstream computing infrastructure: computing consumption is steadily shifting from model training to business inference, a trend that is irreversible. However, today’s dominant centralized computing systems suffer from high operational costs, weak scalability, and insufficient service stability when handling massive, frequent, and highly volatile inference requests, leaving the entire AI industry facing a bottleneck in computing supply.

On June 17, the longstanding decentralized distribution ecosystem BitTorrent launched its strategic product, BTTInferGrid, targeting the AI inference market and building a decentralized computing network. Leveraging a decentralized distributed architecture, the platform efficiently aggregates scattered, idle GPU computing resources from around the world, breaking down barriers between resource suppliers and AI developers. It offers open, easy-to-integrate AI inference computing services with on-chain verifiable results and flexible pay-per-use pricing.

Leveraging the advantages of decentralized technology, BTTInferGrid not only addresses the limitations of traditional centralized computing power in high-concurrency and fluctuating load scenarios but also achieves a transformative breakthrough on the computing power supply side, reshaping the entire computing ecosystem’s resource allocation and flow logic.

BTTInferGrid is a strategic product that BitTorrent developed by upgrading its existing BTFS service. It marks not only a key expansion of BitTorrent’s long-established decentralized resource scheduling capabilities from storage into the compute domain, but also a crucial step in its broader strategy to enter the decentralized AI sector.

The demand structure for computing power is shifting from "training" to "inference": BTTInferGrid reconstructs AI inference computing power supply in a decentralized manner.

BTTInferGrid aims to rebuild the computing power supply system through a decentralized model, addressing issues such as excessively high costs and insufficient supply of AI inference computing power. By reducing costs and improving efficiency, it enhances the inference performance of large models, providing the industry with high-performance, resilient, and cost-effective computing infrastructure.

If 2024 to 2025 was defined by the "thousand-model battle" and parameter arms races dominated by massive GPU clusters in the AI industry, then 2026 marks AI’s formal entry into the "inference era" of large-scale application explosion, driven by the widespread deployment of AI agents. AI inference is the critical step that transforms trained models into real-world applications, commercial value, and everyday services. In simple terms, training is "teaching AI to learn," while inference is "enabling AI to be used in practice"—for example, an autonomous vehicle recognizing a stop sign on a road it has never driven before is a classic inference task. Inference capability directly determines the user experience, operational costs, and commercial value of AI products.

The industry widely agrees that over 70% of computing power resources in the future will be dedicated to inference scenarios. Oracle previously predicted that the market size for inference computing power will eventually surpass that of training computing power. Chinese Academy of Engineering academician Zheng Weimin also noted that the vast majority of current computing power is consumed in daily interactions between users and large models. In terms of cost structure, human labor accounts for only 3% and data for 2% of large model inference costs, while computing power constitutes a staggering 95%. The computing power costs for leading applications are substantial: ChatGPT’s daily inference cost is approximately $700,000, and DeepSeek V3’s is about $87,000.

As AI compute demand shifts from centralized training by a few tech giants to millions of developers across industries for commercial inference scenarios, the criteria for evaluating underlying infrastructure have changed. During the training era, developers primarily focused on the scale and efficiency of centralized compute power. In the inference era, AI services directly serve millions of end users, generating trillions of daily interactions and massive compute consumption—shifting developers’ focus to cost per invocation, response speed, and service stability. Today, compute supply, invocation cost, and service availability have become the core metrics for evaluating AI infrastructure—and key determinants of whether AI applications can be successfully deployed.

However, as inference demand surges, the limitations of mainstream centralized computing systems are becoming increasingly apparent: GPU rental costs continue to rise, platform services frequently experience outages, and many AI applications have been forced to shut down due to prohibitive computing costs. These issues show up most clearly in the following three areas:

First, the lack of elasticity in compute resource scheduling makes it difficult to adapt to fluctuations in traffic demand, leading to a dilemma between cost and stability: although leading AI companies and cloud providers continue to invest heavily in compute infrastructure, inference demand grows rapidly and exhibits clear peak-and-valley patterns—requests can surge dozens of times during daytime business or marketing peaks, then plummet sharply at night. Centralized data centers lack the ability to dynamically allocate resources, making them ill-suited for such variability: configuring for peak demand results in prohibitively high depreciation costs during off-peak periods, while sizing for average demand leads to service outages during peaks, trapping operators in a catch-22 between “high cost” and “low stability.” Meanwhile, centralized compute incurs additional layers of expenses—including data center construction, power, operations and maintenance, and profit margins—ultimately driving up compute costs and severely limiting the experimentation space for small and mid-sized innovative teams. The market urgently needs a new solution that combines cost efficiency with elastic resource scheduling.
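The peak-versus-average dilemma described above can be made concrete with a small calculation. The following is a minimal sketch, not a model of any real data center: the hourly demand curve and the function name `evaluate_fixed_capacity` are illustrative assumptions chosen only to show how a fixed-size fleet trades utilization against unserved requests.

```python
# Hypothetical demand for one day, in GPUs needed per hour (illustrative values only):
# quiet night, business-hours ramp, midday peak, evening ramp-down.
hourly_demand = [2] * 8 + [20] * 4 + [40] * 4 + [20] * 4 + [4] * 4


def evaluate_fixed_capacity(demand, capacity):
    """Return (utilization, unserved_fraction) for a fleet of fixed size `capacity`."""
    # In each hour the fleet can serve at most `capacity` GPUs' worth of demand.
    served = sum(min(d, capacity) for d in demand)
    utilization = served / (capacity * len(demand))
    unserved_fraction = 1 - served / sum(demand)
    return utilization, unserved_fraction


peak = max(hourly_demand)
average = sum(hourly_demand) / len(hourly_demand)

# Sizing for the peak serves every request but leaves most capacity idle.
peak_util, peak_unserved = evaluate_fixed_capacity(hourly_demand, peak)
# Sizing for the average raises utilization but drops a large share of requests.
avg_util, avg_unserved = evaluate_fixed_capacity(hourly_demand, average)

print(f"peak-sized:    utilization {peak_util:.1%}, unserved {peak_unserved:.1%}")
print(f"average-sized: utilization {avg_util:.1%}, unserved {avg_unserved:.1%}")
```

With these assumed numbers, the peak-sized fleet runs at roughly 37% utilization, while the average-sized fleet leaves about 41% of demand unserved, which is the gap elastic scheduling is meant to close.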

Second, GPU rental prices continue to rise, and the high costs hinder innovation and deployment by small and medium-sized enterprises and developers: although open-source large models (such as Qwen and DeepSeek) have lowered the entry barrier for AI, deploying and running these models still relies on stable, affordable, and easily accessible inference computing power. In reality, GPU rental costs keep increasing. For example, the hourly rental price of the mainstream H100 GPU rose from $1.70 in October 2025 to $2.35 in March 2026, an increase of roughly 38% in under six months. These elevated costs deter many individual developers and SMEs with promising solutions, leaving them with models but no computing power to run them, and severely suppressing innovation and scalable growth in the AI industry.
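The size of the H100 price change quoted above can be checked directly. The two prices come from the text; the assumption of a single GPU rented around the clock for a 30-day month is added here purely for illustration.

```python
price_oct_2025 = 1.70  # USD per GPU-hour, as quoted in the text
price_mar_2026 = 2.35  # USD per GPU-hour, as quoted in the text

# Relative increase between the two quoted prices.
relative_increase = (price_mar_2026 - price_oct_2025) / price_oct_2025

# Assumption for illustration: one GPU rented 24 hours a day for a 30-day month.
hours_per_month = 24 * 30
extra_monthly_cost = (price_mar_2026 - price_oct_2025) * hours_per_month

print(f"relative increase: {relative_increase:.1%}")                    # 38.2%
print(f"extra cost per GPU per month: ${extra_monthly_cost:.2f}")       # $468.00
```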

Third, a vast amount of idle GPU resources worldwide remain underutilized, creating a severe mismatch between supply and demand: in stark contrast to the market’s acute shortage of computing power, massive quantities of idle high-performance GPU resources are stranded across personal devices, university laboratories, small data centers, and facilities left over from the crypto mining transition. Due to the lack of standardized access channels and efficient scheduling engines, these resources cannot enter the mainstream inference market, resulting in a paradoxical situation where demand faces “severe GPU shortages” while supply suffers from “idle computing power.” This presents significant potential for improved resource utilization and urgently requires resolution of the supply-demand imbalance.

In summary, the current AI inference computing power market faces three structural challenges: centralized supply cannot balance cost and elasticity, computing power rental costs continue to rise and suppress AI innovation, and vast amounts of idle GPU resources remain dormant and unutilized. Addressing these industry-wide challenges, BTTInferGrid leverages decentralized technology to deliver a novel solution to the mismatch between computing power supply and demand.

BTTInferGrid aims to efficiently connect globally dispersed idle GPU resources with a vast number of AI developers through a decentralized approach, breaking the monopoly and bottlenecks of centralized computing power. On one hand, the platform aggregates fragmented idle GPU resources to build an open and shared computing infrastructure; on the other, it connects supply and demand directly, removing the access barriers and pricing opacity inherent in traditional centralized models. Leveraging DePIN’s incentive and coordination mechanisms, BTTInferGrid aims to deliver cost-effective inference computing power on a continuing basis, addressing high computing costs and supply shortages at their source and unlocking the inference efficiency and commercial value of large models.

BTTInferGrid: Building a decentralized computing network for AI inference scenarios, redefining computing resource allocation with three key advantages

BTTInferGrid is clearly and precisely positioned to build a decentralized computing network tailored for AI inference scenarios, connecting global idle GPU supply with AI inference demand, and providing a global AI computing service platform featuring open access, verifiable results, and pay-as-you-go pricing.

Specifically, BTTInferGrid leverages the DePIN underlying network mechanism to precisely match computing power supply with the explosive growth in AI inference demand, enabling bidirectional value creation for both sides of the supply and demand equation:

On the supply side of computing power, BTTInferGrid aggregates fragmented idle GPU resources from around the world to build an open and shared computing infrastructure. Leveraging DePIN’s incentive and intelligent scheduling mechanisms, it gives computing power holders a low-barrier, sustainable revenue channel, turning globally idle "sleeping GPUs" into liquid assets. It also provides stable computing power and elastic scalability, supporting a cost-effective, highly extensible, and secure global inference service.

On the demand side of computing power, BTTInferGrid provides a global inference service that is easy to integrate, enables on-chain verification of results, and charges based on usage—targeting AI developers worldwide. Compared to the high premiums charged by centralized cloud providers, BTTInferGrid offers superior cost efficiency and elastic scalability, helping small and medium-sized tech startups and independent developers reduce business experimentation costs, efficiently validate products, and iterate rapidly—while simultaneously empowering the upstream computing power supply ecosystem.

Thus, BTTInferGrid effectively addresses AI developers' urgent need for low-cost, highly elastic computing power during the "application competition" phase, while also creating a sustainable channel for monetizing vast amounts of globally underutilized hardware resources.

More importantly, the BTTInferGrid platform is designed to establish a self-sustaining growth flywheel: as idle GPU nodes continue to scale up, the cost of inference computing power falls and more developers join; rising market demand then further incentivizes global computing power providers to participate in the ecosystem. BTTInferGrid redefines computing power supply through a decentralized model, transforming scarce and expensive dedicated AI computing power into an accessible, on-demand public infrastructure layer for AI.

In terms of product performance advantages, most decentralized GPU platforms currently on the market face common issues such as high barriers to computing power integration, insufficient service reliability, and unsustainable economic models. BTTInferGrid addresses these challenges at the architectural level, achieving comprehensive breakthroughs in three key areas—computing power aggregation, service verification, and economic sustainability—thereby establishing a unique core competitive advantage, as outlined below:

1. An open-access computing power supply network that rapidly aggregates global idle GPU resources: Traditional cloud computing has high entry barriers (e.g., requiring compliant data centers, fixed public IP addresses, and expensive switches). BTTInferGrid, however, builds a truly open-access computing power supply network, enabling any entity or individual with idle GPU or other computing resources to seamlessly join, provided they meet basic performance criteria (such as VRAM capacity and computational benchmarks) and network stability requirements. This design significantly lowers the barrier to entry for computing power suppliers, enabling the rapid networked and matrixed aggregation of idle GPU resources worldwide.

2. Verifiable Service Quality and Node Behavior: Solving the Trust Challenge in Decentralization. The biggest pain point in decentralized computing is trust: how can the network prevent miners from passing off low-end GPUs as high-performance ones, and how can it ensure the authenticity and reliability of inference results? BTTInferGrid establishes a cross-verifiable closed loop by integrating task scheduling (intelligent distribution), challenge verification (cryptographic sampling), consensus scoring (dynamic reputation scores), and on-chain coordination (smart contract-based rewards and penalties), significantly enhancing the trustworthiness of inference services.

3. Demand-Driven Economic Model for a Sustainable Ecosystem: Early DePIN projects often fall into a death spiral—offering high token emissions to attract nodes into mindless mining, only to face token inflation, price crashes, and node departures due to a lack of real demand. From its inception, BTTInferGrid has been designed to build an economic ecosystem driven by genuine demand—using actual inference requests and node performance as the core basis for incentives. Only when AI developers genuinely pay to invoke models do compute providers earn their primary revenue share and reputation bonuses. This design strongly promotes healthy, balanced growth between supply and market demand, ensuring the long-term health and sustainability of the network ecosystem.
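The challenge-verification and reputation-scoring loop described in the second point above can be sketched in a few lines. This is a hedged illustration, not BTTInferGrid's actual protocol: the `Validator` class, the SHA-256 comparison of outputs, the exponential moving average, and the starting score of 0.5 are all assumptions. It also assumes deterministic inference output, which real model serving does not always guarantee.

```python
import hashlib
import random


def digest(text: str) -> str:
    """Hash an inference output so results can be compared without storing them."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class Validator:
    """Hypothetical validator: samples challenges and keeps a reputation per miner."""

    def __init__(self, challenges, sample_size=2, alpha=0.3, seed=None):
        self.challenges = challenges    # prompt -> digest of the reference output
        self.sample_size = sample_size  # challenges drawn per audit
        self.alpha = alpha              # weight of the newest audit in the score
        self.rng = random.Random(seed)
        self.reputation = {}            # miner id -> score in [0, 1]

    def audit(self, miner_id, run_inference):
        # Randomly sample prompts so a miner cannot predict what will be checked.
        prompts = self.rng.sample(sorted(self.challenges), self.sample_size)
        passed = sum(digest(run_inference(p)) == self.challenges[p] for p in prompts)
        pass_rate = passed / len(prompts)
        # Exponential moving average: recent behaviour matters, history is not erased.
        old = self.reputation.get(miner_id, 0.5)
        self.reputation[miner_id] = (1 - self.alpha) * old + self.alpha * pass_rate
        return self.reputation[miner_id]


def reference_model(prompt):
    """Stand-in for a deterministic reference model."""
    return prompt.upper()


challenges = {p: digest(reference_model(p)) for p in ["alpha", "beta", "gamma", "delta"]}
validator = Validator(challenges, seed=7)

honest_score = validator.audit("honest-miner", reference_model)
cheater_score = validator.audit("cheating-miner", lambda prompt: "cached junk")
print(honest_score, cheater_score)
```

In this sketch an honest miner's score rises from the assumed starting value toward 1, while a miner returning fabricated output drifts toward 0; an on-chain contract could then read such scores when applying rewards or penalties.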

In summary, BTTInferGrid is redefining the allocation of computing power across three dimensions: resource aggregation, service trustworthiness, and value distribution. Its open supply grid breaks traditional entry barriers and integrates any idle GPU worldwide that meets performance standards. Its verifiable trust framework is built on task scheduling, challenge verification, consensus scoring, and on-chain incentives and penalties. Its demand-driven economic model aims to eliminate speculative bubbles by anchoring incentives to real AI inference requests.

BTTInferGrid will gradually build a new computing power ecosystem driven by real demand.

BTTInferGrid is not merely a simple "compute aggregation," but a sophisticated decentralized compute network integrating AI inference task scheduling and execution, intelligent matching and connection of compute supply and demand, and on-chain resource coordination and settlement.

In BTTInferGrid's decentralized computing power ecosystem, all participants form three core roles centered around the "supply, usage, and verification" of computing power:

· Compute Providers (Miners): Contribute idle GPU resources to receive and execute AI inference tasks; the system automatically distributes rewards based on verified actual work done, task completion quality, and dynamic performance scores.

· Compute Consumers (AI Developers): Access globally distributed GPU resources through the standardized, unified API service interfaces that BTTInferGrid provides.

· Network Guardians (Validators): Participate in a decentralized verification and rating system, auditing and randomly challenging miner nodes’ computational performance to detect anomalous behavior and maintain network service quality. Meanwhile, validators earn rewards for upholding network integrity, collectively ensuring the network’s fairness and trustworthiness.
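How value might flow among these three roles can be illustrated with a simple settlement function. This is only a sketch under stated assumptions: the function name `distribute_rewards`, the 10% validator share, and the weighting of verified work by reputation are hypothetical and are not taken from any published BTTInferGrid specification.

```python
def distribute_rewards(fee_pool, miners, validator_share=0.1):
    """Split fees paid by developers among miners and validators.

    fee_pool: total fees paid for inference in the settlement period.
    miners:   miner id -> (verified_work_units, reputation in [0, 1]).
    Returns (payout per miner, amount reserved for validators).
    """
    validator_pool = fee_pool * validator_share
    miner_pool = fee_pool - validator_pool
    # Weight each miner by verified work scaled by its reputation score.
    weights = {mid: work * rep for mid, (work, rep) in miners.items()}
    total_weight = sum(weights.values())
    if total_weight == 0:
        # No verified work in this period: nothing is paid out to miners.
        return {mid: 0.0 for mid in miners}, validator_pool
    payouts = {mid: miner_pool * w / total_weight for mid, w in weights.items()}
    return payouts, validator_pool


# Hypothetical settlement period: 100 units of fees paid by developers.
miners = {
    "miner-a": (60, 1.0),  # most work, perfect reputation
    "miner-b": (40, 0.5),  # less work, weaker reputation
    "miner-c": (0, 0.9),   # online but served no paid requests
}
payouts, validator_pool = distribute_rewards(100.0, miners)
print(payouts, validator_pool)
```

Because payouts depend on the fee pool and on verified work, a node that serves no paid requests earns nothing in this sketch, which mirrors the demand-driven incentive idea described earlier in the article.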

In summary, BTTInferGrid provides AI developers with a more cost-effective, highly scalable, and secure AI inference service, effectively alleviating product disruptions and customer churn caused by insufficient computing power. For GPU providers, it unlocks global edge and underutilized hardware resources, establishing a sustainable revenue stream that ensures every unit of computing power realizes its full value in the inference era.

In terms of product implementation, unlike traditional centralized cloud providers that follow a capital-intensive model of “building hardware first and waiting for demand,” DePIN inherently faces dual coordination challenges from the outset: oversupply leads to node idleness and collapse of the token economy, while undersupply harms developer experience and system efficiency. To address this, BTTInferGrid has adopted a clear, robust, and demand-driven phased launch strategy, rejecting unstructured, rapid growth in favor of prioritizing resource utilization, economic sustainability, and steady expansion of the technical architecture.

· Short-term goal (2026): Initiate network cold start, complete integration of core underlying nodes and validation of distributed inference services, gradually expanding the GPU node scale.

· Medium-term goal (2027): Diversify the ecosystem, enhance the stability and privacy security of network services, and support additional AI model formats and inference frameworks, gradually expanding into use cases such as model fine-tuning.

· Long-term goal (2028 and beyond): Become an AI-native foundational infrastructure, building the preferred compute layer for AI agents and automated applications, providing elastic computing power to support large-scale AI use cases, and ultimately enabling seamless collaboration among computing power, distributed storage, and on-chain smart contracts within a unified architecture.

In terms of implementation, BTTInferGrid also adopts a phased evolution strategy. In the initial launch phase, the network will rely primarily on professional GPUs: miners on the supply side will require approval to join, while users on the demand side can access inference services through the platform. In the future, it will evolve into a fully open supercomputing grid. On the supply side, it will support the onboarding of various GPU types, including consumer-grade, professional-grade, and data center-grade, with pricing tiered by performance; mining participation will be open to all, with a staking mechanism introduced to ensure service quality. On the demand side, a unified API will be made available, compatible with multiple AI model formats and inference frameworks and offering flexible deployment options.
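Performance-tiered pricing of the kind mentioned above could take many forms. The sketch below shows one simple possibility, a lookup from a benchmark score to a tier and a unit price. The tier thresholds, the prices, and the names `TIERS`, `classify`, and `quote` are invented for illustration and do not reflect actual BTTInferGrid pricing.

```python
# Hypothetical tiers, highest first: (minimum benchmark score, tier name, price per 1K tokens).
TIERS = [
    (900, "data-center", 0.30),
    (500, "professional", 0.20),
    (0, "consumer", 0.10),
]


def classify(benchmark_score):
    """Return (tier name, unit price) for a GPU with the given benchmark score."""
    for min_score, name, price in TIERS:
        if benchmark_score >= min_score:
            return name, price
    raise ValueError("benchmark score must be non-negative")


def quote(benchmark_score, tokens):
    """Price a request of `tokens` tokens on a GPU of the given benchmark score."""
    name, price = classify(benchmark_score)
    return name, price * tokens / 1000


print(quote(950, 5000))
print(quote(600, 5000))
print(quote(120, 5000))
```

A table-driven design like this keeps tier boundaries in data rather than code, so thresholds and prices could be adjusted without changing the matching logic.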

Currently, BTTInferGrid has integrated several leading open-source large AI models, including Alibaba Cloud’s Qwen3.6 27B and Qwen2.5 7B Instruct, as well as Meta’s Llama 3.1 8B Instruct. AI developers can invoke these models according to their specific business needs. In the future, the platform will continue to expand its model ecosystem to give developers access to more cutting-edge models.

More importantly, BTTInferGrid is backed by the long-standing expertise of BitTorrent and BTFS, giving it inherent competitive advantages. BitTorrent and its BTFS platform have spent years working deeply in decentralized storage, with BitTorrent alone reporting over 100 million active users and 2 billion installations. This track record has validated the feasibility of the DePIN model and built mature capabilities in resource onboarding, token incentives, on-chain settlement, and community operations. As a strategic product in BitTorrent’s AI initiative, BTTInferGrid is built upon the upgraded BTFS infrastructure, allowing these proven capabilities to carry over into the AI inference computing domain and accelerating ecosystem growth.

Leveraging decentralized technology, BTTInferGrid precisely resolves the industry dilemma of coexisting idle and scarce computing power. Its principles of open access, decentralized collaboration, verifiable contributions, and community co-construction not only break through the monopoly of traditional centralized computing power but, through clear product positioning and a solid technical foundation, paint an imaginative vision of a new decentralized global computing landscape. Here, every idle unit of computing power is activated, and every developer can access the intelligent future at an affordable cost.