Tether Launches AI Framework for Training Billion-Parameter Models on Mobile Devices


ChainThink reports that on March 17, 2026, stablecoin issuer Tether announced the launch of QVAC Fabric, the world’s first cross-platform LoRA fine-tuning framework for Microsoft BitNet (1-bit LLM), enabling billion-parameter language models to be trained and run on ordinary hardware, including laptops, consumer-grade GPUs, and smartphones.
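
For readers unfamiliar with the technique, LoRA (low-rank adaptation) freezes the base model's weights and trains only small low-rank correction matrices, which is what makes fine-tuning feasible on phone-class hardware. The sketch below illustrates the general idea in PyTorch; it is not Tether's implementation, and in QVAC Fabric the frozen base layer would be a quantized BitNet layer rather than a standard fp32 linear layer.

```python
# Minimal sketch of the LoRA idea (illustrative, not Tether's code): instead
# of updating a full weight matrix W, train a low-rank correction B @ A so
# that the effective weight becomes W + (alpha / r) * B @ A. Only A and B
# receive gradients.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int,
                 r: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen base weight; in QVAC Fabric this would be a quantized
        # BitNet layer rather than an fp32 nn.Linear.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # Trainable low-rank factors: A projects down to rank r, B back up.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base path plus the trainable low-rank update path.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(768, 768, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} of {total} ({100 * trainable / total:.1f}%)")
```

In this toy layer only about 2% of the parameters receive gradients, so gradient and optimizer-state memory shrink proportionally.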


According to the official statement, the framework significantly reduces the memory and computational requirements for training AI models, supporting Intel, AMD, and Apple Silicon processors as well as mobile GPUs such as Qualcomm Adreno, Arm Mali, and Apple Bionic chips.
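
The memory savings follow from BitNet's weight representation. The BitNet b1.58 paper describes an "absmean" scheme that quantizes each weight to one of {-1, 0, +1} with a single per-tensor scale; Tether has not published its kernels, so the snippet below sketches only that published scheme, not QVAC Fabric's internals.

```python
# Sketch of the absmean ternary quantization from the BitNet b1.58 paper:
# scale weights by their mean absolute value, then round and clip to
# {-1, 0, +1}. Illustrative only.
import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-6):
    gamma = w.abs().mean()                        # per-tensor scale
    w_q = (w / (gamma + eps)).round().clamp_(-1, 1)
    return w_q, gamma                             # dequantize as w_q * gamma

w = torch.randn(4, 4)
w_q, gamma = absmean_ternary(w)
print(w_q)          # entries are -1.0, 0.0, or 1.0
print(w_q * gamma)  # coarse reconstruction of w
```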


In testing, a BitNet model with approximately 125 million parameters was fine-tuned in about 10 minutes on a Samsung Galaxy S25; a 1-billion-parameter model took about 1 hour and 18 minutes on the same device and about 1 hour and 45 minutes on an iPhone 16. The team also reports successfully fine-tuning a 13-billion-parameter model on the iPhone 16.


In terms of performance, BitNet models achieve 2 to 11 times faster inference on mobile GPUs than on CPUs. Tests also show that BitNet-1B reduces GPU memory usage by up to 77.8% during inference and fine-tuning compared with 16-bit models.
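
As a back-of-envelope check (our own arithmetic, assuming ternary weights are packed at roughly 2 bits each), weight storage alone would shrink by 87.5%, so the reported 77.8% for total GPU memory is plausible once activations, optimizer state, and other buffers, which are not ternary, are counted in.

```python
# Rough arithmetic for a 1B-parameter model: 16-bit weights vs. ternary
# weights packed at 2 bits each (4 weights per byte). Illustrative only;
# the article's 77.8% figure measures total GPU memory, not just weights.
params = 1_000_000_000
fp16_bytes = params * 2            # 16 bits = 2 bytes per weight
ternary_bytes = params * 2 // 8    # 2 bits per weight, packed
print(f"fp16 weights:    {fp16_bytes / 1e9:.2f} GB")     # 2.00 GB
print(f"ternary weights: {ternary_bytes / 1e9:.2f} GB")  # 0.25 GB
print(f"weight-only reduction: {100 * (1 - ternary_bytes / fp16_bytes):.1f}%")
```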


Tether CEO Paolo Ardoino stated that the technology aims to reduce reliance on large-scale cloud computing and specialized AI hardware, enabling AI models to be trained on local devices and laying the groundwork for paradigms such as decentralized AI and federated learning.

Disclaimer: The information on this page may have been obtained from third parties and does not necessarily reflect the views or opinions of KuCoin. This content is provided for general informational purposes only, without any representation or warranty of any kind, nor shall it be construed as financial or investment advice. KuCoin shall not be liable for any errors or omissions, or for any outcomes resulting from the use of this information. Investments in digital assets can be risky. Please carefully evaluate the risks of a product and your risk tolerance based on your own financial circumstances. For more information, please refer to our Terms of Use and Risk Disclosure.