ChainThink reports that on March 17, stablecoin issuer Tether announced the launch of QVAC Fabric, the world's first cross-platform LoRA fine-tuning framework for Microsoft's BitNet (1-bit LLM) architecture, enabling billion-parameter language models to be fine-tuned and run for inference on ordinary hardware, including laptops, consumer-grade GPUs, and smartphones.
The official statement indicates that the framework significantly reduces the memory and compute required to train AI models, with support for Intel and AMD CPUs, Apple Silicon, and mobile GPUs such as Qualcomm Adreno, ARM Mali, and the GPUs in Apple's Bionic chips.
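Tether has not published QVAC Fabric's internals in detail, but the general recipe for LoRA on a BitNet-style model is straightforward: the 1-bit (in practice ternary, roughly 1.58-bit) base weights stay frozen, and only small low-rank adapter matrices are trained in higher precision, which is what makes fine-tuning feasible on phones and laptops. A minimal PyTorch sketch of the idea, with all names hypothetical and a float ternary tensor standing in for the packed bit arrays a real BitNet kernel would use:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearLoRA(nn.Module):
    """Frozen ternary ("1-bit") linear layer plus a trainable LoRA adapter.

    Illustrative sketch only: a real BitNet kernel packs weights into bit
    arrays and fuses dequantization into the matmul instead of materializing
    a float tensor of {-1, 0, +1} values.
    """

    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        w = torch.randn(out_features, in_features)
        # Frozen base weights quantized to {-1, 0, +1} with a per-tensor scale.
        self.register_buffer("w_ternary", torch.sign(w))
        self.register_buffer("scale", w.abs().mean())
        # The LoRA factors are the only trainable parameters; B starts at zero
        # so the adapter is a no-op before training begins.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = F.linear(x, self.w_ternary * self.scale)           # frozen 1-bit path
        delta = F.linear(F.linear(x, self.lora_a), self.lora_b)  # low-rank update
        return base + self.scaling * delta

# One fine-tuning step: gradients flow only into the small adapter matrices,
# so optimizer state scales with the adapter rank, not the full model size.
layer = BitLinearLoRA(512, 512)
opt = torch.optim.AdamW([layer.lora_a, layer.lora_b], lr=1e-3)
opt.zero_grad()
loss = layer(torch.randn(4, 512)).pow(2).mean()
loss.backward()
opt.step()
```

Because the frozen weights never receive gradients, the memory cost of fine-tuning grows with the adapter rank rather than the parameter count, which is consistent with the on-device results reported below.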
During testing, a BitNet model with approximately 125 million parameters was fine-tuned in about 10 minutes on a Samsung Galaxy S25; a 1-billion-parameter model took about 1 hour 18 minutes on the Galaxy S25 and about 1 hour 45 minutes on an iPhone 16. The team also reports successfully fine-tuning a 13-billion-parameter model on the iPhone 16.
In terms of performance, BitNet models achieve 2 to 11 times faster inference on mobile GPUs compared to CPUs. Additionally, tests show that BitNet-1B reduces GPU memory usage by up to 77.8% during inference and fine-tuning tasks compared to 16-bit models.
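The 77.8% figure is consistent with simple arithmetic: weights dominate memory at inference, and 1.58-bit weights occupy roughly a tenth the space of 16-bit ones, while activations, the KV cache, and (during fine-tuning) optimizer state remain in higher precision and dilute the overall saving. A rough back-of-envelope, purely illustrative:

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB for a model with n_params weights."""
    return n_params * bits_per_weight / 8 / 2**30

fp16 = weight_gib(1e9, 16)      # 16-bit baseline for a 1B model: ~1.86 GiB
bitnet = weight_gib(1e9, 1.58)  # ternary BitNet encoding: ~0.18 GiB
print(f"16-bit weights:   {fp16:.2f} GiB")
print(f"1.58-bit weights: {bitnet:.2f} GiB")
print(f"weight-only reduction: {1 - bitnet / fp16:.1%}")  # ~90%
```

The weight-only saving (around 90%) exceeds the measured 77.8% end-to-end reduction, which is what one would expect once the higher-precision buffers are counted.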
Tether CEO Paolo Ardoino stated that the technology aims to reduce reliance on large-scale cloud computing and specialized AI hardware, enabling AI models to be trained on local devices and laying the groundwork for paradigms such as decentralized AI and federated learning.
