Odaily Planet Daily report: According to an official announcement, Tether has launched a cross-platform BitNet LoRA fine-tuning framework within QVAC Fabric, optimizing training and inference for Microsoft's BitNet (a 1-bit LLM architecture). By pairing 1-bit base weights with low-rank adapters, the framework sharply reduces compute and memory requirements, enabling billion-parameter models to be trained and fine-tuned on laptops, consumer-grade GPUs, and smartphones.
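For context on why this cuts costs so much: LoRA freezes the model's base weights (here, BitNet's 1-bit weights) and trains only small low-rank adapter matrices, so gradients and optimizer state exist only for a tiny fraction of the parameters. The PyTorch sketch below illustrates the general idea; it is purely illustrative, since QVAC Fabric's API has not been published, and all names in it are hypothetical.

```python
# Minimal LoRA sketch (illustrative only; not QVAC Fabric's actual API).
# The frozen base weight W stays fixed -- in BitNet it would be 1-bit --
# while two small trainable matrices A and B add a low-rank update B @ A.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16):
        super().__init__()
        # Frozen base projection: receives no gradients during fine-tuning.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # Trainable low-rank adapters: only rank * (in + out) parameters.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        # y = x W^T + scale * (x A^T B^T)
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(1024, 1024, rank=8)
loss = layer(torch.randn(2, 1024)).sum()
loss.backward()  # gradients flow only to the adapter matrices
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # 16,384 vs. 1,048,576 frozen
```

In this toy layer the adapters account for under 2% of the parameters, which is the mechanism that makes fine-tuning feasible on the consumer hardware the announcement describes.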
This solution is the first to enable fine-tuning of BitNet models on mobile GPUs (including Adreno, Mali, and Apple Bionic). Testing shows that a 125M-parameter model can be fine-tuned in approximately 10 minutes and a 1B-parameter model in about an hour, and the approach scales even to 13B-parameter models on mobile devices.
In addition, the framework supports heterogeneous hardware, including Intel, AMD, and Apple Silicon, enabling 1-bit LLM LoRA fine-tuning on non-NVIDIA devices for the first time. On performance, BitNet models achieve 2 to 11 times faster inference on mobile GPUs than on CPUs, while cutting memory usage by up to 77.8% relative to conventional 16-bit models.
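The memory claim can be sanity-checked with back-of-the-envelope arithmetic. Only the 77.8% headline figure comes from the announcement; the per-weight bit widths below use Microsoft's published BitNet b1.58 format (~1.58 bits per ternary weight), and the explanation of the gap is an assumption, not a stated fact.

```python
# Rough weight-memory arithmetic for a 1B-parameter model (illustrative).
params = 1_000_000_000

fp16_gb = params * 2 / 1e9           # 16-bit: 2 bytes per weight
bitnet_gb = params * 1.58 / 8 / 1e9  # BitNet b1.58: ~1.58 bits per weight

print(f"fp16 weights:   {fp16_gb:.2f} GB")    # ~2.00 GB
print(f"BitNet weights: {bitnet_gb:.2f} GB")  # ~0.20 GB

# The reported "up to 77.8%" reduction leaves ~22% of the 16-bit footprint,
# i.e. roughly 0.44 GB here -- plausibly because activations, adapters, and
# some layers stay in higher precision (assumption, not from the source).
reported_gb = fp16_gb * (1 - 0.778)
print(f"reported footprint: {reported_gb:.2f} GB")
```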
Tether stated that the technology has the potential to reduce reliance on high-end compute and cloud infrastructure, promoting decentralized, on-device AI training and laying the groundwork for new use cases such as federated learning.
