Tether Open Sources Google's TurboQuant to Reduce AI Memory Use

CryptoBriefing
Summary
Tether has open-sourced a production-ready implementation of Google’s TurboQuant algorithm, which aims to cut AI memory use by up to 5x. The technology is now part of QVAC Fabric, Tether’s local AI engine. The release is intended to support advanced AI workloads on consumer devices while preserving model performance.

Tether’s AI Research Group has open-sourced a production-ready implementation of TurboQuant, the Google Research algorithm designed to dramatically reduce AI memory requirements, according to a Monday press release.

The technology is now part of QVAC Fabric, Tether’s local AI engine, and includes a complete quantization pipeline, framework integrations, documentation, and deployment profiles for real-world use cases.

The release targets memory consumption, one of the biggest barriers to running advanced AI on local devices. As AI assistants process longer conversations, larger files, and more complex tasks, their KV cache expands and can require substantial hardware resources.

According to researchers, TurboQuant reduces those memory demands by up to 5x while preserving model performance, making it easier to run capable AI systems on laptops, phones, consumer GPUs, and edge devices.
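To give a sense of scale, the sketch below estimates how a KV cache grows with context length and what an up-to-5x reduction would mean in practice. It uses the standard back-of-the-envelope formula for transformer KV caches (one key and one value vector per token, per layer, per KV head). The model dimensions, the 16-bit baseline, and the function name `kv_cache_bytes` are illustrative assumptions, not figures from Tether or Google, and the code does not implement TurboQuant itself.

```python
GIB = 1024 ** 3


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Approximate KV cache size in bytes.

    The leading factor of 2 accounts for storing both a key and a value
    vector for every token, in every layer, for every KV head.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape, chosen only for illustration.
cfg = dict(num_layers=32, num_kv_heads=8, head_dim=128)

for seq_len in (8_192, 32_768, 131_072):
    baseline = kv_cache_bytes(seq_len=seq_len, **cfg)  # 16-bit values
    compressed = baseline / 5  # the "up to 5x" figure quoted in the article
    print(f"{seq_len:>7} tokens: {baseline / GIB:5.1f} GiB at 16-bit, "
          f"about {compressed / GIB:4.1f} GiB at 5x compression")
```

Under these assumed dimensions the cache costs 128 KiB per token, so an 8,192-token context needs about 1 GiB and a 131,072-token context about 16 GiB. A 5x reduction would bring those to roughly 0.2 GiB and 3.2 GiB, which is the difference between exceeding and fitting within the memory of many consumer GPUs and laptops.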

“Google’s research showed that AI memory could be compressed far more efficiently than most people assumed. Our work brings that breakthrough into production software that developers, startups, and users can actually build with,” Tether CEO Paolo Ardoino commented on the release.

According to Ardoino, AI tools should be capable of processing long documents, retaining project context, supporting software development, and working with private data locally rather than routing every task through cloud infrastructure. He said TurboQuant helps make that possible by giving local AI systems greater memory capacity and contextual awareness.

“If long context AI only works inside the largest data centers, then AI will be shaped by whoever owns the most hardware. TurboQuant changes what local AI can do by making memory less of a wall,” he added.

Tether believes the technology can help shift more AI workloads away from centralized cloud services by enabling longer context windows and improved performance on local hardware.

Included in QVAC SDK 0.12.0, the release supports the company’s goal of building AI systems that operate closer to users through personal devices, local networks, and decentralized infrastructure.
