Tether Launches TurboQuant to Enable Larger AI Models on Devices

Coin Edition

Tether has launched TurboQuant in QVAC SDK 0.12.0, a tool that the company says cuts AI memory needs by up to five times. The update lets devices run larger AI models locally, supporting longer conversations, bigger files, and more complex code projects. The release also adds text-to-video generation, robot control, coding assistant support, and faster image classification.
  • Tether’s TurboQuant cuts AI memory use by up to 5x, helping devices handle longer tasks locally.
  • QVAC 0.12.0 lets developers run larger AI workloads on laptops and phones with less memory strain.
  • TurboQuant tackles AI’s memory bottleneck, enabling longer chats, larger files, and bigger code projects.

Tether has added a new memory optimization tool to QVAC SDK 0.12.0, a move that could help laptops, smartphones, and other devices handle larger workloads locally. Announcing the update on X, CEO Paolo Ardoino said the release includes TurboQuant, a technology that reduces AI memory requirements by up to five times while maintaining nearly the same output quality.

The update focuses on a key limitation for large language models: memory. As conversations and tasks become longer, memory demands grow with them, because the model has to keep more context in working memory. TurboQuant reduces that burden, allowing devices to work with larger documents, longer conversations, and more information at once.

The release also adds text-to-video generation, robot control features, coding assistant support, voice processing upgrades, and faster image classification tools.

TurboQuant Targets AI’s Memory Bottleneck

TurboQuant sits at the center of the QVAC SDK 0.12.0 release. The technology compresses the key-value (KV) cache, a type of working memory that AI models use to keep track of conversations, documents, and other information during a session.

Memory demands rise as users feed more information into a model. Tether said a 4-billion-parameter model processing about 262,000 tokens can require roughly 8 GB of memory for the cache alone. Running several sessions at that scale can quickly exceed the limits of many laptops and consumer devices.
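To see how a figure of that size can arise, the short sketch below applies the standard KV cache estimate: two vectors (a key and a value) per token, per layer, per KV head. The layer count, head count, and head dimension are hypothetical values chosen for illustration, not the configuration of the model Tether measured, and the token count of 262,144 is an assumed reading of "about 262,000."

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_value=2):
    """Bytes needed to cache keys and values for num_tokens tokens."""
    # Factor of 2: one key vector and one value vector per token, per layer, per KV head.
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_value


# Hypothetical configuration (illustrative only, not a published model spec),
# stored in 16-bit floats (2 bytes per value).
cache = kv_cache_bytes(num_layers=32, num_kv_heads=2, head_dim=128, num_tokens=262_144)
cache_gib = cache / 1024**3
print(f"KV cache: {cache_gib:.1f} GiB")  # prints "KV cache: 8.0 GiB"
```

The estimate scales linearly with the number of tokens, so doubling the context length doubles the cache, and each additional concurrent session adds its own cache on top.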

TurboQuant aims to reduce that pressure. According to Tether, the technology can shrink KV cache memory requirements by up to five times while preserving nearly the same output quality. As a result, users can work with longer conversations, larger documents, and bigger codebases without relying as heavily on remote computing resources.
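The article does not describe how TurboQuant works internally. As a rough illustration of where a saving of about five times can come from, the sketch below uses plain uniform block quantization as a stand-in: each block of 16-bit cache values is stored as 3-bit codes plus a shared offset and scale. This is not TurboQuant's algorithm, and the bit width, block size, and 32 bits of per-block metadata are all assumptions.

```python
def quantize_block(values, bits=3):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize_block(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def effective_bits_per_value(bits, block_size, metadata_bits=32):
    """Code bits plus the per-block offset and scale, amortized over the block."""
    return bits + metadata_bits / block_size


# Toy block of cache values (made-up numbers).
block = [0.12, -0.5, 0.33, 0.9, -0.07, 0.41, -0.88, 0.0]
codes, lo, scale = quantize_block(block, bits=3)
restored = dequantize_block(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(block, restored))

# Assumed layout: 3-bit codes, 128 values per block, 16-bit offset + 16-bit scale.
bits_per_value = effective_bits_per_value(bits=3, block_size=128)
ratio = 16 / bits_per_value  # relative to a 16-bit cache
print(f"{bits_per_value} bits per value, about {ratio:.1f}x smaller")
```

Under these assumptions the cache shrinks by a little under five times, and the rounding error of each restored value is bounded by half the quantization step. Production methods aim to keep that error from affecting output quality, which is the property Tether claims for TurboQuant.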

QVAC Expands Beyond Language Models

The update includes more than memory improvements. QVAC SDK 0.12.0 adds several new tools aimed at expanding what developers can run on local devices.

Among the additions is support for text-to-video generation through the Wan2.1 model. The platform also introduces a vision-language-action feature that allows developers to build applications for robotic control.

The release further adds a lightweight image classification tool designed for tasks that do not require larger vision models. At the same time, QVAC moved its text-to-speech and transcription systems to its GGML engine, a change that broadens support across major desktop and mobile operating systems.

Developers also gained new options for coding assistants. QVAC now integrates with OpenCode and OpenClaw through a provider package that simplifies model management and deployment.

Open-Source AI Moves Closer to the Edge

The release reflects Tether’s push to run more computing tasks directly on users’ devices rather than relying entirely on centralized data centers. The company has increasingly concentrated on software that can operate across personal devices, local networks, and decentralized systems.

“Google’s research showed that AI memory could be compressed far more efficiently than most people assumed. Our work brings that breakthrough into production software that developers, startups, and users can actually build with,” said Ardoino.

He added, “People should be able to ask an AI assistant to read a long document, remember a project, help with code, or work through private information without every task being forced through a remote data center.”

The launch comes as Tether expands its efforts beyond memory optimization tools. Ardoino recently disclosed that the company is developing an open-source peer-to-peer search engine and shared a demonstration of a decentralized Wikipedia search system.

Disclaimer: The information presented in this article is for informational and educational purposes only. The article does not constitute financial advice or advice of any kind. Coin Edition is not responsible for any losses incurred as a result of the utilization of content, products, or services mentioned. Readers are advised to exercise caution before taking any action related to the company.