Sapient Open-Sources 1B-Parameter HRM-Text Model with 1,300x Lower Training Cost

AIMPACT Update, May 19 (UTC+8): According to monitoring by Beating, Sapient Intelligence has open-sourced HRM-Text, a 1-billion-parameter (1B) text generation foundation model. It is a purely pre-trained model built on the Hierarchical Reasoning Model (HRM) architecture. By introducing latent-space reasoning at the base layer of the architecture, it reduces pre-training compute cost by a factor of 130 to 600 compared with conventional models. Specifically, HRM-Text was pre-trained on only 40 billion (40B) structured tokens, approximately one-thousandth the data volume used for comparable standard models. Official benchmarks show that training the 1B version from scratch takes approximately 46 hours on two 8-GPU H100 servers, at a compute cost of about $1,472; the 0.6B version can be trained on a single node in 50 hours at a hardware cost of roughly $800. The complete engineering framework, including data extraction, sequence packing, and PyTorch distributed training, has been open-sourced alongside the model. The cost reduction comes from a dual-timescale recurrent design: the model contains two sets of Transformer modules, one fast (lower layer) and one slow (higher layer), that alternate iteratively on the same input batch and exchange information by adding their states. This design lets the model extend its computational depth dynamically by increasing the number of recurrence steps while the total number of physical parameters stays fixed. The steep drop in the pre-training barrier reopens the possibility of low-cost validation for model theories that were previously abandoned because of prohibitive compute costs. Note that only unaligned, purely pre-trained weights are being released; the model can only perform prefix continuation and cannot be used directly as a question-answering assistant. (Source: BlockBeats)
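The two training-cost figures quoted above can be cross-checked with simple arithmetic. The sketch below converts each run into GPU-hours and derives the implied price per GPU-hour. The helper names are illustrative, and the assumption that the "single node" used for the 0.6B run also has 8 GPUs is ours; the report does not state it.

```python
def gpu_hours(nodes: int, gpus_per_node: int, hours: float) -> float:
    """Total GPU-hours consumed by one training run."""
    return nodes * gpus_per_node * hours


def implied_rate(total_cost_usd: float, nodes: int, gpus_per_node: int, hours: float) -> float:
    """Price per GPU-hour implied by a quoted total cost."""
    return total_cost_usd / gpu_hours(nodes, gpus_per_node, hours)


# Figures as quoted in the report.
runs = {
    # 1B model: two 8-GPU H100 servers for about 46 hours, about $1,472.
    "1B": dict(total_cost_usd=1472, nodes=2, gpus_per_node=8, hours=46),
    # 0.6B model: one node for 50 hours, roughly $800.
    # ASSUMPTION: the single node also has 8 GPUs (not stated in the report).
    "0.6B": dict(total_cost_usd=800, nodes=1, gpus_per_node=8, hours=50),
}

for name, run in runs.items():
    hours_used = gpu_hours(run["nodes"], run["gpus_per_node"], run["hours"])
    rate = implied_rate(**run)
    print(f"{name}: {hours_used:.0f} GPU-hours, ${rate:.2f} per GPU-hour")
```

Under these assumptions the 1B run comes to 736 GPU-hours and the 0.6B run to 400 GPU-hours, and both quoted costs imply the same rate of $2.00 per H100 GPU-hour, so the two figures are at least consistent with each other.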