Surya OCR 2 scores 83.3% on olmOCR-bench with 650 million parameters, ranking first among models under 3 billion parameters.

Datalab released Surya OCR 2 on May 28 (UTC+8). The model scores 83.3% on olmOCR-bench with 650 million parameters, outperforming the original 9-billion-parameter version. It supports 91 languages and handles layout analysis, text recognition, and table detection in a single VLM. It processes 5.35 pages per second on an RTX 5090 and can run fully locally on M1 devices. The code is open-sourced under Apache 2.0, and the weights are free for individuals, academic institutions, and startups with annual revenue under $5 million. Datalab also offers a paid API for its 4-billion-parameter Chandra 2 model, with a $5 trial credit.

ME News reports that on May 28 (UTC+8), according to monitoring by Beating, the open-source document intelligence platform Datalab officially released Surya OCR 2, a new open-source multilingual OCR model. With only 650 million parameters, the model scores 83.3% on the document intelligence benchmark olmOCR-bench. That ranks it first among models under 3 billion parameters and ahead of the original 9-billion-parameter version, which is approximately 14 times larger, giving it a Pareto-optimal trade-off between parameter count and accuracy.

Functionally, Surya OCR 2 integrates three core tasks (layout analysis, text recognition, and table detection) into a single vision-language model (VLM), while text line detection and OCR error detection continue to run as separate lightweight models. Users can complete full-page OCR with a single model call, which outputs structured HTML containing bounding boxes and reading order. Mathematical formulas are rendered with HTML math tags, and tables with multiple rows and columns are formatted as standard HTML.

For multilingual support, the new model achieves an overall pass rate of 87.2% across 91 languages (82.5% for Chinese) and has been specifically optimized for damaged documents and handwritten text.

For deployment, Surya OCR 2 supports two inference backends. On NVIDIA GPU systems running Docker with the vLLM backend, a single RTX 5090 GPU reaches a throughput of 5.35 pages per second. On Apple devices or CPU-only environments, it loads GGUF-format weights via llama.cpp, which allows fully local, on-device inference on M1 computers.

The source code for the new model is open-sourced under the Apache 2.0 license, and the weights are freely available under the OpenRAIL-M license for individuals, academic institutions, and startups with annual revenues under $5 million.
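
The size and throughput figures reported above can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in this report; it does not call Surya OCR 2 or any Datalab API, and the 10,000-page batch is a hypothetical workload chosen for illustration.

```python
# Figures as quoted in the report; nothing here invokes the model itself.
SURYA_OCR_2_PARAMS = 650_000_000   # "650 million parameters"
ORIGINAL_PARAMS = 9_000_000_000    # "9-billion-parameter version"
PAGES_PER_SECOND = 5.35            # single RTX 5090 with the vLLM backend


def size_ratio(larger: int, smaller: int) -> float:
    """Return how many times larger one parameter count is than another."""
    return larger / smaller


def seconds_for_pages(pages: int, pages_per_second: float) -> float:
    """Return the time needed to process `pages` at a sustained throughput."""
    return pages / pages_per_second


ratio = size_ratio(ORIGINAL_PARAMS, SURYA_OCR_2_PARAMS)
pages_per_hour = PAGES_PER_SECOND * 3600
# Hypothetical batch size, used only to make the throughput figure concrete.
batch_seconds = seconds_for_pages(10_000, PAGES_PER_SECOND)

print(f"size ratio: {ratio:.1f}x")
print(f"pages per hour: {pages_per_hour:,.0f}")
print(f"10,000 pages: {batch_seconds / 60:.1f} minutes")
```

The ratio comes out just under 14, which matches the "approximately 14 times larger" wording. The time estimate assumes the quoted throughput is sustained for the whole batch and ignores model loading and I/O, so real runs would likely take longer.
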
Datalab also launched a paid API at the same time, featuring the stronger 4-billion-parameter Chandra 2 model and a $5 trial credit. (Source: BlockBeats)
