Mistral AI Launches OCR 4 with 72% Win Rate in Blind Tests and 170 Language Support

CryptoBriefing

Mistral AI just dropped the fourth generation of its optical character recognition model, and the numbers suggest the French AI lab is quietly building one of the most capable document processing tools on the market.

Mistral OCR 4, unveiled on June 23, scored a 72% average win rate in human preference evaluations against competing OCR systems. The tests covered more than 600 real-world documents in more than 12 languages.

What makes OCR 4 different

Mistral OCR 4 topped the public OlmOCRBench leaderboard, an independent benchmark of how well models handle real-world document extraction, with a score of 85.20.

The model supports 170 languages across 10 language groups, with particular strength in rare and low-resource languages.

Beyond raw text extraction, OCR 4 introduces several features aimed at making its output actually useful for downstream applications. Paragraph-level bounding boxes tell you exactly where on a page each block of text lives. Typed block labels classify content into categories like titles, tables, and equations. Per-word and per-page confidence scores let developers programmatically flag sections that might need human review.
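To make the confidence-score idea concrete, here is a minimal Python sketch of how a developer might flag pages for human review. The `Word` and `Page` shapes, the field names, the 0-to-1 score range, and both thresholds are assumptions made for illustration; the actual OCR 4 response schema may look different.

```python
from dataclasses import dataclass

# Hypothetical data shapes: the real OCR 4 response schema may differ.
@dataclass
class Word:
    text: str
    confidence: float  # assumed to lie in [0.0, 1.0]

@dataclass
class Page:
    index: int
    confidence: float  # assumed per-page score in [0.0, 1.0]
    words: list[Word]

def flag_for_review(pages, page_threshold=0.90, word_threshold=0.80):
    """Map each page needing review to its list of low-confidence words."""
    flagged = {}
    for page in pages:
        low_words = [w.text for w in page.words if w.confidence < word_threshold]
        # Flag the page if its overall score is low or any single word is doubtful.
        if page.confidence < page_threshold or low_words:
            flagged[page.index] = low_words
    return flagged

# Made-up sample data, not real model output.
pages = [
    Page(0, 0.98, [Word("Invoice", 0.99), Word("Total", 0.97)]),
    Page(1, 0.85, [Word("Arnount", 0.62), Word("due", 0.95)]),
]
print(flag_for_review(pages))
```

The thresholds here are arbitrary placeholders; in practice they would be tuned against a sample of documents that humans have already checked.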

The output comes in markdown-structured text, which slots neatly into the retrieval-augmented generation (RAG) pipelines that enterprises are building to let AI agents search and reason over their internal documents.
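As a rough illustration of why markdown output suits RAG pipelines, the sketch below splits a markdown string into one chunk per heading, so each chunk can be embedded and indexed with its heading as metadata. The `chunk_markdown` function and the sample text are hypothetical; this is one simple chunking strategy, not a description of how Mistral or any particular RAG framework does it.

```python
import re

def chunk_markdown(markdown: str) -> list[dict]:
    """Split markdown into chunks, one per heading, keeping the heading as metadata."""
    chunks = []
    heading = ""
    body = []

    def flush():
        # Emit a chunk only if there is a heading or some non-blank body text.
        if heading or any(line.strip() for line in body):
            chunks.append({"heading": heading, "text": "\n".join(body).strip()})

    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            flush()
            heading = match.group(2).strip()
            body = []
        else:
            body.append(line)
    flush()
    return chunks

# Made-up sample standing in for OCR output.
sample = "# Contract\nParties agree.\n\n## Payment\nNet 30 days.\n"
for chunk in chunk_markdown(sample):
    print(chunk)
```

Splitting on headings keeps related text together, which tends to give a retriever more coherent passages than fixed-size character windows.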

Pricing and deployment

Mistral set API pricing at $4 per 1,000 pages for standard processing and $2 per 1,000 pages for batch jobs.
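For a sense of scale, the quoted rates can be turned into a quick back-of-the-envelope estimate. The sketch below assumes billing is strictly linear in page count, with no free tier, volume discounts, or rounding to 1,000-page blocks; none of those details are confirmed in the announcement, and the `estimate_cost` helper is purely illustrative.

```python
# Rates quoted in the article, in US dollars per 1,000 pages.
PRICE_PER_1000_PAGES = {"standard": 4.0, "batch": 2.0}

def estimate_cost(pages: int, mode: str = "standard") -> float:
    """Estimate API cost in dollars, assuming simple linear per-page billing."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return PRICE_PER_1000_PAGES[mode] * pages / 1000

archive_pages = 250_000  # hypothetical document archive size
print("standard:", estimate_cost(archive_pages, "standard"))
print("batch:   ", estimate_cost(archive_pages, "batch"))
```

Under these assumptions, a 250,000-page archive would cost about $1,000 at the standard rate and about $500 as a batch job.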

The model is optimized for single-container deployment, which matters for enterprises with strict data sovereignty requirements. Rather than routing sensitive documents through a third-party cloud API, companies can run OCR 4 on-premises or in sovereign cloud environments.

Early user feedback has highlighted lower latency compared to established competitors when processing structured documents.

The rapid iteration story

Mistral’s pace of development in this space tells its own story. The original Mistral OCR launched in March 2025. OCR 3 followed in December 2025, reportedly achieving a 74% win rate over its predecessor. Now OCR 4 arrives roughly six months later.

What this means for the market

The document processing market has historically been dominated by legacy players who built their businesses on on-premises scanning solutions and enterprise licensing agreements. Companies like ABBYY and Kofax have owned this space for years. More recently, cloud giants including Google, Amazon, and Microsoft have rolled out their own document AI services.

Mistral’s combination of competitive accuracy, aggressive pricing, and flexible deployment options positions OCR 4 as a credible alternative to all of them. The 72% win rate in blind human evaluations is the kind of metric that procurement teams can point to when justifying a vendor switch.

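The two figures are not directly comparable: OCR 3's 74% win rate was measured against its own predecessor, while OCR 4's 72% was measured against the broader competitive field. Posting a similar margin against tougher opposition suggests the gains are real, even as the benchmarks get harder.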