PinchBench Benchmark: Gemini 3 Flash Leads AI Models with a 95.1% Success Rate in OpenClaw Tasks

Gemini 3 Flash achieved a 95.1% success rate on the PinchBench benchmark for OpenClaw agent tasks, the highest of the models reported. minimax-m2.1 and kimi-k2.5 followed with 93.6% and 93.4%, respectively, while Claude Sonnet 4.5 scored 92.7% and GPT-4o scored 85.2%. The benchmark evaluates how models perform on real-world agent tasks.

Odaily Planet Daily reports that SlowMist CISO 23pads posted on X about the PinchBench benchmark, which evaluates how large language models perform on OpenClaw agent tasks. Gemini 3 Flash achieved the highest success rate at 95.1%. Second and third place went to minimax-m2.1 and kimi-k2.5, with success rates of 93.6% and 93.4%, respectively. Claude Sonnet 4.5 scored 92.7%, while GPT-4o scored 85.2%.

Disclaimer: The information on this page may have been obtained from third parties and does not necessarily reflect the views or opinions of KuCoin. This content is provided for general informational purposes only, without any representation or warranty of any kind, nor shall it be construed as financial or investment advice. KuCoin shall not be liable for any errors or omissions, or for any outcomes resulting from the use of this information. Investments in digital assets can be risky. Please carefully evaluate the risks of a product and your risk tolerance based on your own financial circumstances. For more information, please refer to our Terms of Use and Risk Disclosure.