PinchBench Benchmark: Gemini 3 Flash Leads AI Models with a 95.1% Success Rate in OpenClaw Tasks

Odaily Planet Daily reports that SlowMist CISO 23pds shared results from the PinchBench benchmark on X. The benchmark evaluates large language models on OpenClaw agent tasks. Gemini 3 Flash achieved the highest success rate at 95.1%. The models minimax-m2.1 and kimi-k2.5 ranked second and third, with success rates of 93.6% and 93.4%, respectively. Claude Sonnet 4.5 scored 92.7%, and GPT-4o scored 85.2%.
