ME News reported on June 15 (UTC+8), citing monitoring by Beating, that AI agent developer WecoAI released evaluation results for seven frontier large models on autonomous scientific research tasks. In the Machine Learning Engineering task, Kimi-K2.7-Code, the newly open-sourced trillion-parameter model from Moonshot AI, outperformed every other model tested, including Anthropic’s flagship Fable-5. The evaluation used a cost-constrained protocol rather than a step-limited one: under a fixed budget, a model with a lower per-use cost can make more attempts and iterations. Fable-5 led the Prompt Engineering and Algorithm Discovery tasks and placed first overall, but in Machine Learning Engineering it trailed even its predecessor, Claude 3 Opus. The evaluation team suggested two possible explanations for the weaker result: Fable-5’s high API cost may have put it at a disadvantage under the cost constraint, or the task may have triggered stricter safety guardrails in the model. (Source: BlockBeats)
Kimi-K2.7-Code Outperforms Fable-5 in ML Engineering Tasks
On June 15 (UTC+8), WecoAI released benchmark results for seven leading models on autonomous scientific research tasks. Kimi-K2.7-Code, a trillion-parameter model from Moonshot AI, ranked first in ML engineering, outperforming Fable-5. The test used a cost-constrained protocol instead of step limits, so lower-cost models could run more iterations within the same budget. In that task Fable-5 trailed even Claude 3 Opus, which the evaluation team suggested may stem from its high API costs or from stricter safety restrictions.
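To illustrate why a cost-constrained protocol favors cheaper models, here is a minimal Python sketch. It is not WecoAI's evaluation code: the function name, the budget, and the per-attempt prices are all hypothetical values chosen for illustration, and no real model pricing is implied.

```python
def attempts_within_budget(budget_cents: int, cost_per_attempt_cents: int) -> int:
    """Return how many whole attempts fit into a fixed budget.

    Amounts are integer cents so the division is exact.
    """
    if cost_per_attempt_cents <= 0:
        raise ValueError("cost per attempt must be positive")
    return budget_cents // cost_per_attempt_cents


# Hypothetical figures, not taken from the evaluation.
BUDGET_CENTS = 1000          # assumed fixed budget per task
CHEAP_MODEL_COST = 25        # assumed cost of one attempt, cheaper model
EXPENSIVE_MODEL_COST = 200   # assumed cost of one attempt, pricier model

cheap_attempts = attempts_within_budget(BUDGET_CENTS, CHEAP_MODEL_COST)
expensive_attempts = attempts_within_budget(BUDGET_CENTS, EXPENSIVE_MODEL_COST)

print(f"cheaper model:  {cheap_attempts} attempts")
print(f"pricier model:  {expensive_attempts} attempts")
```

Under a step-limited protocol both models would get the same number of attempts regardless of price. Under a fixed budget, the attempt count scales inversely with per-attempt cost, which is the disadvantage the evaluation team suggested for the more expensive model.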