Alibaba Cloud PAI open-sources AgenticQwen, a small agent model trained with a dual data flywheel

AIMPACT Update, April 27 (UTC+8): According to monitoring by Beating, Alibaba’s PAI team has released and open-sourced AgenticQwen, a family of lightweight agent language models built for industrial-grade tool calling, available in two sizes: 8B and 30B-A3B. Trained with a "dual data flywheel" reinforcement learning framework, the series approaches the agent capabilities of hundred-billion-parameter-scale models at a much lower inference cost.

The dual data flywheel targets a known weakness of synthetic training data: samples tend to become homogeneous, and performance plateaus. AgenticQwen uses two flywheels. The reasoning flywheel automatically generates harder variants of the problems the model gets wrong. The agent flywheel expands simple linear workflows (e.g., a single booking task) into multi-branch behavior trees that incorporate constraints, rejections, and adversarial conditions, simulating complex real-world decision-making.

In evaluations, AgenticQwen-8B averages 47.4 on real-world tool-use benchmarks such as TAU-2 and BFCL-V4, roughly double the base Qwen3-8B (23.8) and close to Qwen3-235B (52.0). AgenticQwen-30B-A3B, which activates only 3B parameters, scores 50.2. The model has already been deployed in internal production systems similar to Manus, where it substantially narrows the performance gap with the 235B model while delivering shorter end-to-end inference times. The paper acknowledges, however, that with a native context length of 40K, the smaller models still struggle on deep search tasks. (Source: BlockBeats)
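To make the agent flywheel idea concrete, here is a minimal Python sketch of expanding a linear workflow into a multi-branch tree. It is an illustration only, not the PAI team's implementation: the `Node` structure, the `expand` and `count_paths` helpers, and the example tool names are all hypothetical. The only thing taken from the report is the three branch categories (constraints, rejections, adversarial conditions).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One step in a synthetic agent workflow (hypothetical structure)."""
    action: str
    children: list[Node] = field(default_factory=list)


# Branch categories named in the report; how they are realized here is an assumption.
BRANCH_KINDS = ("constraint", "rejection", "adversarial")


def expand(linear_steps: list[str]) -> Node:
    """Turn a linear workflow into a tree: each step keeps its happy path
    and gains one alternative leaf per branch kind."""
    root = Node("start")
    cursor = root
    for step in linear_steps:
        happy = Node(step)
        alternatives = [Node(f"{step}:{kind}") for kind in BRANCH_KINDS]
        cursor.children = [happy, *alternatives]
        cursor = happy  # keep expanding along the happy path
    return root


def count_paths(node: Node) -> int:
    """Number of distinct root-to-leaf trajectories in the tree."""
    if not node.children:
        return 1
    return sum(count_paths(child) for child in node.children)


# Hypothetical three-step booking task.
tree = expand(["search_flights", "select_seat", "pay"])
print(count_paths(tree))  # 10 trajectories instead of the single linear one
```

Even this toy expansion turns one trajectory into ten, which gives a rough sense of how branching can diversify synthetic training data compared with replaying a single linear task.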