ChainThink reports that Google will unveil its next-generation lightweight model, Gemini 3.2 Flash, at the I/O conference on May 20. According to Abacus.AI CEO Bindu Reddy, the model reaches 92% of GPT-5.5's performance on coding and reasoning tasks at one-fifteenth to one-twentieth of GPT-5.5's inference cost, with most query latencies under 200 milliseconds. Overall it is roughly on par with GPT-5.5, though it still clearly trails Anthropic's Mythos.
Reddy added that Google's distillation-plus-sparsification technique compresses state-of-the-art models down to Flash scale without the performance cliff such compression typically causes.
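Google has not published details of its recipe, but the two techniques Reddy names are well established. A minimal NumPy sketch of each, purely illustrative: a temperature-scaled distillation loss (matching a student model's output distribution to a teacher's) and magnitude-based pruning (zeroing the smallest weights to sparsify a layer). All function names and parameters here are assumptions for illustration, not Google's implementation.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between softened teacher and student distributions,
    # the standard knowledge-distillation objective (Hinton et al., 2015).
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

def magnitude_prune(weights, sparsity=0.5):
    # Zero out the smallest-magnitude weights: a simple sparsification
    # scheme; returns a pruned copy of the weight array.
    w = np.asarray(weights, dtype=float).copy()
    k = int(w.size * sparsity)
    if k > 0:
        threshold = np.sort(np.abs(w).ravel())[k - 1]
        w[np.abs(w) <= threshold] = 0.0
    return w
```

In a real pipeline the distillation loss would be minimized over a large corpus of teacher outputs, and pruning would be interleaved with fine-tuning so the remaining weights compensate; the "performance cliff" Reddy mentions is what happens when compression is applied without such recovery steps.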
Gemini 3.2 Flash had already leaked ahead of the announcement: traces of it appeared in iOS app build packages and AI Studio metadata in early May, and it later surfaced anonymously in LM Arena evaluations. Early testers reported strong performance on creative coding tasks, with some benchmark results surpassing Gemini 3.1 Pro's.
