Google has released benchmark results for evaluating AI models on Android coding tasks. They show that Gemini 3.5 Flash is the most resource-intensive model tested, yet it ranks only sixth overall. Gemini 3.5 Flash has higher latency than its predecessor, Gemini 3.1 Pro Preview, and a 9% performance gap relative to it, despite being marketed as a faster alternative. On cost, Gemini 3.5 Flash averages 355.9 tokens per benchmark run at a cost of approximately 7.1, while Gemini 3.1 Pro Preview uses only 73.3 tokens at about a third of that cost. The top three models are GPT 5.5, GPT 5.4, and Gemini 3.1 Pro Preview, with Claude Opus 4.7 in fourth place. The rankings cover both open-weight and closed-weight models, and the list is unchanged since the last release except for the removal of GPT 5.3 Codex.
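
The token and cost figures quoted above can be compared with a little arithmetic. The following is a minimal Python sketch, not part of Google's benchmark tooling. It takes the numbers exactly as quoted, leaves their units unspecified because the text does not give them, and assumes that "about a third" means exactly one third. All variable names are illustrative.

```python
# Figures as quoted in the summary. Units are not stated in the source,
# so the values are treated as plain numbers.
FLASH_TOKENS = 355.9  # Gemini 3.5 Flash, average tokens per benchmark run
PRO_TOKENS = 73.3     # Gemini 3.1 Pro Preview, average tokens per benchmark run
FLASH_COST = 7.1      # Gemini 3.5 Flash, approximate cost per run

# Assumption: "about a third of that cost" is taken as exactly one third.
PRO_COST = FLASH_COST / 3

# How many times more tokens Flash consumes per run than Pro Preview.
token_ratio = FLASH_TOKENS / PRO_TOKENS

# Implied cost per token unit for each model, and their ratio.
flash_unit_cost = FLASH_COST / FLASH_TOKENS
pro_unit_cost = PRO_COST / PRO_TOKENS
unit_cost_ratio = flash_unit_cost / pro_unit_cost

print(f"Token ratio (Flash / Pro Preview): {token_ratio:.2f}")
print(f"Per-token cost ratio (Flash / Pro Preview): {unit_cost_ratio:.2f}")
```

Under these assumptions, Gemini 3.5 Flash consumes nearly five times as many tokens per run as Gemini 3.1 Pro Preview, while its implied per-token cost is lower. This suggests that the higher cost per run comes from token volume rather than per-token pricing, although that reading depends on the one-third assumption.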