Google has introduced Android Bench, a benchmark for assessing how well AI models perform at Android app development. The top performer is Gemini 3.1 Pro, scoring 72.2%, followed by Claude Opus 4.6 at 66.6% and GPT 5.2 Codex at 62.5%. The benchmark evaluates models on real-world Android coding challenges, and the models tested completed between 16% and 72% of the tasks. Google aims to make it easier to create Android applications from user prompts and has made the benchmark's methodology and tools available on GitHub.