Google Advances AI Agents with Gemini 2.5 Computer Use Model to Enhance Web Automation

Google's Gemini 2.5 Computer Use model is an AI system designed to navigate and interact with web browsers autonomously, much as a human would, performing tasks such as clicking, typing, scrolling, and form filling. Available via API, it outperforms competitors on web and mobile tasks with high accuracy and low latency, and is positioned to reshape web automation.

Alphabet Inc. has unveiled its latest milestone in artificial intelligence with the launch of the Gemini 2.5 Computer Use model, a specialized AI system designed to autonomously navigate and interact with web browsers and digital interfaces as if operated by a human. Released on October 7, 2025, this new model sets a benchmark in the evolving field of AI agents, offering a sophisticated approach to online automation and significantly challenging competitors like OpenAI and Anthropic.

The Gemini 2.5 Computer Use model distinguishes itself by operating through visual understanding and reasoning, which lets it carry out web-based tasks such as clicking buttons, typing text, scrolling pages, and filling out forms. Unlike conventional automation tools that rely on fixed APIs, Google's model interacts directly with graphical user interfaces, so it can adapt to dynamic websites and rapidly changing digital environments. It supports thirteen distinct actions, including drag-and-drop, which many web applications depend on.
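
To make the idea of a model driving a graphical interface through discrete actions more concrete, here is a minimal Python sketch of how a client might dispatch model-proposed actions to a browser. Everything in it is an assumption made for illustration: the Action shape, the action names (click, type_text, scroll, drag_and_drop), and the RecordingBrowser stand-in are hypothetical and are not taken from Google's API. Only four of the thirteen actions mentioned above are shown.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical shape, not Google's schema)."""
    name: str
    args: Dict[str, Any] = field(default_factory=dict)


class RecordingBrowser:
    """Stand-in for a real browser driver; it only logs what it is asked to do."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")

    def scroll(self, dy: int) -> None:
        self.log.append(f"scroll({dy})")

    def drag_and_drop(self, x0: int, y0: int, x1: int, y1: int) -> None:
        self.log.append(f"drag(({x0}, {y0}) -> ({x1}, {y1}))")


def execute(browser: RecordingBrowser, action: Action) -> None:
    """Route one action to the matching browser method; reject unknown names."""
    handlers = {
        "click": browser.click,
        "type_text": browser.type_text,
        "scroll": browser.scroll,
        "drag_and_drop": browser.drag_and_drop,
    }
    handler = handlers.get(action.name)
    if handler is None:
        raise ValueError(f"unsupported action: {action.name}")
    handler(**action.args)


def run_plan(browser: RecordingBrowser, plan: List[Action]) -> List[str]:
    """Execute a list of actions in order and return the browser's log."""
    for action in plan:
        execute(browser, action)
    return browser.log
```

In a real agent, the plan would not be fixed in advance: the client would send a screenshot to the model, receive the next action, execute it, and repeat. The sketch leaves that loop out because its details depend on the actual API.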

Google timed this announcement to closely follow OpenAI's recent ChatGPT Agent updates and to build on Anthropic’s earlier computer use capabilities. While some competitor models offer comprehensive desktop control, Google's approach focuses on browser-specific actions, which the company says delivers superior performance with lower latency. In benchmark tests, the Gemini 2.5 Computer Use model achieved nearly 77% accuracy on Online-Mind2Web and 79.9% on WebVoyager, outperforming competitor models by large margins.

The model is already integrated into several Google products, including Project Mariner and AI Mode in Google Search, where internal users report it resolving over 60% of previously unfixable issues, drastically reducing time and effort in troubleshooting.

To support developers, Google offers the Gemini 2.5 Computer Use model through its AI Studio and Vertex AI platforms under a paid, per-token pricing structure. Input tokens cost $1.25 per million for prompts under 200,000 tokens. Unlike the standard Gemini model, which has a free tier, this model requires payment for any use. The launch further intensifies competition in the AI agent market, which doubled in value to an estimated $7.38 billion by 2025 as companies race to embed intelligent automation across workflows.
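
As a rough illustration of what the quoted input rate means in practice, the following Python sketch estimates the input-token cost of a single prompt. It uses only the two figures reported above ($1.25 per million input tokens, for prompts under 200,000 tokens); the function name and the decision to reject larger prompts are assumptions, since the article does not state the rate for larger prompts or for output tokens.

```python
# Figures as reported in the article; everything else is illustrative.
PRICE_PER_MILLION_INPUT_TOKENS_USD = 1.25
PROMPT_TIER_LIMIT_TOKENS = 200_000


def input_cost_usd(prompt_tokens: int) -> float:
    """Estimate the input-token cost of one prompt in US dollars.

    Raises ValueError for prompts at or above the 200,000-token threshold,
    because the rate for that tier is not given in the article.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens >= PROMPT_TIER_LIMIT_TOKENS:
        raise ValueError("rate for prompts of 200,000+ tokens is not stated")
    return prompt_tokens * PRICE_PER_MILLION_INPUT_TOKENS_USD / 1_000_000
```

Under these assumptions, a 100,000-token prompt would cost about $0.125 in input tokens, before any output-token charges.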

Google leverages its extensive ecosystem, including Search, Android, YouTube, and Workspace, to create a compelling advantage in market adoption, with over 2.3 billion interactions inside Google Workspace alone in early 2025. Safety and security remain central to the design: Google enforces multi-layered safeguards, including action-level safety reviews, developer controls to prevent abuse, and mandatory user confirmation for sensitive commands such as purchases, so that AI-driven automation operates safely and reliably.
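
The user-confirmation safeguard can be pictured as a simple gate in front of each action. The Python sketch below is an assumption-laden illustration, not Google's implementation: the SENSITIVE_ACTIONS set, the action names, and the is_allowed function are all hypothetical, and the real system's notion of a sensitive command is certainly richer than a name lookup.

```python
from typing import Callable

# Hypothetical set of action names that require explicit user approval.
SENSITIVE_ACTIONS = {"purchase"}


def is_allowed(action_name: str, confirm: Callable[[str], bool]) -> bool:
    """Return True if the action may proceed.

    Non-sensitive actions pass through. Sensitive ones are allowed only
    when the confirm callback (for example, a UI prompt) returns True.
    """
    if action_name in SENSITIVE_ACTIONS:
        return confirm(action_name)
    return True
```

Passing the confirmation step in as a callback keeps the gate independent of how the user is actually asked, whether through a dialog, a command-line prompt, or an approval queue.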

With the Gemini 2.5 Computer Use model, Google is advancing the frontier of AI agents, setting new standards for web automation. This innovation promises to transform not only consumer-facing applications but also enterprise workflows, improving efficiency and interaction capabilities on the internet for millions worldwide.

This development marks an important step in the AI evolution race, showcasing Google’s commitment to pioneering safe, effective, and versatile AI tools for the future of digital interaction.