Google AI
Google logo. Photo: Unsplash

Google launches Gemini 2.5 Computer Use AI that can browse the web like humans

@indiablooms | Oct 08, 2025, at 10:35 am

Google has unveiled Gemini 2.5 Computer Use, a new version of its AI model capable of navigating the web through a browser, allowing it to perform tasks much like a human user.

In a blog post, the company said the specialized model, built on Gemini 2.5 Pro’s visual understanding and reasoning capabilities, enables AI agents to interact directly with user interfaces (UIs) by clicking, typing, and scrolling.

“Today, we are releasing the Gemini 2.5 Computer Use model, our new specialized model built on Gemini 2.5 Pro’s visual understanding and reasoning capabilities that powers agents capable of interacting with user interfaces,” Google said.

According to the company, the model outperforms leading alternatives on several web and mobile control benchmarks while offering lower latency. Developers can access the new features through the Gemini API in Google AI Studio and Vertex AI.

While traditional AI systems rely on structured APIs to interface with software, Google noted that many real-world digital tasks still require direct interaction with graphical user interfaces, such as filling in forms, submitting data, or navigating websites.

How it works

The Computer Use capability is integrated as a new tool within the Gemini API and operates in an iterative loop. It processes three main inputs — the user’s request, a screenshot of the environment, and a history of recent actions.

The model analyzes these inputs and generates a response, typically a function call representing a UI action like clicking or typing. Some actions, such as making a purchase, may prompt the model to request user confirmation before execution.

Once the action is executed, an updated screenshot and current URL are sent back to the model, continuing the interaction loop until the task is completed, an error occurs, or the session ends due to a safety response or user termination, Google explained.
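
The loop Google describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation or the Gemini API: the names `Action`, `Observation`, `run_agent_loop`, `scripted_model`, and `fake_browser` are hypothetical, the "model" is a hard-coded script, and the "browser" is a stub that returns an empty screenshot. Safety responses are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    """A UI action proposed by the model (hypothetical shape, not the real API)."""
    name: str                          # e.g. "click", "type", or "done"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # e.g. set for a purchase step


@dataclass
class Observation:
    """What the client sends back after each step: a screenshot and the URL."""
    screenshot: bytes
    url: str


def run_agent_loop(
    request: str,
    propose: Callable[[str, Observation, list], Action],
    execute: Callable[[Action], Observation],
    confirm: Callable[[Action], bool],
    first: Observation,
    max_steps: int = 10,
) -> tuple:
    """Repeat propose -> (confirm) -> execute until the task ends."""
    history = []
    obs = first
    for _ in range(max_steps):
        # Inputs: the user's request, the latest screenshot/URL, recent actions.
        action = propose(request, obs, history)
        if action.name == "done":
            return "completed", history
        # Sensitive actions are only run if the user agrees.
        if action.needs_confirmation and not confirm(action):
            return "user_terminated", history
        try:
            obs = execute(action)      # yields the updated screenshot and URL
        except RuntimeError:
            return "error", history
        history.append(action)
    return "step_limit", history


def scripted_model(request, obs, history):
    """Stand-in for the model: replays a fixed list of actions."""
    script = [
        Action("type", {"text": "blue kettle"}),
        Action("click", {"x": 120, "y": 48}),
        Action("click", {"x": 300, "y": 500}, needs_confirmation=True),
    ]
    return script[len(history)] if len(history) < len(script) else Action("done")


def fake_browser(action):
    """Stand-in for a real browser driver: returns a dummy observation."""
    return Observation(screenshot=b"", url=f"https://example.test/after-{action.name}")


status, history = run_agent_loop(
    "Buy a blue kettle",
    scripted_model,
    fake_browser,
    confirm=lambda a: True,
    first=Observation(b"", "https://example.test/"),
)
print(status, [a.name for a in history])  # completed ['type', 'click', 'click']
```

Passing `confirm=lambda a: False` instead stops the run before the third action with the status `user_terminated`, which mirrors the user-termination exit described in the article.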

Google said the Gemini 2.5 Computer Use model is currently optimized for web browsers, though it also shows strong potential for mobile UI control tasks. It is not yet tuned for desktop operating system-level control, the company added.
