The Google DeepMind team has announced what it describes as a major technical breakthrough: native computer-use capabilities built directly into the Gemini 3.5 Flash model. Developers can now use a single model to build AI agents that autonomously view screens and perform actions on them across browsers, mobile devices, and desktop computers.

Previously, this capability was available only as a separate model, so developers had to switch between models and transfer context from one to the other. With native integration, information no longer has to be passed along by hand when an agent carries out long, cross-platform tasks, which greatly simplifies development.

Goodbye to context loss: addressing agent reliability directly

Google's team believes that the core bottleneck for AI agents is not the limitations of any individual tool, but the context that is lost when an agent switches between tools. Unifying search, maps, and computer operations within a single model architecture lets context flow continuously from one step to the next, which significantly reduces the probability of failure during complex tasks.

This "multi-tool integration" design is like constructing one large building whose parts are connected internally, instead of relying on long, error-prone communication among several separate buildings. The architectural change has the potential to substantially improve both the reliability and the response latency of agent-based tasks.
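To make the idea concrete, here is a minimal Python sketch of tools sharing one running context. It is an illustration only, not Google's implementation and not the Gemini API: `SharedContext`, `run_task`, and the three tool functions are hypothetical names invented for this example, and the tools return placeholder strings instead of doing real work.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SharedContext:
    """Hypothetical single record that every tool call reads and appends to."""
    events: list[str] = field(default_factory=list)

    def record(self, tool: str, result: str) -> None:
        self.events.append(f"{tool}: {result}")


# Hypothetical stand-ins for tools. Each one receives the full shared
# context, so nothing has to be copied between separate models.
def search(ctx: SharedContext, query: str) -> str:
    return f"results for '{query}'"


def maps(ctx: SharedContext, place: str) -> str:
    return f"route to '{place}' (knowing {len(ctx.events)} prior steps)"


def computer(ctx: SharedContext, action: str) -> str:
    return f"performed '{action}' (knowing {len(ctx.events)} prior steps)"


TOOLS: dict[str, Callable[[SharedContext, str], str]] = {
    "search": search,
    "maps": maps,
    "computer": computer,
}


def run_task(steps: list[tuple[str, str]]) -> SharedContext:
    """Run each (tool, argument) step against one continuous context."""
    ctx = SharedContext()
    for tool, arg in steps:
        result = TOOLS[tool](ctx, arg)
        ctx.record(tool, result)
    return ctx


if __name__ == "__main__":
    final = run_task([
        ("search", "cafe"),
        ("maps", "cafe"),
        ("computer", "click Book"),
    ])
    for event in final.events:
        print(event)
```

In this sketch every step sees everything that came before it. A design built from separate models would instead need an explicit hand-off at each tool boundary, and that hand-off is where, according to the article, context tends to be lost.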

Three core scenarios and a multi-layered security defense

The native capability is aimed primarily at three core scenarios: automated tasks that must run continuously for hours or even days, continuous software testing that automatically verifies user interface consistency, and knowledge-intensive work that spans multiple applications. All three depend heavily on context carrying over from one task to the next, and in each the agent can take over repetitive, labor-intensive operations from humans.

On the security side, Google has adopted a multi-layered defense strategy that includes targeted adversarial training, enterprise security safeguards for sensitive operations, and detection of indirect prompt injection. Together, these mechanisms are meant to help enterprise users establish a relatively complete security boundary in open, uncontrolled computing environments.