AI assistants are moving from "just talking" to "doing things for you".

Google recently launched the Gemini task automation feature on the Pixel 10 Pro and Galaxy S26 Ultra. The launch marks the evolution of AI assistants from "conversation tools" into "execution agents": they no longer only answer questions but also take over the phone screen to handle tasks for you.

Test Experience: Fully "Driverless", But Requires Some Patience

In a test case reported by OSCHINA, ordering a DoorDash delivery takes only a single natural-language instruction to Gemini. What happens next feels like science fiction:

Background Operations: The AI will automatically open the app, identify interface elements, fill out forms, select options, and confirm the order.

Asynchronous Execution: During execution, a status line at the bottom of the screen updates in real time with messages such as "Selecting destination". The coolest part is that you can switch to watching videos or replying to emails while the AI keeps running in the background until the task is complete.

Speed Bottleneck: The current drawback is slowness. Because the AI has to recognize the interface frame by frame and send each decision to the cloud for reasoning, a task that takes 2 minutes by hand may take up to 9 minutes for the AI, roughly 4.5 times as long.
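To see why the delay adds up, the behavior described above can be sketched as a simple observe, reason, act loop in which every step pays for a screen capture, a cloud round trip, and an on-screen action. This is only an illustrative model, not Gemini's actual implementation: the `Step` class, the `run_agent` function, the step list, and all per-step timings are assumptions made for the example. Only the status message "Selecting destination" and the 2-minute versus 9-minute comparison come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hypothetical agent step; all timings are assumed, in seconds."""
    status: str      # message shown at the bottom of the screen
    observe_s: int   # assumed time to capture and read the screen
    reason_s: int    # assumed cloud round trip to choose the next action
    act_s: int       # assumed time to perform the tap or text entry

    def duration_s(self) -> int:
        return self.observe_s + self.reason_s + self.act_s


def run_agent(steps):
    """Walk through the steps in order; return (status_log, total_seconds)."""
    log = []
    elapsed = 0
    for step in steps:
        # Record the status line together with the simulated start time.
        log.append(f"[{elapsed:>3}s] {step.status}")
        elapsed += step.duration_s()
    return log, elapsed


def slowdown(agent_s, manual_s):
    """How many times longer the agent takes than a person."""
    return agent_s / manual_s


# Hypothetical step list for a food order; the timings are illustrative only.
ORDER_STEPS = (
    Step("Opening app", 5, 20, 5),
    Step("Selecting destination", 5, 20, 5),
    Step("Filling out form", 5, 20, 5),
    Step("Selecting options", 5, 20, 5),
    Step("Confirming order", 5, 20, 5),
)

if __name__ == "__main__":
    log, total = run_agent(ORDER_STEPS)
    for line in log:
        print(line)
    print(f"Simulated total: {total}s")
    # Figures from the report: 9 minutes for the AI vs 2 minutes by hand.
    print(f"Reported slowdown: {slowdown(9 * 60, 2 * 60)}x")
```

In this toy model the cloud reasoning dominates each step, so shortening the round trip or batching several actions per decision would be the obvious levers for closing the gap with manual operation.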

Technical Breakthrough: Breaking the Ten-Year Ceiling of "Information Query"

Over the past decade, from Siri to Google Assistant, voice assistants have stayed at the shallow level of setting alarms and checking the weather. The core breakthrough of Gemini task automation is its ability to plan complex, multi-step tasks, which makes a "give a command, wait for the result" workflow possible.

Ecosystem Limitations: Still in the "Concept Product" Stage

Although the prospects are promising, the current automation features still face many challenges:

Narrow Adaptation Scope: Currently, it only supports highly standardized apps such as Uber and DoorDash.

Need for Improved Error Tolerance: Interface recognition errors and security restrictions in the payment process remain major obstacles to widespread adoption.

Battle of the Giants: 2026 Opens the Year of the "AI Agent"

With OpenAI's Operator and Apple's Apple Intelligence both pushing forward, Google has moved first on mobile, aiming to use the Android ecosystem to capture high-frequency everyday scenarios.

Although Gemini task automation still seems a bit "clumsy" today, technological progress often follows an exponential curve. Once AI can smoothly operate any app at human speed, the way we interact with phones will be completely rewritten. This "slow but cool" evolution is a crucial step towards artificial general intelligence (AGI).