
Introducing computer use in Gemini 3.5 Flash

By Jakub Antkiewicz

2026-06-25T10:41:49Z

Google Integrates 'Computer Use' into Gemini 3.5 Flash for Agentic AI

Google DeepMind has integrated 'computer use' capabilities directly into its Gemini 3.5 Flash model, enabling developers to build agents that interact with graphical user interfaces across platforms. These custom agents can see, reason, and take action within browser, mobile, and desktop environments. The development is significant because it shifts the model's functionality from API-based function calling to direct UI manipulation, a key requirement for sophisticated enterprise automation.

Technical Specifications and Availability

Previously offered as a standalone model, the computer use tool is now natively supported in the main Gemini 3.5 Flash model, a move intended to improve performance and simplify the development stack. Google has made this available to developers and enterprises through the Gemini API and its Gemini Enterprise Agent Platform. To address safety concerns associated with agents operating in live environments, the company has implemented several safeguards.

  • Native Integration: Computer use is now a built-in capability of Gemini 3.5 Flash.
  • Target Applications: Designed for long-horizon tasks such as continuous software testing and knowledge work automation across multiple professional applications.
  • Safety Protocols: Includes targeted adversarial training against prompt injection, with optional enterprise safeguards for requiring user confirmation on sensitive actions and automatically halting tasks if an injection is detected.
  • Developer Access: Available via the Gemini API and the Gemini Enterprise Agent Platform.
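
To make the safeguards above more concrete, here is a minimal Python sketch of how an agent loop might gate sensitive actions behind user confirmation and halt when an injection is suspected. It is an illustration under stated assumptions, not Google's implementation and not the Gemini API: every name in it (Action, Observation, run_agent, SENSITIVE_ACTIONS, the injection_suspected flag) is hypothetical, and the model call is replaced by a plain callback.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional

# Hypothetical set of action names that require confirmation.
# These are illustrative only and are not taken from Google's documentation.
SENSITIVE_ACTIONS = {"submit_payment", "delete_file", "send_email"}


@dataclass
class Action:
    name: str    # e.g. "click", "type", "submit_payment"
    target: str  # UI element the action applies to


@dataclass
class Observation:
    screen_text: str                   # stand-in for a screenshot of the UI
    injection_suspected: bool = False  # stand-in for a detector's verdict


@dataclass
class RunResult:
    executed: List[Action] = field(default_factory=list)
    skipped: List[Action] = field(default_factory=list)
    halted: bool = False


def run_agent(
    observations: Iterable[Observation],
    propose_action: Callable[[Observation], Optional[Action]],
    confirm: Callable[[Action], bool],
    max_steps: int = 10,
) -> RunResult:
    """See, reason, act: one proposed action per observation.

    propose_action stands in for the model; confirm stands in for the user.
    """
    result = RunResult()
    for step, obs in enumerate(observations):
        if step >= max_steps:
            break
        # Halt the whole task as soon as an injection is suspected.
        if obs.injection_suspected:
            result.halted = True
            break
        action = propose_action(obs)
        if action is None:
            break  # the policy has nothing more to do
        # Sensitive actions run only if the user explicitly approves them.
        if action.name in SENSITIVE_ACTIONS and not confirm(action):
            result.skipped.append(action)
            continue
        result.executed.append(action)
    return result
```

In this sketch a declined sensitive action is skipped and recorded while the task continues, whereas a suspected injection stops the task outright. A real deployment would make its own choice on both points.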

Impact on Enterprise Automation

This integration positions Google more competitively in the growing market for AI-driven automation and agentic systems. By embedding UI interaction into a core, efficient model like 3.5 Flash, the company lowers the technical barrier for creating complex workflows that span multiple applications. The move indicates a clear market direction toward AI that not only processes information but also performs actions within existing enterprise software, potentially affecting the traditional robotic process automation (RPA) landscape.

By embedding UI control directly into its flagship lightweight model, Google is making a strategic push from pure language generation toward integrated, action-oriented AI agents, aiming to lower the barrier for developers building complex enterprise automation.