AiPhreaks

Gemma 4: Byte for byte, the most capable open models

By Jakub Antkiewicz

2026-04-11T08:42:14Z

Google has released Gemma 4, its latest family of open models, emphasizing high performance relative to parameter count. The release, which builds on the research behind the proprietary Gemini 3 models, is aimed at developers building complex reasoning and agent-based workflows. Notably, the entire model family is available under the commercially permissive Apache 2.0 license, a direct response to community feedback asking for fewer restrictions on use and deployment.

The Gemma 4 family consists of four models designed for different hardware targets. The Effective 2B (E2B) and Effective 4B (E4B) models are engineered for on-device applications and feature native audio and visual processing for mobile and IoT hardware. For more demanding tasks, Google offers a 26B Mixture of Experts (MoE) model optimized for low-latency inference and a 31B Dense model designed for maximum quality and fine-tuning potential. The larger models support function calling, structured JSON output, and context windows of up to 256K tokens, and their scores place them among the top-ranked open models on the Arena AI leaderboard.
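To make the function-calling and structured JSON output features concrete, here is a minimal Python sketch of what an application does with such output: it parses a JSON tool call and runs the matching local function. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the sample string are all assumptions made for illustration. They are not taken from Gemma 4's documentation, and no model is actually called.

```python
import json

# Hypothetical tool; the name and signature are illustrative only.
def get_weather(city: str) -> str:
    return f"Weather lookup requested for {city}"

# Registry mapping tool names the model may emit to local functions.
TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(model_output: str) -> str:
    """Parse a JSON tool call emitted by a model and run the matching function.

    Assumes the model was prompted to answer with an object shaped like
    {"name": <tool name>, "arguments": {...}}. That shape is an assumption.
    """
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model output is not valid JSON: {exc}") from exc

    if not isinstance(call, dict):
        raise ValueError("Tool call must be a JSON object")

    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise ValueError(f"Unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("Tool arguments must be a JSON object")

    # Call the registered function with the model-supplied arguments.
    return TOOLS[name](**args)

# Stand-in for text returned by a model; no inference happens here.
sample_output = '{"name": "get_weather", "arguments": {"city": "Warsaw"}}'
result = dispatch_tool_call(sample_output)
print(result)
```

Validating the tool name and the argument type before calling anything keeps malformed or unexpected model output from reaching application code.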

By adopting the Apache 2.0 license, Google is removing significant barriers to commercial adoption and positioning Gemma 4 as a flexible foundation for the broader AI market. The move allows enterprises and individual developers to retain full control over their data and infrastructure. The models launched with extensive day-one ecosystem support, including Hugging Face, Ollama, and vLLM, alongside optimizations for hardware from NVIDIA, AMD, and Google's own TPUs. That breadth indicates a coordinated effort to ensure wide accessibility and ease of integration from edge devices to cloud infrastructure.

Google's strategy with Gemma 4 is two-pronged: deliver high-end reasoning in compute-efficient packages and remove commercial friction with an Apache 2.0 license. This positions the models not just as open alternatives, but as practical, deployable foundations for developers building agents and applications across the entire hardware spectrum, from edge devices to enterprise cloud.