إعلانات 14 Jun 2026

Google Launches Gemma 4: Its Most Capable Open Model Family Under Apache 2.0

Google released the open Gemma 4 family, built on Gemini 3 research under Apache 2.0, with sizes scaling from phone to server, a 256K context window, and 140+ language support.

Google has released its latest open model family, Gemma 4, which it describes as its most capable open model family to date. The models are built on Gemini 3 research and come with open weights under the permissive Apache 2.0 license, allowing any developer to download, modify, and deploy them on their own infrastructure for personal and commercial purposes with virtually no restrictions. They were unveiled during Google I/O 2026.

A Family With Scaled Sizes for Every Hardware

Gemma 4 stands out for being offered in multiple sizes suited to different runtime environments, from high-end phones to laptops and servers. The family includes small versions aimed at edge devices, named E2B and E4B, a medium 12B version, plus two larger models: a 26B model with a Mixture of Experts architecture and a 31B model with a Dense architecture. This tiering aims, according to Google, to "democratize" access to advanced AI by running it locally wherever the developer needs.

Notable Efficiency in Local Execution

Google focused on the ability to run on accessible hardware. The model's unquantized bfloat16 weights fit efficiently on a single 80GB NVIDIA H100 GPU, while quantized versions run on consumer cards to power development environments, coding assistants, and agentic workflows. The 26B Mixture of Experts model activates only 3.8 billion of its total parameters during inference, giving it high token-generation speed while maintaining quality, while the 31B Dense model focuses on maximum quality as a strong foundation for fine-tuning.

Core Capabilities

All models in the family are designed as capable reasoners with configurable thinking modes. They are multimodal, handling text and image input (with video support across all models, and audio in the smaller versions) and generating text. They support a context window of up to 256K tokens in the medium models and 128K in the small ones, plus support for more than 140 languages. They also achieved notable improvements in coding benchmarks, with native function-calling support that powers autonomous agents, and native support for the "system" role for more structured and controllable conversations.

Strong Performance Relative to Its Size

Google asserts that Gemma 4 offers strong performance relative to its size, noting that its two largest models placed in advanced positions on Arena's text leaderboard, outperforming much larger systems. The model was also added to the Android Bench leaderboard for Android development tasks, making it directly comparable to closed models in Android-specific coding and reasoning tasks.

Broad Day-One Support

Among the launch's most notable aspects is its immediate support across a wide ecosystem of tools. The models are available from day one via Hugging Face, vLLM, llama.cpp, MLX, Ollama, NVIDIA NIM, LM Studio, and others, and their weights can be downloaded from Hugging Face, Kaggle, and Ollama. They can also be tried in Google AI Studio for the larger models, and in Google AI Edge for the smaller ones, in addition to powering "Agent Mode" in Android Studio. This broad support shortens the distance between prototype and production.

What Does This Mean for Developers?

Gemma 4 represents a strong opportunity for those who want to run advanced models locally without relying on paid cloud APIs, whether for reasons of privacy, cost, or offline operation. With support for more than 140 languages and its open license, it enables building applications, tools, and agents freely, with the ability to fine-tune on private data. Measuring the quality of its performance in specific language contexts remains something worth testing in practice, especially in coding and reasoning tasks.

Share this news

Newsletter

Enjoyed this?

Subscribe and get every new article and news post straight to your inbox.

Tags: #الذكاء الاصطناعي#Google#Gemma 4#النماذج المفتوحة#التشغيل المحلّي#Gemini 3

Google Launches Gemma 4: Its Most Capable Open Model Family Under Apache 2.0

A Family With Scaled Sizes for Every Hardware

Notable Efficiency in Local Execution

Core Capabilities

Strong Performance Relative to Its Size

Broad Day-One Support

What Does This Mean for Developers?

More news

Now Inside Claude: Ask the Anthropic Economic Index About AI Usage Directly

Anthropic Launches 'Teach Claude a Skill': Record Your Screen to Teach It Your Tasks

Kimi K3: China Releases the World's Largest Open-Weight Model at 2.8 Trillion Parameters