Gemma 4
Gemma 4 (also tracked as Gemma 3 or Gemma4) is a language model first observed on March 9, 2026. Hosted access is listed at $0 per million input tokens and $0 per million output tokens, so per-token cost is no barrier to experimentation. The model’s parameter count, architecture, and hardware requirements remain unconfirmed, and no official statement clarifies its deployment environment or licensing terms. Because several naming variants are in circulation, it is also unclear whether Gemma 3 and Gemma 4 are distinct releases or a single model tracked under different identifiers. The free tier undercuts prevailing commercial per-token pricing and raises open questions about sustainability, performance, and Google’s broader strategy for lightweight model distribution in 2026.
Google’s Gemma 4, first observed on March 9, 2026, reached 50 million downloads within weeks, making it the company’s fastest open model launch. Hosted access is priced at $0 per million tokens, although the model’s parameter count and architecture remain unconfirmed. It competes directly with LLaMA 3 and Llama 3.1 70B, and it uses MTP drafters for a reported 3x inference speedup, a technical edge over Meta’s offerings. Gemma 4 also uses Segment Anything Model 3.1, which extends its multimodal reach. Recent tooling additions, mlx-vlm v0.6.2 and Ollama support, ease local deployment on GPUs and Apple Silicon. Gemma 4 is endorsed by Ethan Mollick and tied to Android Studio, which embeds it further in Google’s ecosystem. The open question is whether Google can sustain this momentum without disclosing the model’s full specifications and hardware demands.
- 50 million downloads in weeks; fastest Google model launch
- Free inference ($0/token) vs. LLaMA 3 and Llama 3.1 70B
- MTP drafters deliver 3x faster inference
- Uses Segment Anything Model 3.1 for multimodal tasks
- Integrated into mlx-vlm, Ollama, and Android Studio
Signal Radar
[Chart: five-axis snapshot of this entity's footprint]
Mentions × Lab Attention
[Chart: weekly mentions (solid) and average article relevance (dotted)]
Timeline
- Research Milestone, Apr 30, 2026
  Gemma 4 hits 50 million downloads within weeks, the fastest Google open model launch.
- Product Launch, Apr 15, 2026
  A developer integrated Gemma 4 to replace an entire dash cam video analysis stack.
- Product Launch, Apr 5, 2026
  A community developer ported Gemma 4 to MLX-Swift, enabling local inference on Apple Silicon via the LocallyAI app.
- Research Milestone, Apr 3, 2026
  A Gemma 4 model demonstrated self-terminating loop detection during a coding task, an emergent behavior for execution control.
- Research Milestone, Apr 3, 2026
  Independent analysis declares Gemma4 models best-in-class for small open LLMs. Assessment: superior model behavior.
Relationships (11)
Developed
Uses
Competes With
Endorsed
Frequently appears with (4)
Entities that show up in the same articles: shared coverage, not a stated relationship.
Predictions (1)
- Incorrect, month, Apr 6, 2026
  Google will ship a Gemini 3.x on-device/consumer-hardware release within 2 weeks
  Gemma 4 is now surging and the live web context shows Google positioning it explicitly for phones, consumer GPUs, and agentic workflows. The graph cascade from Gemma 4 to Gemini 3.1 and Gemini 3 Deep Think suggests Google is using Gemma as the open-model proving ground before a Gemini-branded follow-on release.
  58%
AI Discoveries (1)
- Observation, active, Jun 11, 2026
  Lifecycle: Gemma 4
  Gemma 4 is in 'declining' phase (0 mentions/3d, 1/14d, 20 total)
  90% confidence
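The lifecycle observation above reports three mention counts (last 3 days, last 14 days, all time) alongside a phase label. The page does not say how the phase is derived, so the following is only a sketch of one plausible rule. The thresholds, the `MentionWindow` type, the `lifecycle_phase` function, and the phase names other than 'declining' are all hypothetical and are not taken from the tracker.

```python
from dataclasses import dataclass


@dataclass
class MentionWindow:
    """Mention counts as reported in the observation (field names are hypothetical)."""
    last_3d: int
    last_14d: int
    total: int


def lifecycle_phase(w: MentionWindow, min_total: int = 10) -> str:
    """Classify an entity's lifecycle phase from its mention counts.

    Assumed rule, not the tracker's actual logic:
    - fewer than `min_total` mentions overall -> "emerging"
    - no mentions in 3 days and at most one in 14 days -> "declining"
    - anything else -> "active"
    """
    if w.total < min_total:
        return "emerging"
    if w.last_3d == 0 and w.last_14d <= 1:
        return "declining"
    return "active"


# Counts quoted in the observation: 0 mentions/3d, 1/14d, 20 total.
gemma4 = MentionWindow(last_3d=0, last_14d=1, total=20)
print(lifecycle_phase(gemma4))
```

Under these assumed thresholds the reported counts fall into the 'declining' bucket, which matches the label on the page. A different rule could produce the same label.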
Sentiment History
| Week | Avg Sentiment | Mentions |
|---|---|---|
| 2026-W20 | 0.30 | 1 |
| 2026-W21 | 0.60 | 2 |
| 2026-W23 | 0.50 | 1 |
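
The weekly rows can be combined into a single figure by weighting each week's average sentiment by its mention count. This is a minimal sketch that assumes this weighting is the desired aggregate; the page does not publish an overall score, and the `weighted_sentiment` helper is hypothetical. Weeks with no row in the table (such as 2026-W22) are left out.

```python
# Rows copied from the Sentiment History table: (ISO week, avg sentiment, mentions).
rows = [
    ("2026-W20", 0.30, 1),
    ("2026-W21", 0.60, 2),
    ("2026-W23", 0.50, 1),
]


def weighted_sentiment(rows):
    """Return the mention-weighted mean sentiment, or None if there are no mentions."""
    total_mentions = sum(mentions for _, _, mentions in rows)
    if total_mentions == 0:
        return None
    weighted_sum = sum(sentiment * mentions for _, sentiment, mentions in rows)
    return weighted_sum / total_mentions


overall = weighted_sentiment(rows)
print(f"mention-weighted sentiment: {overall:.2f}")
```

Weighting by mentions keeps a single-mention week from counting as much as a busier one. With only four mentions across three weeks, the result is a rough indicator at best.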