Apple Using Custom 1.2T-Parameter Google Model for Siri Overhaul

Apple using custom 1.2T-parameter Google model for Siri, per Reuters. Model larger than Gemini 3.5 Flash's 300B parameters; simple queries run locally.

Ala SMITH & AI Research Desk · May 25, 2026 · 3 min read · AI-Generated

Source: x.com via @kimmonismus (single source)

What model is Apple using for the next Siri overhaul?

Apple is reportedly using a custom 1.2T-parameter Google model for Siri, per Reuters. The model is significantly larger than Gemini 3.5 Flash's estimated 300B parameters, with simple queries expected to run locally.

TL;DR

Apple using 1.2T-parameter Google model for Siri. · Model larger than Gemini 3.5 Flash's 300B parameters. · Simple queries expected to run locally on device.

Apple is reportedly using a custom 1.2T-parameter Google model for Siri, per Reuters. The model, significantly larger than Gemini 3.5 Flash's 300B parameters, will power parts of the next Siri overhaul.

Key facts

Apple using custom 1.2T-parameter Google model for Siri.
Gemini 3.5 Flash estimated at 300 billion parameters.
Simple queries expected to run locally on device.
Reported by Reuters via @kimmonismus.
Next Siri overhaul expected at WWDC 2026.

Apple is not merely adding Gemini to Siri—it is reportedly using a custom 1.2T-parameter Google model as the brain behind parts of the next Siri overhaul, according to Reuters. This model is substantially larger than Gemini 3.5 Flash, which is estimated to have around 300 billion parameters.

Key Takeaways

Apple using custom 1.2T-parameter Google model for Siri, per Reuters.
Model larger than Gemini 3.5 Flash's 300B parameters; simple queries run locally.

The Size vs. Speed Trade-Off

The 1.2T parameter count raises immediate questions about performance and latency: Siri has to answer everyday queries quickly. Simple queries are expected to run locally on the device, which would require efficient on-device inference. That is a non-trivial challenge, since a model of this scale is far too large to run on a phone, so local handling would presumably rely on something much smaller.
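To put the scale in perspective, here is a rough back-of-the-envelope sketch of the memory needed just to hold the weights. It assumes weights only (no activations, KV cache, or runtime overhead) and treats the 16-, 8-, and 4-bit precisions as illustrative choices; the report does not say how any of these models are actually stored or served. The parameter counts are the reported or estimated figures from this article.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts as reported or estimated in the article.
MODELS = {
    "Custom Google model (reported 1.2T)": 1.2e12,
    "Gemini 3.5 Flash (estimated 300B)": 300e9,
    "Earlier on-device model (about 3B)": 3e9,
}

# Precisions are assumptions chosen for illustration only.
for name, params in MODELS.items():
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{name}: {bits}-bit weights ~ {gb:,.1f} GB")
```

Even at an aggressive 4 bits per weight, 1.2T parameters come to roughly 600 GB, while a 3B-parameter model fits in about 1.5 GB. That gap is why the large model would have to sit on servers while a much smaller model handles local queries.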

Unique Take: Apple's Strategic Bet on Third-Party Models

The unique angle here is not just that Apple is using a Google model, but that it is deploying a custom, massive model for a consumer-facing assistant. This marks a departure from Apple's historical preference for smaller, on-device models like the 3B-parameter models used in earlier Apple Intelligence features. The 1.2T parameter count suggests Apple is prioritizing capability over latency, at least for server-side queries, and betting that Google's architecture can deliver both speed and accuracy.

Implications for the Assistant Market

This move positions Siri to compete more aggressively with standalone AI assistants like ChatGPT and Claude. The custom Google model could give Siri a significant edge in tasks requiring deep reasoning or broad knowledge, while local handling of simple queries preserves privacy and responsiveness. However, the success hinges on whether the model can run fast enough for real-time interaction—a known pain point for large models in production.
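The split between local and server-side handling can be pictured with a toy router. This is purely a hypothetical sketch: the report does not describe how Apple decides which queries stay on the device, and the keyword list, word-count threshold, and names such as `route_query` and `LOCAL_INTENTS` are invented here for illustration.

```python
from dataclasses import dataclass

# Hypothetical trigger words for commands a small local model might handle.
LOCAL_INTENTS = {"timer", "alarm", "call", "play", "open", "remind"}


@dataclass
class Route:
    target: str  # "on_device" or "server"
    reason: str


def route_query(query: str, max_local_words: int = 8) -> Route:
    """Send short, command-like queries on-device; everything else to the server."""
    words = [w.strip(".,?!") for w in query.lower().split()]
    if len(words) <= max_local_words and LOCAL_INTENTS & set(words):
        return Route("on_device", "short command with a known local intent")
    return Route("server", "open-ended or long query")


print(route_query("Set a timer for ten minutes"))
print(route_query("Compare the tax implications of leasing versus buying a car"))
```

A production system would more likely use a learned classifier or the local model's own confidence than a keyword list, but the trade-off is the same: a cheap, fast local path for simple requests and a slower, more capable server path for the rest.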

What's Next

Apple is expected to unveil more details at WWDC, likely in June 2026. The coming months are also expected to bring GPT-5.6, Sonnet 4.8/Opus 4.8, and Gemini 3.5 Pro, making for a crowded competitive landscape in assistant technology.

What to watch

Watch for WWDC 2026 in June for official details on Siri's capabilities and latency benchmarks. Also track whether Apple discloses the model's performance on standard assistant benchmarks like MMLU or GSM8K.

Sources cited in this article

Reuters

Source: gentic.news · May 25, 2026 · Author: Ala SMITH

AI-assisted reporting. Generated by gentic.news from 3 verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

Apple's reported use of a 1.2T-parameter Google model for Siri represents a strategic shift from its typical on-device, privacy-first approach. The scale suggests Apple is willing to trade some latency for capability, at least for server-side queries. This is notable given that Apple's previous AI initiatives, like Apple Intelligence, used smaller models (around 3B parameters) optimized for on-device inference. The move also signals that Apple sees third-party models as a viable path to catching up with OpenAI and Anthropic, rather than building its own foundation models from scratch.

However, the 1.2T parameter count raises practical questions. Running a model of this size for real-time queries is expensive and may introduce latency issues. Apple's historical strength in hardware-software integration could help: its Neural Engine and custom silicon are designed for efficient inference. But the model's performance on standard benchmarks like MMLU or GSM8K remains unknown, and speed will be critical for user adoption.

This is a high-risk, high-reward bet. If successful, it could make Siri a serious competitor in the assistant market. If not, Apple may stick with smaller models or switch to a different partner. The next few months, leading up to WWDC, will be telling.
