This is how Apple built 'a Siri that’s profoundly more capable' — and yes, it was done with Google and Nvidia's help

Apple WWDC 2026 Siri architecture deep dive
(Image credit: Lance Ulanoff / Future)

When Apple talks about how it used Google's Gemini foundation models to build the all-new Siri, without using the Gemini app, it can start to sound like semantics. But a deep dive with the team that built the Siri we were promised almost two years ago quickly disabuses you of that notion.

"This is the amount of the Google assistant we use, which is none," said Apple's senior vice president of Software Engineering, Craig Federighi, on Monday, just hours after Apple finally unveiled the long-promised Siri during its WWDC 2026 Keynote.

Wearing his trademark tight blue dress shirt, Federighi sat alongside Sebastien Marineau, VP, Software; Amar Subramanya, VP, AI; and Mike Rockwell, VP of engineering, on the small Developer Center stage, a relatively intimate setting compared to the vast outdoor Keynote venue just outside the Apple Park ring.

It was in this darkened hall, with outgoing Apple CEO Tim Cook and his successor, John Ternus, looking on from front-row seats, that Federighi and company dug into the thorny architectural details of building a more personable, contextual, and deeply integrated Siri that spans the Apple ecosystem. They were, in a way, celebrating the late delivery of a promise but also reckoning with the reality of what the tumultuous past 24 months have wrought.

Left to right, Amar Subramanya, Mike Rockwell, Sebastien Marineau, and Craig Federighi. (Image credit: Lance Ulanoff / Future)

At a macro level, Siri is now a vast and complex system that includes one very powerful local, multimodal model and a series of even more powerful cloud-based ones, all of which live in some version of Apple's Private Cloud Compute.

The models feature names like AFM Core, AFM Cloud Pro, and ADM Cloud Images. "Every model is a significant leap based on quality and operation compared to previous generation models," said Subramanya.

I was inclined to agree after seeing demos both during the architecture talk and later during one-on-one demos. Think of Siri AI and the Siri App as Siri unleashed.

Siri reborn

It has, it appears, full knowledge of your first-party Apple app capabilities and can quickly make the leap from a query in one app to the contextual information sucked right out of, say, Messages. It appears to know that the image of a month's worth of planned soccer games you just opened on your desktop is a schedule that it can add to your calendar.

It sees images on the desktop and through the camera. It remembers the context of a conversation and uses a more convincing voice to guide you through the most complex tasks. In a word, this Siri seems smart.

But Apple would not have gotten here without Google, and, it turns out, Nvidia.

Just how involved was Google? Apple makes no secret of its use of Google Gemini foundation models, but the scope of Google's involvement was thrown into stark relief by a schematic Federighi used to explain the inner workings of Siri's architecture.

A model collaboration

Apple WWDC 2026 Siri architecture deep dive

(Image credit: Lance Ulanoff / Future)

As you can see, there are boxes for all the new models and system components. They are color-coded in just two ways: solid blue for Apple's own builds, and a mix of blue and white for models co-developed by Apple and Google. Every single model is co-developed. Apple's solo work is largely in what sits on top of all of this.

Here's how Apple explained the clockwork to us. The system starts, naturally, with speech recognition, which produces the query text. After that, it's the job of the all-important System Orchestrator to build a prompt and send it to the foundation models. It's also at this stage that Apple's system decides whether the query will be handled on the device by the large, 20-billion-parameter AFM Core Advanced model (up from 3 billion on the current Siri model) or be sent to Apple's Private Cloud Compute and one of the larger models there, which include AFM Cloud, AFM Cloud Pro, and ADM Cloud (a diffusion model for image generation).
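
To make that flow concrete, here is a minimal Python sketch of the routing step. The model names come from Apple's presentation; everything else, including the function names, the `task` labels, the `fits_on_device` flag, and the rule that picks a cloud model, is a hypothetical stand-in, since Apple did not describe how the orchestrator actually makes its decision.

```python
from dataclasses import dataclass

# Model names are from the article. The task labels and the mapping from
# task to cloud model are assumptions made for illustration only.
ON_DEVICE_MODEL = "AFM Core Advanced"
CLOUD_MODELS = {
    "text": "AFM Cloud",
    "complex": "AFM Cloud Pro",
    "image": "ADM Cloud",  # the article describes this one as a diffusion model
}


@dataclass
class Route:
    model: str
    on_device: bool
    prompt: str


def build_prompt(query_text: str, context: list[str]) -> str:
    """Combine the recognized query text with app context into one prompt."""
    lines = [f"context: {item}" for item in context]
    lines.append(f"query: {query_text}")
    return "\n".join(lines)


def orchestrate(query_text: str, context: list[str],
                task: str = "text", fits_on_device: bool = True) -> Route:
    """Build the prompt, then pick the local model or a cloud model."""
    prompt = build_prompt(query_text, context)
    # Hypothetical rule: image generation always goes to the cloud,
    # and anything else stays local when the local model can handle it.
    if task != "image" and fits_on_device:
        return Route(ON_DEVICE_MODEL, True, prompt)
    return Route(CLOUD_MODELS[task], False, prompt)
```

Calling `orchestrate("Add these games to my calendar", ["Messages: soccer schedule"])` would stay on the device in this sketch, while passing `task="image"` would route to the cloud diffusion model.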

A smarter way of parsing parameters

One of the big innovations here, and the reason Apple can put such a large model on your iPhone, is in how it handles parameters. Normally, because each query can contain many different requests and require a variety of parameters, all those parameters are loaded into memory at once to meet the demand. That's a huge strain on memory and battery life and, with 20 billion parameters in Apple's AFM Core Advanced model, simply not practical. So they built something called a "sparse model."
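
Apple did not share implementation details, but the cost it is trying to avoid is easy to illustrate. The Python sketch below assumes, purely for illustration, a model whose parameters are split into equal-sized groups, only a few of which are active at once. It counts how many groups must be loaded into memory when the active set is re-chosen for every token versus chosen once and locked in for the whole request. The group counts and the random routing are assumptions, not Apple's numbers or method.

```python
import random


def loads_per_token(num_tokens: int, num_groups: int, active: int, seed: int = 0) -> int:
    """Count group loads when the active set is re-chosen for every token."""
    rng = random.Random(seed)
    resident: set[int] = set()
    loads = 0
    for _ in range(num_tokens):
        chosen = set(rng.sample(range(num_groups), active))
        loads += len(chosen - resident)  # only newly needed groups are loaded
        resident = chosen                # memory holds just the active set
    return loads


def loads_per_request(active: int) -> int:
    """Count group loads when one set is chosen up front and locked in."""
    return active


if __name__ == "__main__":
    print("per token:  ", loads_per_token(num_tokens=100, num_groups=32, active=4))
    print("per request:", loads_per_request(active=4))
```

With 32 groups and 4 active, the per-request approach loads 4 groups no matter how long the response is, while the per-token approach keeps loading new groups as the response grows.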

"Unlike the server models, what Core Advanced does is it looks at the entire request, chooses the right set of parameters, and then locks them in for the entire request. And so you're not having to reload parameters with every token and this dramatically cuts down the cost of loading these parameters," said Subramanya.

Even though these models are co-built with the latest Gemini models and will be updated with future Google Foundation Model work, at no point in that pathway is Google Gemini taking the wheel.

Instead, Apple took the same approach it's taken for most of its innovation partnerships. It identifies the best-in-class component or technology and then has the partner build a bespoke version. In this case, the collaboration is, perhaps, richer, since Apple is co-building these models, but its interest in Google's AI capabilities stops short of the app client.

The customer experience is and should feel completely Apple.

Apple, Google and Nvidia, perfect together

The back end, or cloud side, is a far more collaborative effort than you might expect from Apple. For a company that's built its name on privacy and security, it's been forced to work with third-party partners to wrench their cloud offerings into secure spaces that satisfy both Apple and its customers' demands and expectations of privacy.

The idea of Private Cloud Compute (PCC), originally introduced with Apple Intelligence in 2024, is a cloud space big enough to accommodate models too large for on-device computation, while also replicating the privacy structure found on local devices. That's easier to do when you control all the servers, but in the new world of Siri AI, Apple has opened up PCC to Google and a new Apple Intelligence partner, Nvidia.

To run far more powerful models like AFM Cloud Pro, Apple needed "the latest technology from NVIDIA, and so we set out to extend private cloud compute to third-party cloud," explained Subramanya.

Nvidia was already working on something it called confidential compute, but it didn't meet Apple's stringent PCC criteria. "We set out to design this with Google as a collaboration," said Subramanya. The solution comprises, in part, Nvidia GPUs and redundant security components from Intel and Google.

The moment of truth

In essence, Apple's Private Cloud Compute now lives on Nvidia and Google servers, but Apple execs insist, "Apple devices can only talk to software signed by Apple," meaning that if these systems do not have software signed and verified by Apple, Siri won't connect with them.
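
In rough terms, the device-side rule amounts to a gate like the one in this Python sketch. It is a simplified stand-in, not Apple's protocol: a real system would use public-key signatures and hardware attestation, whereas this example uses a shared-secret HMAC from Python's standard library only so that it runs on its own. The key, function names, and software image bytes are all hypothetical.

```python
import hashlib
import hmac


def sign(signing_key: bytes, software_image: bytes) -> bytes:
    """Produce a signature over the hash of a server's software image."""
    digest = hashlib.sha256(software_image).digest()
    return hmac.new(signing_key, digest, hashlib.sha256).digest()


def may_connect(signing_key: bytes, software_image: bytes, signature: bytes) -> bool:
    """Allow a connection only if the signature matches the software image."""
    expected = sign(signing_key, software_image)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"hypothetical-signing-key"
    released = b"released server software"
    signature = sign(key, released)
    print(may_connect(key, released, signature))                     # signed software
    print(may_connect(key, b"modified server software", signature))  # altered software
```

The point of the design is that the check fails closed: software that has been altered, or that was never signed in the first place, simply never gets a connection.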

This is unquestionably a vastly different Siri than the one you might be using on your iPhone 17 Pro today, but it's also quite similar to what Apple demonstrated but did not deliver in 2024 or 2025. Federighi and company didn't rehash all the hurdles and false starts of the past 24 months, but VP of Engineering Mike Rockwell did offer a rare glimpse into what was clearly a pivotal moment.

"Last year, we had actually built a first version of this that was sort of incremental on top of the original Siri...and we had it working, but we didn't feel it was really delivering on the vision and the experience that we wanted to do, and so we also had a design which required much more extensive changes. And we decided to go with that. And so we went back, and we rebuilt Siri from the ground up," said Rockwell.

What's not clear from this is whether this was the moment Apple realized it couldn't go it alone, and that it needed Google and its powerful Gemini models to fulfill its vision, somehow without letting the Gemini experience take over.

Siri AI is that successful melding of Apple's original vision for artificial intelligence with, perhaps, the best generative models in the business. And like all the best consumer software experiences, you don't have to know how the sausage is made, just that it works exactly as Apple promised and as you want it to.


Lance Ulanoff
Editor At Large

A 38-year industry veteran and award-winning journalist, Lance has covered technology since PCs were the size of suitcases and “on line” meant “waiting.” He’s a former Lifewire Editor-in-Chief, Mashable Editor-in-Chief, and, before that, Editor in Chief of PCMag.com and Senior Vice President of Content for Ziff Davis, Inc. He also wrote a popular, weekly tech column for Medium called The Upgrade.


Lance Ulanoff makes frequent appearances on national, international, and local news programs including Live with Kelly and Mark, the Today Show, Good Morning America, CNBC, CNN, and the BBC. 
