Google Cloud unveils eighth-generation TPUs built to support an agentic era
Next-gen TPUs promise huge performance boosts
- Google unveils next-generation TPUs – splits off into two series, 8t and 8i
- 8t superpods can deliver 121 ExaFlops, up from 42.5 last year
- 8i delivers 3x more SRAM and increased HBM
Google Cloud has announced its eighth-generation Tensor Processing Units (TPUs) designed specifically for the agentic shift we're seeing within AI at the moment.
Revealed at Google Cloud Next 2026, the upgrades focus on longer context windows, multi-step reasoning, and responsiveness at scale; accordingly, Google's cloud infrastructure is being rebuilt to support persistent memory, continuous inference, and multi-model workloads.
This year, we're seeing two distinct TPUs designed to support massive HBM scaling, with Google Cloud placing an emphasis on memory bandwidth as much as compute.
TPU 8t and 8i target trillion-parameter training in million-chip clusters
The first of the two TPUs, 8t, has been optimized to be distributed across huge clusters for training foundation models. The company says its roughly 80% year-over-year improvement in performance per dollar will make training trillion-parameter models more efficient.
Google Cloud explained that a single TPU 8t superpod can scale up to 9,600 chips, delivering 2PB of shared HBM and 121 ExaFlops of compute. For comparison, last year Ironwood was rated at up to 9,216 chips in a superpod and 42.5 ExaFlops.
Google Cloud also warned of "the latency wall" we face in an always-on agentic era, hence the launch of 8i, a second chip which serves as a post-training and inference engine.
TPU 8i sees around a 3x increase in on-chip SRAM to 384MB as well as 288GB of HBM, with pod size now up to 1,152 chips from 256, delivering 11.6 ExaFlops of performance (up from 1.2 ExaFlops).
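Because pod sizes grew alongside total compute, the headline ExaFlops figures mix two effects. Normalizing the quoted numbers per chip separates silicon uplift from pod growth; this is an illustrative back-of-envelope using only the figures reported above:

```python
# Back-of-envelope per-chip comparison using only the pod figures
# quoted in this article (all values as reported by Google Cloud).

# TPU 8t superpod vs last year's Ironwood superpod
chips_8t, exaflops_8t = 9_600, 121.0
chips_iw, exaflops_iw = 9_216, 42.5

# TPU 8i pod vs its predecessor's pod
chips_8i, exaflops_8i = 1_152, 11.6
chips_prev, exaflops_prev = 256, 1.2

# Per-chip compute in petaFLOPs, so pod-size growth doesn't mask the uplift
per_chip_8t = exaflops_8t / chips_8t * 1_000       # ~12.6 PF per chip
per_chip_iw = exaflops_iw / chips_iw * 1_000       # ~4.6 PF per chip
per_chip_8i = exaflops_8i / chips_8i * 1_000       # ~10.1 PF per chip
per_chip_prev = exaflops_prev / chips_prev * 1_000 # ~4.7 PF per chip

print(f"8t per-chip uplift: {per_chip_8t / per_chip_iw:.2f}x")   # ~2.73x
print(f"8i per-chip uplift: {per_chip_8i / per_chip_prev:.2f}x") # ~2.15x
```

In other words, most of the 8i pod's roughly 10x total jump comes from the 4.5x larger pod, with the rest from faster individual chips.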
As for energy and thermal efficiency, Google Cloud boasts of up to 2x better performance-per-watt over Ironwood, the predecessor.
"We['ve] innovated across hardware and software to enable our data centers to deliver six times more computing power per unit of electricity than they did just five years ago," SVP and Chief Technologist for AI and Infrastructure Amin Vahdat explained.
General availability for Google Cloud customers is expected in the coming months, and naturally, TPU 8t and TPU 8i will power the latest Gemini models.
The company also sees the eighth-gen hardware playing a role in developing the next frontier models by distributing training beyond a single superpod, using Pathways and JAX to unlock scaling past one million TPU chips in a single training cluster. Execs confirmed at the event that such a scale is currently entirely theoretical (though technically possible), with the TPUs yet to be made available at that size.
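The article doesn't detail how Pathways schedules work across superpods, but the programming model it builds on can be illustrated with a minimal JAX device-mesh sketch: the same program runs unchanged whether the mesh holds one CPU (as here) or thousands of TPU chips. All names and shapes below are illustrative assumptions, not Google's actual training code:

```python
# Minimal sketch of JAX's device-mesh sharding model (illustrative only).
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec

# Build a 1-D "data" mesh over whatever devices are visible.
# On a laptop this is a single CPU device; on a Cloud TPU slice it
# would span every chip, with no change to the code below.
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

# Shard the batch dimension across the mesh; weights stay replicated.
batch_sharding = NamedSharding(mesh, PartitionSpec("data"))

x = jax.device_put(jnp.arange(16.0).reshape(8, 2), batch_sharding)
w = jnp.ones((2,))

@jax.jit  # XLA compiles one program; the runtime handles cross-device comms
def loss(w, x):
    return jnp.mean((x @ w) ** 2)

grads = jax.grad(loss)(w, x)
print(grads.shape)  # (2,)
```

The design point is that sharding is declared once on the data, and the compiler inserts the collective communication, which is why scaling past a single pod is a runtime and scheduling problem (Pathways' job) rather than a rewrite of the model code.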
With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!
