Could Google's AI supercomputer outperform the Nvidia A100 chip in speed and sustainability?


Google has announced that the supercomputers it uses to train its artificial intelligence (AI) models are faster and more power-efficient than comparable systems from Nvidia.

Google's custom-designed Tensor Processing Unit (TPU) chip, now in its fourth generation, handles more than 90% of the company's AI training work, making it a crucial part of the company's technology. Google has now published a scientific paper describing how it connected more than 4,000 of these chips, using its custom-developed optical switches, to create a supercomputer.

Companies that build AI supercomputers are competing to improve the connections between the thousands of chips needed to train the large language models that power technologies like Google's Bard or OpenAI's ChatGPT. According to Google, its supercomputers make it easy to reconfigure connections between chips on the fly, which can help avoid problems and improve performance.

Reconfiguring connections

In a recent blog post, Google Fellow Norm Jouppi and Google Distinguished Engineer David Patterson wrote, "Circuit switching makes it easy to route around failed components. This flexibility even allows us to change the topology of the supercomputer interconnect to accelerate the performance of an ML (machine learning) model." The company's largest publicly disclosed language model, PaLM, was trained by splitting it across two of the 4,000-chip supercomputers over a period of 50 days.

Google's chips are up to 1.7 times faster and 1.9 times more power-efficient than a system based on Nvidia's A100 chip that was on the market at the same time as the fourth-generation TPU, according to the company's scientific paper. While Google did not compare its fourth-generation TPU to Nvidia's current flagship H100 chip, the company hinted that it may be working on a new TPU that would compete with the Nvidia H100, stating that it has "a healthy pipeline of future chips."

Aloysius Valentine

Aloysius Ejike Ukejeh is a seasoned tech and virtual private network writer. He has over 5 years of experience in the technology industry, focusing on streaming, web hosting, security, and privacy. Aloysius is an expert in virtual private networks (VPNs), and he frequently writes about the latest news and developments in this area. He is also a strong advocate for online privacy and security. In his free time, Aloysius reads about the latest technology news.