A tiny startup has helped Intel trounce AMD and Nvidia in critical AI tests — is it game over already?

A profile of a human brain against a digital background.
Image credit: geralt on Pixabay

Numenta has demonstrated that Intel Xeon CPUs can vastly outperform the best CPUs and best GPUs on AI workloads by applying a novel approach to them.

Using a set of techniques based on this approach, branded under the Numenta Platform for Intelligent Computing (NuPIC) label, the startup has unlocked new levels of AI inference performance on conventional CPUs, according to Serve the Home.

Most strikingly, the approach can apparently outperform GPUs and CPUs designed specifically for AI inference. For example, Numenta took a workload for which Nvidia had reported performance figures for its A100 GPU, and ran it on an augmented 48-core 4th-Gen Sapphire Rapids CPU. In every scenario, the Intel chip delivered higher total throughput than Nvidia’s. In fact, it was 64 times faster than a 3rd-Gen Intel Xeon processor and ten times faster than the A100 GPU.

Boosting AI performance with neuroscience

Numenta, known for its neuroscience-inspired approach to AI workloads, leans heavily on the idea of sparse computing – which is how the brain forms connections between neurons. 

Most CPUs and GPUs today are designed for dense computing, especially for AI, which is more of a brute-force approach than the contextual manner in which the brain works. Although sparsity is a surefire way to boost performance, CPUs don’t handle sparse workloads well on their own. This is where Numenta steps in.
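
To make the dense-versus-sparse distinction concrete, here is a minimal sketch in plain Python. It is not Numenta's method, and the weights, input and function names are made up for illustration. It only shows that when most weights are zero, a sparse routine can skip them and do far fewer multiplications for the same result.

```python
def dense_matvec(weights, x):
    """Multiply every weight by its input, zeros included."""
    ops = 0
    out = []
    for row in weights:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi
            ops += 1
        out.append(acc)
    return out, ops


def to_sparse(weights):
    """Keep only (column, value) pairs for the nonzero weights in each row."""
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in weights]


def sparse_matvec(sparse_weights, x):
    """Multiply only the stored nonzero weights."""
    ops = 0
    out = []
    for row in sparse_weights:
        acc = 0.0
        for j, w in row:
            acc += w * x[j]
            ops += 1
        out.append(acc)
    return out, ops


# Hypothetical 3x4 weight matrix in which 9 of the 12 entries are zero.
weights = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.5],
    [3.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_out, dense_ops = dense_matvec(weights, x)
sparse_out, sparse_ops = sparse_matvec(to_sparse(weights), x)

print(dense_out, dense_ops)    # same outputs, 12 multiplications
print(sparse_out, sparse_ops)  # same outputs, 3 multiplications
```

The catch on real hardware is that the irregular memory access of the sparse version tends to fit poorly with processors tuned for dense arithmetic, which is the gap the article says Numenta is trying to close.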

The startup looks to unlock the efficiency gains of sparse computing in AI models by applying its “secret sauce” to general-purpose CPUs rather than chips built specifically to handle AI-centric workloads.

Although its technology can work on both CPUs and GPUs, Numenta adopted Intel Xeon CPUs and made use of their Advanced Vector Extensions 512 (AVX-512) and Advanced Matrix Extensions (AMX), because Intel’s chips were the most available at the time.

These are extensions to the x86 architecture – serving as additional instruction sets that can allow CPUs to perform more demanding functions. 
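
For readers wondering whether a given server exposes these extensions, the following is a rough sketch that assumes a Linux machine, where the kernel lists CPU feature flags in /proc/cpuinfo. The function name and the sample text are illustrative only; the flag names (avx512f, amx_tile, amx_bf16, amx_int8) are the ones the Linux kernel reports.

```python
from pathlib import Path

# Flag names as reported by the Linux kernel in /proc/cpuinfo.
FLAGS_OF_INTEREST = ("avx512f", "amx_tile", "amx_bf16", "amx_int8")


def detect_extensions(cpuinfo_text):
    """Return {flag: bool} based on the first 'flags' line in cpuinfo_text."""
    present = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            present = set(line.split(":", 1)[1].split())
            break
    return {flag: flag in present for flag in FLAGS_OF_INTEREST}


# Made-up sample input, so the sketch runs on any operating system.
sample = "processor\t: 0\nflags\t\t: fpu sse2 avx2 avx512f amx_tile amx_bf16\n"
print(detect_extensions(sample))

# On a Linux host, inspect the real CPU as well.
cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    print(detect_extensions(cpuinfo.read_text()))
```

A machine that reports none of these flags can still run ordinary x86 code, but it lacks the specific instructions the article describes Numenta as using.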

Numenta delivers its NuPIC service using Docker containers, and it can run on a company’s own servers. Should it work in practice, it would be an ideal way to repurpose CPUs already deployed in data centers for AI workloads, especially in light of lengthy wait times for Nvidia’s industry-leading A100 and H100 GPUs.

Keumars Afifi-Sabet
Channel Editor (Technology), Live Science

Keumars Afifi-Sabet is the Technology Editor for Live Science. He has written for a variety of publications including ITPro, The Week Digital and ComputerActive. He has worked as a technology journalist for more than five years, having previously held the role of features editor with ITPro. In his previous role, he oversaw the commissioning and publishing of long form in areas including AI, cyber security, cloud computing and digital transformation.