IBM has built a cost-effective AI supercomputer in its cloud


IBM’s cost-effective AI supercomputer has been up and running for several months, but the company has only recently disclosed tangible details about the project, known as Vela.

In a blog post, IBM said the research, authored by five of its employees, tackles the shortcomings of earlier supercomputers, chiefly their lack of readiness for AI tasks.

The company also sheds some light on the decisions it made to adapt the supercomputer model to this kind of workload, including its use of affordable but powerful hardware.


IBM's Vela AI supercomputer

The work highlights that “building a [traditional] supercomputer has meant bare metal nodes, high-performance networking hardware… parallel file systems, and other items usually associated with high-performance computing (HPC).” 

Such supercomputers can clearly handle heavy AI workloads, including the one designed for OpenAI, the startup behind the popular ChatGPT chatbot. But because they are not optimized for AI, traditional supercomputers can fall short on power where it matters and be over-provisioned in other areas, leading to unnecessary spending.

While it has long been accepted that bare metal nodes are ideal for AI, IBM wanted to explore offering them inside a virtual machine (VM). The result, according to Big Blue, is huge performance gains.

“Following a significant amount of research and discovery, we devised a way to expose all of the capabilities on the node (GPUs, CPUs, networking, and storage) into the VM so that the virtualization overhead is less than 5%, which is the lowest overhead in the industry that we’re aware of.”
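To make the "less than 5%" figure concrete, here is a minimal sketch of how a virtualization overhead of this kind is commonly expressed: the fraction of bare-metal throughput lost when the same workload runs inside a VM. The function name and the throughput numbers are hypothetical and chosen purely for illustration; IBM has not said this is how it measured the figure.

```python
def virtualization_overhead(bare_metal_throughput: float, vm_throughput: float) -> float:
    """Return the fraction of bare-metal throughput lost inside a VM."""
    if bare_metal_throughput <= 0:
        raise ValueError("bare_metal_throughput must be positive")
    return (bare_metal_throughput - vm_throughput) / bare_metal_throughput


# Hypothetical measurements (e.g. training samples per second), not IBM's data.
bare_metal = 1000.0
in_vm = 960.0

overhead = virtualization_overhead(bare_metal, in_vm)
print(f"overhead: {overhead:.1%}")  # prints "overhead: 4.0%"
```

With these made-up numbers the VM loses 4% of bare-metal throughput, which would fall under the 5% ceiling IBM describes.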

In terms of node design, Vela is packed with 80GB of GPU memory, 1.5TB of DRAM, and four 3.2TB NVMe storage drives.

The Next Platform estimates that, if IBM wanted to feature its supercomputer in the Top500 rankings, it would deliver around 27.9 petaflops of performance, which would have put it in 15th place in the November 2022 rankings.

While today’s supercomputers can already handle AI workloads, rapid developments in artificial intelligence, combined with the pressing need for cost efficiency, highlight the need for such a machine.

Craig Hale

With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. As an avid bargain-hunter, you can be sure that any deal Craig finds is top value!
