Samsung and Sandisk are set to integrate rival HBF technology into AI products from Nvidia, AMD, and Google within 24 months, and that's a huge deal

Professor Kim Joungho of the Department of Electrical and Electronic Engineering at the Korea Advanced Institute of Science and Technology (KAIST) presents at the Small and Medium Business Future Forum in Yangjae-dong, Seocho-gu, Seoul
(Image credit: Go Myeong-hun)

  • HBF offers roughly ten times the capacity of HBM but remains slower than DRAM
  • GPUs will access larger data sets through tiered HBM-HBF memory
  • Writes on HBF are limited, requiring software to focus on reads

The explosion of AI workloads has placed unprecedented pressure on memory systems, forcing companies to rethink how they deliver data to accelerators.

High-bandwidth memory (HBM) has served as a fast cache for GPUs, allowing AI tools to read and process key-value (KV) data efficiently.

However, HBM is fast but expensive and limited in capacity, while high-bandwidth flash (HBF) offers far greater capacity at lower speeds.

How HBF complements HBM

HBF’s design lets GPUs reach a far larger data set, but it tolerates only a limited number of writes, roughly 100,000 per module, so software must prioritize reads over writes.

HBF will integrate alongside HBM near AI accelerators, forming a tiered memory architecture.

Professor Kim Joungho of KAIST compares HBM to a bookshelf at home, holding a few books that can be grabbed quickly, while HBF functions like a library with far more content but slower access.

“For a GPU to perform AI inference, it must read variable data called the KV cache from the HBM. Then, it interprets this and spits out word by word, and I think it will utilize the HBF for this task,” said Professor Kim.

“HBM is fast, HBF is slow, but its capacity is about 10 times larger. However, while HBF has no limit on the number of reads, it has a limit on the number of writes, about 100,000. Therefore, when OpenAI or Google write programs, they need to structure their software so that it focuses on reads.”
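To make that read-versus-write asymmetry concrete, here is a minimal, hypothetical sketch of the kind of tiering Professor Kim describes: a small, fast "HBM" pool backed by a large, write-limited "HBF" pool, where reads are unrestricted and only evictions from the fast tier spend the flash write budget. This is not vendor code; the class names, capacities, and the 100,000-write budget are illustrative assumptions taken from the figures quoted above.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: fast, small 'HBM' tier over a large, write-limited 'HBF' tier."""

    def __init__(self, hbm_entries=1_000, hbf_write_budget=100_000):
        self.hbm = OrderedDict()              # small, fast tier, kept in LRU order
        self.hbf = {}                         # large, slower tier with limited write endurance
        self.hbm_entries = hbm_entries
        self.hbf_writes_left = hbf_write_budget

    def read(self, key):
        """Reads are unlimited: serve from HBM when possible, otherwise fall back to HBF."""
        if key in self.hbm:
            self.hbm.move_to_end(key)         # keep hot entries in the fast tier
            return self.hbm[key]
        return self.hbf.get(key)              # slower path, but no endurance cost

    def write(self, key, value):
        """Writes land in HBM first; only evictions to HBF spend the write budget."""
        self.hbm[key] = value
        self.hbm.move_to_end(key)
        if len(self.hbm) > self.hbm_entries:
            if self.hbf_writes_left <= 0:
                raise RuntimeError("HBF write budget exhausted; spill elsewhere or drop")
            old_key, old_value = self.hbm.popitem(last=False)   # evict the coldest entry
            self.hbf[old_key] = old_value
            self.hbf_writes_left -= 1
```

The pattern fits how a KV cache behaves during inference: each token's keys and values are written once but re-read on every subsequent decoding step, so the workload is naturally read-dominated.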

HBF is expected to debut with HBM6, where multiple HBM stacks interconnect in a network, increasing both bandwidth and capacity.

The concept envisions future iterations like HBM7 functioning as a “memory factory,” where data can be processed directly from HBF without detouring through traditional storage networks.

HBF stacks multiple 3D NAND dies vertically, much as HBM stacks DRAM dies, and connects them with through-silicon vias (TSVs).

A single HBF unit can reach 512GB of capacity and up to 1.638TBps of bandwidth, far beyond what a standard NVMe SSD on PCIe 4.0 can deliver.
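For a rough sense of scale, assuming a typical PCIe 4.0 x4 NVMe link tops out near 7.9GBps (a figure not given in the source), the quoted HBF number works out to roughly a 200-fold bandwidth gap:

```python
# Back-of-the-envelope comparison of the quoted HBF bandwidth against a PCIe 4.0 x4
# NVMe SSD. The PCIe figure is an assumed typical ceiling, not from the article.
hbf_gbps = 1638.0      # 1.638TBps quoted per HBF unit, expressed in GB/s
pcie4_x4_gbps = 7.9    # approximate PCIe 4.0 x4 throughput in GB/s

print(f"HBF is roughly {hbf_gbps / pcie4_x4_gbps:.0f}x faster")  # ~207x
```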

SK Hynix and Sandisk have demonstrated diagrams showing upper NAND layers connected through TSVs to a base logic die, forming a functional stack.

Prototype HBF chips require careful fabrication to avoid warping in the lower layers, and additional NAND stacks would further increase the complexity of the TSV connections.

Samsung Electronics and Sandisk plan to integrate HBF into Nvidia, AMD, and Google AI products within the next 24 months.

SK Hynix will release a prototype later this month, while the companies are also working on standardization through a consortium.

HBF adoption is expected to accelerate in the HBM6 era, and Kioxia has already prototyped a 5TB HBF module over a PCIe Gen 6 x8 interface (64Gbps per lane). Professor Kim predicts that the HBF market could surpass HBM by 2038.

Via Sisajournal (originally in Korean)



Efosa Udinmwen
Freelance Journalist

