Microsoft-backed hardware startup launches its first AI processor, which runs inference without GPUs or expensive HBM, and a key Nvidia partner is collaborating with it

  • Microsoft-backed startup introduces GPU-free alternatives for generative AI
  • DIMC architecture delivers an ultra-high memory bandwidth of 150 TB/s
  • Corsair supports transformers, agentic AI, and interactive video generation

d-Matrix Inc., a hardware startup based in Santa Clara, California, has introduced its first AI processor, Corsair, which is aimed at enhancing AI inference.

d-Matrix is backed by Microsoft. Corsair eschews traditional GPUs and expensive high-bandwidth memory (HBM), delivering significant performance and cost benefits.

Corsair is currently available to early-access customers, with broader availability planned for the second quarter of 2025.

Corsair’s performance redefines AI inference

The Corsair processor is purpose-built to handle demanding AI inference tasks, particularly for generative AI models. For example, it achieves 60,000 tokens per second at 1 ms per token when running Llama3 8B in a single server.

In more resource-intensive scenarios, such as with Llama3 70B models, Corsair delivers 30,000 tokens per second at 2 ms per token in a single rack, translating into substantial savings in energy and operational costs compared to traditional GPU-based solutions.
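
The throughput and latency figures quoted above can be related with some quick arithmetic. The sketch below assumes steady-state generation and treats "ms per token" as the per-user time between tokens, which the article does not state explicitly; the function name is hypothetical. Under those assumptions, Little's law gives the number of token streams being served at once.

```python
def implied_concurrency(tokens_per_second: float, ms_per_token: float) -> float:
    """Little's law: streams in flight = aggregate throughput * per-token latency.

    Assumes steady state and that ms_per_token is the per-stream
    inter-token latency (an assumption, not stated in the article).
    """
    return tokens_per_second * ms_per_token / 1000.0


# Figures quoted in the article
llama3_8b = implied_concurrency(60_000, 1.0)    # single server
llama3_70b = implied_concurrency(30_000, 2.0)   # single rack

print(f"Llama3 8B:  ~{llama3_8b:.0f} concurrent streams")
print(f"Llama3 70B: ~{llama3_70b:.0f} concurrent streams")
```

Read this way, both configurations work out to roughly 60 simultaneous streams, with each stream receiving 1,000 tokens per second on Llama3 8B and 500 tokens per second on Llama3 70B.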

The processor is built on Nighthawk and Jayhawk II tiles, using a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, tailored to support large-model inference with digital in-memory computation (DIMC) and versatile datatype processing, including block floating point (BFP).
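
For readers unfamiliar with block floating point, the following is a minimal generic sketch of the idea: a block of values shares a single exponent, and each value keeps only a small integer mantissa. It is not d-Matrix's actual format; the 8-bit mantissa width, the block contents, and the function names are assumptions chosen for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to one shared exponent plus signed integer mantissas."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp returns (m, e) with 0.5 <= m < 1 and max_abs == m * 2**e,
    # so every value in the block has magnitude below 2**e.
    shared_exp = math.frexp(max_abs)[1]
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to the nearest step and clamp to the signed mantissa range.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits=8):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


exp, mants = bfp_quantize([0.5, -0.25, 0.125, 0.75])
print(exp, mants)
print(bfp_dequantize(exp, mants))
```

Because the exponent is stored once per block rather than once per value, the multiply-accumulate work inside a block reduces to integer arithmetic, which is the general appeal of the format for inference hardware.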

Corsair adopts chiplet packaging, integrating memory and computation to maximize efficiency. It conforms to the industry-standard PCIe Gen5 full-height, full-length card form factor and can be paired with DMX Bridge cards for scalable performance. Each card delivers 2400 TFLOPs of peak 8-bit compute, along with 2GB of integrated performance memory and up to 256GB of off-chip memory capacity.

Micron Technology, a key partner of Nvidia, is also collaborating with d-Matrix.

Corsair was initially set to launch in late 2023, but d-Matrix reconfigured its architecture in response to the surging demand for generative AI. This pivot allowed Corsair to incorporate enhancements tailored for transformer models and emerging applications like agentic AI and interactive video generation.

“We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time,” said Sid Sheth, cofounder and CEO of d-Matrix.

“The first-of-its-kind Corsair compute platform brings blazing fast token generation for high interactivity applications with multiple users, making Gen AI commercially viable,” Sheth added.

Via eeNews

Efosa Udinmwen
Freelance Journalist

Efosa has been writing about technology for over 7 years, initially driven by curiosity but now fueled by a strong passion for the field. He holds both a Master's and a PhD in sciences, which provided him with a solid foundation in analytical thinking. Efosa developed a keen interest in technology policy, specifically exploring the intersection of privacy, security, and politics. His research delves into how technological advancements influence regulatory frameworks and societal norms, particularly concerning data protection and cybersecurity. Upon joining TechRadar Pro, in addition to privacy and technology policy, he is also focused on B2B security products. Efosa can be contacted at this email: udinmwenefosa@gmail.com
