Tiny company steals AMD's thunder and challenges Nvidia with old-tech PCIe AI accelerator that runs 700B LLMs locally, sipping just 240W thanks to decade-old LPDDR4 memory and 28nm chips
Skymizer’s strange chip design could embarrass Nvidia and AMD cards
- Skymizer claims giant AI models no longer need hyperscale GPU infrastructure
- Old 28nm chips suddenly power massive language models at surprisingly low wattage
- The HTX301 squeezes 384 GB of memory into a single PCIe accelerator card
A Taiwanese company called Skymizer has unveiled a PCIe AI accelerator that challenges both AMD and Nvidia using surprisingly old technology.
The HTX301 card can run language models with up to 700 billion parameters on a single device while consuming only 240 watts of power.
The card achieves this feat using older 28-nanometer chips and standard LPDDR4 and LPDDR5 memory instead of expensive HBM or GDDR solutions.
Old tech chip competes with modern AI accelerators
Skymizer claims its card delivers 30 tokens per second while using just 0.5 TOPS of compute and 100 GB/s of memory bandwidth.
The HTX301 is built on Skymizer's HyperThought platform, which features next-generation LPU IP designed specifically for large language model workloads.
Each PCIe card contains six HTX301 chips working together, and the card offers up to 384 GB of total memory capacity.
The design uses efficient compression techniques for both weights and KV cache, and Skymizer says it outperforms the open-source llama.cpp by 9 to 17.8 percent.
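The numbers above make intuitive sense if decode throughput is memory-bandwidth-bound, which is typical for LLM inference. As a rough sanity check (my own back-of-envelope arithmetic, not figures from Skymizer), if each generated token requires streaming the full set of compressed weights from memory once, then tokens per second is approximately bandwidth divided by model size:

```python
# Back-of-envelope estimate for memory-bandwidth-bound LLM decode.
# Assumption (not from the article): generating each token streams all
# model weights from memory once, so tokens/s ~= bandwidth / model size.

def est_tokens_per_sec(params_billion: float, bits_per_weight: float,
                       bandwidth_gb_s: float) -> float:
    """Estimate decode tokens/sec for a memory-bound accelerator."""
    model_size_gb = params_billion * bits_per_weight / 8  # weights in GB
    return bandwidth_gb_s / model_size_gb

# A 7B model at 4-bit quantization over 100 GB/s of bandwidth:
print(round(est_tokens_per_sec(7, 4, 100), 1))    # ~28.6 tok/s

# A 700B model at 4-bit needs ~350 GB of weights -- it fits in the
# card's 384 GB, but the same arithmetic gives well under 1 tok/s
# per 100 GB/s of bandwidth:
print(round(est_tokens_per_sec(700, 4, 100), 2))  # ~0.29 tok/s
```

Under these assumptions, the quoted 100 GB/s lines up with roughly 30 tokens per second on a small 4-bit model, while a 700-billion-parameter model would run far more slowly unless the six chips' bandwidth aggregates or the compression is much more aggressive.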
Its power consumption sits at less than half of what leading PCIe AI accelerators from AMD and Nvidia typically require.
The card supports agentic AI for coding, automation, and domain-specific workflows without needing hyperscale GPU clusters.
Running large language models in the cloud introduces privacy concerns and unpredictable costs that many organizations find unacceptable.
Upgrading on-premises infrastructure to support massive GPU accelerator platforms often requires expensive redesigns of data center power and cooling systems.
Skymizer's HTX301 offers enterprises a third option that fits into standard air-cooled servers without any infrastructure changes.
The company claims its new technology ends the era of needing hyperscale GPU clusters to run ultra-large LLMs.
The PCIe card form factor allows businesses to scale AI inference on premises while maintaining data sovereignty and predictable infrastructure costs.
Skymizer HTX301 awaits real-world testing
Skymizer will preview the HTX301 at Computex this year, allowing independent verification of its performance numbers.
The specifications of this chip look impressive on paper, but real-world testing will determine whether the card actually delivers its claimed 240 tokens per second on Llama2 7B workloads.
AMD recently launched its Instinct MI350P PCIe card with 144 GB of HBM3E memory and up to 4,600 peak TFLOPS at MXFP4 precision, yet it consumes considerably more power than Skymizer's offering.
Nvidia's RTX PRO 6000 Blackwell consumes roughly 600 watts, more than double what Skymizer's card requires for comparable inference tasks.
Should the HTX301 work as advertised, it could dramatically lower the barrier to entry for on-premises AI infrastructure.
Failure to deliver would place Skymizer among the many startups that could not back up their promises.
Via Wccftech

Efosa has been writing about technology for over 7 years, initially driven by curiosity but now fueled by a strong passion for the field. He holds both a Master's and a PhD in sciences, which provided him with a solid foundation in analytical thinking.