'iPhone of AI': startup first to deliver trillion-plus parameter AI model that works in symbiosis with its very own chip — SambaNova promises 90% savings on inference costs, but take that with a pinch of salt

Although everyone wants in, the deployment of generative AI at scale has proved a significant challenge for large enterprises and government bodies.

Although these organizations recognize the technology's potential to streamline processes, reduce costs, and improve supply chains, concerns about cost, complexity, security, data privacy, model ownership, and regulatory compliance have acted as barriers to adoption.

In a potential breakthrough, SoftBank-funded SambaNova Systems has announced the launch of Samba-1, which it describes as the first trillion-parameter generative AI model for the enterprise. Powered by the SambaNova Suite, Samba-1 is designed to meet enterprise requirements for performance, accuracy, scalability, and total cost of ownership (TCO). The model also promises a 90% reduction in inference costs, although this claim should be approached with caution.

Building the 'iPhone of AI'

Unlike other trillion-parameter models, which are built as single, monolithic entities, Samba-1 uses a Composition of Experts (CoE) architecture. This system aggregates multiple small "expert" models so that they function together as a single large model. According to SambaNova, this approach offers broader knowledge across various topics, high accuracy, and multimodality.

The CoE model can also reportedly provide greater knowledge and accuracy for specialized domains than other large models. Individual smaller models can be trained for specific domains, such as finance, law, physics, or biology, and added to the CoE, bringing high accuracy for that specific domain without the need for training on the entire trillion-parameter model.
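To make the idea concrete, here is a minimal Python sketch of how a composition of experts might dispatch a prompt to one small domain model and let new domains be added without touching the others. It is purely illustrative and is not SambaNova's implementation: the names Expert, CompositionOfExperts, add_expert and route are hypothetical, the keyword-counting router stands in for whatever routing Samba-1 actually uses, and the lambda "answers" stand in for real model inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Expert:
    """A small domain model; `answer` is a placeholder for real inference."""
    domain: str
    keywords: List[str]
    answer: Callable[[str], str]


class CompositionOfExperts:
    """Toy router that presents several small experts as one model."""

    def __init__(self, fallback: Expert) -> None:
        self.fallback = fallback
        self.experts: Dict[str, Expert] = {}

    def add_expert(self, expert: Expert) -> None:
        # Registering a new domain leaves the existing experts unchanged.
        self.experts[expert.domain] = expert

    def route(self, prompt: str) -> Expert:
        # Assumed routing rule: pick the expert with the most keyword hits,
        # falling back to the general expert when nothing matches.
        words = prompt.lower().split()
        best, best_hits = self.fallback, 0
        for expert in self.experts.values():
            hits = sum(words.count(k) for k in expert.keywords)
            if hits > best_hits:
                best, best_hits = expert, hits
        return best

    def generate(self, prompt: str) -> str:
        # Only the selected expert runs for this prompt.
        expert = self.route(prompt)
        return f"[{expert.domain}] {expert.answer(prompt)}"


coe = CompositionOfExperts(Expert("general", [], lambda p: "general answer"))
coe.add_expert(Expert("finance", ["bond", "equity", "interest"],
                      lambda p: "finance answer"))
coe.add_expert(Expert("law", ["contract", "statute", "liability"],
                      lambda p: "law answer"))

print(coe.generate("What drives bond prices when interest rates rise?"))
# prints "[finance] finance answer"
```

In this sketch only the chosen expert runs for a given prompt, which is one plausible reason a composition of small models could be cheaper to serve than a single monolithic model of the same total size.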

The release of Samba-1 follows SambaNova's announcement of the SN40L, an AI chip designed to rival those from AI behemoth Nvidia. The integration of this chip with the Samba-1 model represents a significant step forward, with SambaNova claiming to be the first to deliver an integrated hardware and software system for the enterprise.

“The entire AI industry is talking about building the iPhone of AI - an integrated hardware and software system - and SambaNova is the first to deliver a version of that to the enterprise,” said Rodrigo Liang, Co-founder and CEO of SambaNova Systems. “This past fall, we announced the SN40L, the smartest AI chip, and now we’ve integrated that chip with the first 1T parameter model for the enterprise. Samba-1 rivals GPT-4, however, it’s better suited for the enterprise as it can be delivered on-premises or in private clouds so that customers can fine-tune the model with their private data without ever disclosing it into the public domain.”

Despite the impressive capabilities of Samba-1, SambaNova's claim that the model reduces inference costs by 90% should be taken with a pinch of salt. While the CoE architecture is designed to keep inference costs low, the true value of this saving will only become apparent once the model is deployed in real-world scenarios.

Liang told us, “AI is not a fad, we’re at the start of this journey. Our full-stack solution is focused on large-scale enterprise and government organizations, which no one else can provide on-prem and privately. There’s no escaping how dominant Nvidia is right now, but we’re able to deploy these models at scale for a fraction of the cost.”

Wayne Williams is a freelancer writing news for TechRadar Pro. He has been writing about computers, technology, and the web for 30 years. In that time he wrote for most of the UK’s PC magazines, and launched, edited and published a number of them too.