Cerebras Systems, maker of the world’s largest chip, has announced that its CS-2 system now supports PyTorch and TensorFlow, making it possible for researchers to quickly and easily train models with billions of parameters.
The company’s CS-2 is the world’s fastest AI system and is powered by its Wafer-Scale Engine 2 (WSE-2) processor. With the release of version 1.2 of the Cerebras Software Platform (CSoft), the CS-2 now supports additional machine learning frameworks, giving developers even more choice in the types of models they want to run.
Emad Barsoum, senior director of AI framework at Cerebras Systems, provided further insight in a press release into how CSoft now enables developers to express models written in either TensorFlow or PyTorch, saying:
“From the start, our goal was to seamlessly support whichever machine learning framework our customers wanted to write in. Our customers write in TensorFlow and in PyTorch, and our software stack, CSoft, makes it quick and easy to express your models in the framework of your choice. By doing so, our customers gain access to the 850,000 AI optimized cores and 40 Gigabytes of on-chip memory in the Cerebras CS-2.”
Scaling large language models
CSoft version 1.2 enables developers to write their models in the open-source PyTorch or TensorFlow frameworks and run them on the Cerebras CS-2 without any modification; likewise, an AI model originally written for a GPU or CPU can run in CSoft on the CS-2 unchanged.
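To illustrate what "without any modification" means in practice, here is a minimal sketch of an ordinary PyTorch model and training step. It contains no Cerebras-specific code at all; per the announcement, standard framework code of this kind is what CSoft accepts. The model, data, and hyperparameters below are illustrative placeholders, not anything from Cerebras.

```python
# A standard PyTorch model and one training step, with no hardware-specific
# code. The model and random data here are purely illustrative.
import torch
import torch.nn as nn

# A tiny feed-forward classifier (a stand-in for a larger model such as BERT)
model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 2),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# One ordinary training step on random placeholder data
inputs = torch.randn(8, 128)
labels = torch.randint(0, 2, (8,))
optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

The point of the example is that nothing in it names a target device: the same script that runs on a CPU or GPU is, according to Cerebras, what developers hand to CSoft for the CS-2.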
With the combined power of the CS-2 and CSoft, developers can seamlessly scale from small models such as BERT up to the largest models in existence, like GPT-3.
Training large models on GPUs is challenging and time-consuming: training from scratch on new datasets often takes weeks and tens of megawatts of power on large clusters of legacy equipment. Additionally, as the size of the cluster grows, power, cost, and complexity grow exponentially.
Cerebras Systems built the CS-2 to address these challenges, and its AI system can set up even the largest models in only a few minutes. Because developers spend less time setting up, configuring, and training their models with the CS-2, they can explore more ideas in less time.