Cerebras Systems, maker of the world’s largest chip, announced that its CS-2 system now supports PyTorch and TensorFlow, allowing researchers to quickly and easily train models with billions of parameters.
The company’s CS-2 is the world’s fastest AI system and is powered by its Wafer-Scale Engine 2 (WSE-2) processor. With the release of version 1.2 of the Cerebras Software Platform (CSoft), the CS-2 now supports additional machine learning frameworks, giving developers more choice in the kinds of models they can run.
Emad Barsoum, Senior Director of AI Framework at Cerebras Systems, explained in a press release how CSoft now allows developers to express models written in TensorFlow or PyTorch, saying:
“From the start, our goal was to seamlessly support the machine learning framework that our customers wanted to write in. Our customers write in TensorFlow and in PyTorch, and our software stack, CSoft, makes it possible to express quickly and easily your models in the framework of your choice. In doing so, our customers have access to the 850,000 AI-optimized cores and 40 gigabytes of on-chip memory of the Cerebras CS-2.”
Scaling large language models
CSoft version 1.2 lets developers write their models in the open source PyTorch or TensorFlow frameworks and run them on the Cerebras CS-2 without modification. Likewise, an AI model originally written for a GPU or CPU can run under CSoft on the CS-2 unchanged.
With the combined power of CS-2 and CSoft, developers can scale from small models such as BERT to larger existing models such as GPT-3.
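To make the scale of that range concrete, a rough back-of-envelope calculation illustrates it. The sketch below is not Cerebras code; it is a standard approximation for transformer parameter counts (about 12·d² weights per layer for attention plus feed-forward, plus the embedding table), applied to the published configurations of BERT-base (12 layers, hidden size 768) and GPT-3 (96 layers, hidden size 12,288):

```python
def approx_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough transformer parameter count: attention (~4*d^2) plus
    feed-forward (~8*d^2) weights per layer, plus token embeddings.
    Biases and layer norms are negligible at this scale."""
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

bert_base = approx_params(12, 768, 30522)     # ~108 million
gpt3 = approx_params(96, 12288, 50257)        # ~175 billion
print(f"BERT-base: ~{bert_base / 1e6:.0f}M parameters")
print(f"GPT-3:     ~{gpt3 / 1e9:.0f}B parameters")
```

The estimate lands near the published figures (roughly 110 million parameters for BERT-base, 175 billion for GPT-3), a span of more than three orders of magnitude between the two ends of the range the article describes.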
Training large models on GPUs is difficult and time-consuming: training from scratch on new datasets often takes weeks and tens of megawatts of power on large clusters of legacy equipment. And as cluster size grows, the power, cost, and complexity increase exponentially.
Cerebras Systems built the CS-2 to meet these challenges, and its AI system can set up even the largest models in minutes. Because developers spend less time installing, configuring, and training their models on the CS-2, they can explore more ideas in less time.