Hold on: World's fastest AI chip will massively accelerate AI progress

Whether you see AI as an incredible tool with massive benefits or a societal ill that only benefits massive tools, a powerful new chip can train AI models faster than ever. Cerebras Systems has unveiled the world’s fastest AI chip – the Wafer Scale Engine 3 (WSE-3), which powers the Cerebras CS-3 AI supercomputer with a peak performance of 125 petaFLOPS. And it’s scalable to an insane degree.

Before an AI system can churn out a cute-but-uncanny little video of a cat waking its owner, it needs to be trained on a frankly staggering amount of data, consuming as much energy as more than 100 households in the process. But the new chip, and computers built with it, will help speed up that process and make it more efficient.

Each WSE-3 chip, about the size of a pizza box, packs an astounding four trillion transistors, providing twice the performance of the company’s previous model (which was also the previous world record-holder) for the same cost and power draw. When these are bundled into the CS-3 system, they can apparently provide the performance of a room full of servers within a single unit the size of a mini-fridge.

Cerebras says the CS-3 is running 900,000 AI cores and 44 GB of on-chip SRAM, providing up to 125 petaFLOPS of peak AI performance. In theory that should be enough grunt to land it among the top 10 supercomputers in the world – although of course it hasn’t been tested on those benchmarks, so we can’t be sure how well it would actually perform.

To store all that data, external memory options include 1.5 TB, 12 TB or a massive 1.2 Petabytes (PB), which is 1,200 TB. The CS-3 can train AI models involving up to 24 trillion parameters – by comparison, most AI models are currently in the billions of parameters, with GPT-4 estimated to top out around 1.8 trillion. Cerebras says that the CS-3 should be able to train a one-trillion-parameter model as easily as current GPU-based computers train a measly one-billion-parameter model.
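For a rough sense of those figures, the short Python sketch below redoes the unit conversion and the parameter comparison. It uses only the numbers quoted above, takes decimal units (1 PB = 1,000 TB) as an assumption, and treats the GPT-4 figure as the estimate it is; the variable names are illustrative.

```python
# Back-of-the-envelope check of the memory and model-size figures quoted above.
# Assumption: decimal units, so 1 PB = 1,000 TB.
TB_PER_PB = 1_000

largest_memory_tb = 1.2 * TB_PER_PB     # the 1.2 PB external memory option, in TB

cs3_max_params = 24e12                  # 24 trillion parameters (Cerebras claim)
gpt4_params_estimate = 1.8e12           # ~1.8 trillion (an outside estimate)

# How many times larger the CS-3 ceiling is than the GPT-4 estimate.
size_ratio = cs3_max_params / gpt4_params_estimate

print(f"Largest memory option: {largest_memory_tb:,.0f} TB")
print(f"CS-3 ceiling vs GPT-4 estimate: about {size_ratio:.1f}x")
```

On those numbers, the claimed ceiling is a little over 13 times the size GPT-4 is estimated to be.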

Thanks to the wafer production process of the WSE-3 chips, the CS-3 is designed to be scalable, allowing up to 2,048 units to be clustered together into one barely fathomable supercomputer. This would be capable of up to 256 exaFLOPS, where the top supercomputers in the world are currently still playing with a little over one exaFLOPS. That kind of power would let it train a Llama 70B model from scratch in just one day, the company claims.
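The 256-exaFLOPS figure follows directly from multiplying the per-unit peak by the maximum cluster size. Here is a minimal Python sketch of that arithmetic, assuming perfectly linear scaling across units (a best case, not a measured result); the function name is hypothetical.

```python
# Peak-throughput arithmetic for a full CS-3 cluster.
# Assumption: performance scales perfectly linearly with unit count,
# which gives a theoretical ceiling rather than a benchmarked number.
PETA = 10**15
EXA = 10**18

CS3_PEAK_FLOPS = 125 * PETA   # 125 petaFLOPS per CS-3 (Cerebras figure)
MAX_UNITS = 2048              # maximum cluster size (Cerebras figure)


def cluster_peak_flops(units: int, per_unit: int = CS3_PEAK_FLOPS) -> int:
    """Return the theoretical peak FLOPS of `units` systems combined."""
    return units * per_unit


full_cluster = cluster_peak_flops(MAX_UNITS)
print(f"{full_cluster / EXA:.0f} exaFLOPS")  # prints "256 exaFLOPS"
```

Real clusters lose some throughput to communication and scheduling overhead, so this is an upper bound.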

It already feels like AI models are advancing at a terrifying rate, but this kind of tech is only going to crank the firehose up even higher. No matter what work you do, it seems AI systems will be coming for your jobs and even your hobbies faster than ever.

Source: Cerebras [1],[2]
