News

At over 2,500 t/s, Cerebras claims to have set a world record for LLM inference speed on the 400B parameter Llama 4 ...
Nvidia announced that 8 Blackwell GPUs in a DGX B200 could demonstrate 1,000 tokens per second (TPS) per user on Meta’s Llama ...
A data center in Oklahoma City, Oklahoma, has been sold and looks likely to serve chip firm Cerebras. CoStar reports Scale ...
Cerebras launched its AI inference service last August. Inference refers to the process of running live data through a trained AI model to make a prediction or solve a task, and high performance is ...
Cerebras has fastest time to first answer token for Qwen-3 32B – source: Artificial Analysis. Cerebras has fastest output speed at 2,403 tokens/sec for Qwen-3 32B – source: Artificial Analysis ...
Cerebras Systems is adding six new AI data centers in North America and Europe. This will increase inference capacity to over 40 million tokens per second. The new facilities will be established in ...
Cerebras CEO Andrew Feldman said his hope is to take his company public in 2025 now that the chipmaker has obtained clearance from the U.S. government to sell shares to an entity in the United ...