cerebras
30 articles about cerebras in AI news
Cerebras Claims Performance Parity With Nvidia H100 on AI Training
Cerebras claims wafer-scale chips match Nvidia H100 on AI training performance per watt, challenging Nvidia's dominance.
Cerebras Reengineers Mechanical Playbook for Wafer-Scale Chip Cooling
Cerebras disclosed three mechanical innovations—vertical power delivery, flexible interposers, and direct-impingement cooling—to prevent wafer-scale chips from cracking, rewriting engineering fundamentals.
Cerebras CS4 Stays on 5nm as SRAM Scaling Flattens
Cerebras CS4 stays on 5nm due to SRAM scaling flattening, per @SemiAnalysis_. 3nm offers no density gain, so the chip prioritizes yield and cost.
Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup
Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next-fastest GPU cloud, validated by an independent third party.
Cerebras Challenges Nvidia Inference Monopoly with Wafer-Scale Edge
Cerebras is challenging Nvidia's inference dominance with wafer-scale chips, as inference workloads surpass training in AI compute spend.
Cerebras WSE-3 Claims 10x Training Speed Over Nvidia H100 on GPT-Scale Model
Cerebras claims 10x training speed over Nvidia H100 for GPT-3-scale models using WSE-3. Benchmark lacks power and cost data, limiting independent verification.
Cerebras Shares Open at $385, 108% Above $185 IPO Price
Cerebras opened at $385, 108% above the $185 IPO price, raising $5.5B. The $68B market cap prices in aggressive growth against Nvidia's dominance.
Cerebras IPO Challenges GPU Scaling Orthodoxy
Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.
Cerebras Understates On-Chip SRAM by 8x, SemiAnalysis Notes
Cerebras understates on-chip SRAM by 8x per SemiAnalysis, a rare under-specification in chip marketing.
Cerebras' Strategic Partnership Yields Breakthrough AI Training Results
Cerebras Systems' partnership with Abu Dhabi's G42 has produced remarkable AI training benchmarks, achieving results 100x faster than traditional GPU clusters. The collaboration demonstrates the viability of wafer-scale computing for large language model development.
Beyond Nvidia: How OpenAI's Cerebras-Powered Model Redefines AI Hardware Competition
OpenAI's GPT-5.3-Codex-Spark demonstrates real-time coding capabilities on Cerebras hardware, challenging Nvidia's dominance and signaling a new era of specialized AI infrastructure.
NVIDIA, GENCI Launch AI Factory France Compute Access for Startups
NVIDIA and GENCI launched AI Factory France at VivaTech, giving European startups free access to AI supercomputers. The program includes compute, tools, and expert support for NVIDIA Inception members.
Tensordyne Claims 10x Efficiency Gain with Napier Architecture
Tensordyne claims a 10x inference efficiency gain over Nvidia with its Napier generation, but has released no supporting data or independent verification.
Qualcomm Launches AI Data Center Program With Hyperscaler Customer
Qualcomm launched an AI data center program with a major hyperscaler customer, targeting inference workloads. Financial terms and partner identity undisclosed.
NVIDIA Blackwell Sweeps MLPerf Training 6.0, GB300 Hits 1.6x Speedup
NVIDIA Blackwell swept MLPerf Training 6.0 across all seven benchmarks. GB300 NVL72 delivered 1.6x speedup over GB200 NVL72 using NVFP4 and 8,192 GPUs.
CoreWeave Beats AWS, Google to First Vera Rubin Rack-Scale Validation
CoreWeave validated Nvidia's Vera Rubin NVL72 at rack scale before hyperscalers, reinforcing its GPU-first strategy.
Google and Blackstone Launch TPU Venture, Challenging Nvidia Dominance
Google and Blackstone launched a TPU venture, financing AI infrastructure outside the hyperscale cloud model. Enterprise buyers get a standalone alternative to Nvidia-dominated GPU clusters.
NVIDIA Vera Rubin NVL72 Cuts Agentic AI Cost 10x vs Blackwell
NVIDIA Vera Rubin NVL72 cuts agentic AI inference cost 10x vs Blackwell, per Huang at Dell event. 5,000 enterprises already on Dell factories.
SemiAnalysis: Perplexity Slack Bot Beats Claude in Internal Trial
SemiAnalysis found that Perplexity's Slack bot beat Claude in an internal trial. 96% of its token budget goes to Anthropic, but usage may shift.
Cerebras' Tokenomics Bet: AWS, OpenAI Deals and Wafer-Scale Edge
Cerebras' tokenomics pricing and AWS/OpenAI partnerships challenge NVIDIA's inference dominance, offering a 5x cost reduction per token via its wafer-scale architecture.
Perplexity Claims 3x Blackwell Inference Throughput for 70B Models
Perplexity AI claims 3x inference throughput for 70B models on Nvidia Blackwell GPUs via FP4 and custom scheduling. The gain exceeds Nvidia's own 2x marketing claim.
Inference shift opens door for AI chip startups to challenge Nvidia
The shift in AI compute from training to serving creates openings for AI chip startups. Nvidia's $20B Groq acquihire validates disaggregated compute strategies.
Google Opens TPU Sales to Select Customers, Raises Capex Forecast
Google is selling TPUs to select customers and has raised its capex forecast for Q1 FY2026, monetizing in-house chips beyond Cloud.
Nvidia B200 Costs $6,400 to Produce, Gross Margin Hits 82%
Epoch AI estimates Nvidia's B200 GPU costs $5,700–$7,300 to produce, with HBM memory and advanced packaging accounting for two-thirds of the cost. At a $30k–$40k sale price, chip-level gross margins reach ~82%, though rack-scale margins may be lower.
AI Chip Capacity Crisis: 10GW Left Through 2030, Prices Up Double Digits
The AI accelerator market has only 10 gigawatts of capacity left for contract through 2030, with 100GW already under contract. Prices are rising double digits as one competitor has stopped taking orders entirely.
Google's Virgo Network Links 134,000 TPU v8 Chips with 47 Pbps Fabric
Google unveiled its Virgo networking stack for TPU v8, capable of linking 134,000 chips in a single fabric with 47 petabits/sec of bisection bandwidth. This represents a massive scale-up in interconnect technology for large-scale AI model training.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
Nvidia's Silicon Photonics Roadmap Targets AI Data Center Bottlenecks
Nvidia is developing its own silicon photonics-based interconnects to address the growing data transfer bottleneck within AI data centers and supercomputers. This move is critical as AI model size and cluster scale continue to grow exponentially.
Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'
Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.
Jensen Huang: Nvidia is a 'Computing Company,' Not a Car
Nvidia CEO Jensen Huang, in a new interview, argued that Nvidia is a 'computing company' and not a car—a product that can be easily interchanged. This distinction underscores Nvidia's strategy to be the indispensable platform for AI infrastructure.