gentic.news — AI News Intelligence Platform

cerebras

30 articles about Cerebras in AI news

Cerebras Claims Performance Parity With Nvidia H100 on AI Training

Cerebras claims wafer-scale chips match Nvidia H100 on AI training performance per watt, challenging Nvidia's dominance.

92% relevant

Cerebras Reengineers Mechanical Playbook for Wafer-Scale Chip Cooling

Cerebras disclosed three mechanical innovations—vertical power delivery, flexible interposers, and direct-impingement cooling—to prevent wafer-scale chips from cracking, rewriting engineering fundamentals.

88% relevant

Cerebras CS4 Stays on 5nm as SRAM Scaling Flattens

Cerebras CS4 stays on 5nm due to SRAM scaling flattening, per @SemiAnalysis_. 3nm offers no density gain, so the chip prioritizes yield and cost.

85% relevant

Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup

Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next-fastest GPU cloud, validated by an independent third party.

93% relevant

Cerebras Challenges Nvidia Inference Monopoly with Wafer-Scale Edge

Cerebras is challenging Nvidia's inference dominance with wafer-scale chips, as inference workloads surpass training in AI compute spend.

70% relevant

Cerebras WSE-3 Claims 10x Training Speed Over Nvidia H100 on GPT-Scale Model

Cerebras claims 10x training speed over Nvidia H100 for GPT-3-scale models using WSE-3. Benchmark lacks power and cost data, limiting independent verification.

64% relevant

Cerebras Shares Open at $385, 108% Above $185 IPO Price

Cerebras opened at $385, 108% above the $185 IPO price, raising $5.5B. The $68B market cap prices in aggressive growth against Nvidia's dominance.

100% relevant

Cerebras IPO Challenges GPU Scaling Orthodoxy

Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.

98% relevant

Cerebras Understates On-Chip SRAM by 8x, SemiAnalysis Notes

Cerebras understates on-chip SRAM by 8x per SemiAnalysis, a rare under-specification in chip marketing.

75% relevant

Cerebras' Strategic Partnership Yields Breakthrough AI Training Results

Cerebras Systems' partnership with Abu Dhabi's G42 has produced remarkable AI training benchmarks, achieving results 100x faster than traditional GPU clusters. The collaboration demonstrates the viability of wafer-scale computing for large language model development.

85% relevant

Beyond Nvidia: How OpenAI's Cerebras-Powered Model Redefines AI Hardware Competition

OpenAI's GPT-5.3-Codex-Spark demonstrates real-time coding capabilities on Cerebras hardware, challenging Nvidia's dominance and signaling a new era of specialized AI infrastructure.

75% relevant

NVIDIA, GENCI Launch AI Factory France Compute Access for Startups

NVIDIA and GENCI launched AI Factory France at VivaTech, giving European startups free access to AI supercomputers. The program includes compute, tools, and expert support for NVIDIA Inception members.

90% relevant

Tensordyne Claims 10x Efficiency Gain with Napier Architecture

Tensordyne claims a 10x inference efficiency gain over Nvidia with its Napier generation, but has published no supporting data or independent verification.

85% relevant

Qualcomm Launches AI Data Center Program With Hyperscaler Customer

Qualcomm launched an AI data center program with a major hyperscaler customer, targeting inference workloads. Financial terms and partner identity undisclosed.

85% relevant

NVIDIA Blackwell Sweeps MLPerf Training 6.0, GB300 Hits 1.6x Speedup

NVIDIA Blackwell swept MLPerf Training 6.0 across all seven benchmarks. GB300 NVL72 delivered 1.6x speedup over GB200 NVL72 using NVFP4 and 8,192 GPUs.

100% relevant

CoreWeave Beats AWS, Google to First Vera Rubin Rack-Scale Validation

CoreWeave validated Nvidia's Vera Rubin NVL72 at rack scale before hyperscalers, reinforcing its GPU-first strategy.

94% relevant

Google and Blackstone Launch TPU Venture, Challenging Nvidia Dominance

Google and Blackstone launched a TPU venture, financing AI infrastructure outside the hyperscale cloud model. Enterprise buyers get a standalone alternative to Nvidia-dominated GPU clusters.

85% relevant

NVIDIA Vera Rubin NVL72 Cuts Agentic AI Cost 10x vs Blackwell

NVIDIA Vera Rubin NVL72 cuts agentic AI inference cost 10x vs Blackwell, per Huang at Dell event. 5,000 enterprises already on Dell factories.

95% relevant

SemiAnalysis: Perplexity Slack Bot Beats Claude in Internal Trial

SemiAnalysis found that Perplexity's Slack bot beat Claude in an internal trial. 96% of its token budget goes to Anthropic, but usage may shift.

75% relevant

Cerebras' Tokenomics Bet: AWS, OpenAI Deals and Wafer-Scale Edge

Cerebras' tokenomics pricing and AWS/OpenAI partnerships challenge NVIDIA's inference dominance, offering a 5x cost reduction per token via its wafer-scale architecture.

89% relevant

Perplexity Claims 3x Blackwell Inference Throughput for 70B Models

Perplexity AI claims 3x inference throughput for 70B models on Nvidia Blackwell GPUs via FP4 and custom scheduling. The gain exceeds Nvidia's own 2x marketing claim.

85% relevant

Inference shift opens door for AI chip startups to challenge Nvidia

The shift in AI compute from training to inference serving creates opportunities for AI chip startups. Nvidia's $20B Groq acquihire validates disaggregated compute strategies.

96% relevant

Google Opens TPU Sales to Select Customers, Raises Capex Forecast

Google is selling TPUs to select customers and raising its capex forecast for Q1 FY2026, monetizing its in-house chips beyond Cloud.

100% relevant

Nvidia B200 Costs $6,400 to Produce, Gross Margin Hits 82%

Epoch AI estimates Nvidia's B200 GPU costs $5,700–$7,300 to produce, with HBM memory and advanced packaging accounting for two-thirds of the cost. At a $30k–$40k sale price, chip-level gross margins reach ~82%, though rack-scale margins may be lower.

100% relevant

AI Chip Capacity Crisis: 10GW Left Through 2030, Prices Up Double Digits

The AI accelerator market has only 10 gigawatts of capacity left for contract through 2030, with 100GW already under contract. Prices are rising double digits as one competitor has stopped taking orders entirely.

97% relevant

Google's Virgo Network Links 134,000 TPU v8 Chips with 47 Pbps Fabric

Google unveiled its Virgo networking stack for TPU v8, capable of linking 134,000 chips in a single fabric with 47 petabits/sec of bisection bandwidth. This represents a massive scale-up in interconnect technology for large-scale AI model training.

100% relevant

DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program

DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.

86% relevant

Nvidia's Silicon Photonics Roadmap Targets AI Data Center Bottlenecks

Nvidia is developing its own silicon photonics-based interconnects to address the growing data transfer bottleneck within AI data centers and supercomputers. This move is critical as AI model size and cluster scale continue to grow exponentially.

86% relevant

Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'

Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.

77% relevant

Jensen Huang: Nvidia is a 'Computing Company,' Not a Car

Nvidia CEO Jensen Huang, in a new interview, argued that Nvidia is a 'computing company' and not a car—a product that can be easily interchanged. This distinction underscores Nvidia's strategy to be the indispensable platform for AI infrastructure.

85% relevant