mlperf

27 articles about MLPerf in AI news

NVIDIA Blackwell Sweeps MLPerf Training 6.0, GB300 Hits 1.6x Speedup

NVIDIA Blackwell swept all seven MLPerf Training 6.0 benchmarks. GB300 NVL72 delivered a 1.6x speedup over GB200 NVL72 using NVFP4 and 8,192 GPUs.

Jun 16, 2026 · 100% relevant

CoreWeave Trains DeepSeek-V3 in 2 Minutes, Claims MLPerf v6.0 Record

CoreWeave trained DeepSeek-V3 in ~2 minutes on MLPerf v6.0, beating AWS's record by 43% using 11K+ H100 GPUs across 4 data centers.

Jun 16, 2026 · 100% relevant

MLPerf 6.0: NVIDIA Sweeps New Benchmarks, AMD MI355X Within 30% on Select Tests

MLPerf 6.0 results show NVIDIA winning every new benchmark, with its GB300 NVL72 system achieving nearly 3x more throughput than six months ago. AMD's MI355X showed progress, coming within 10-30% on select single-node tests but skipping most new benchmarks.

Apr 7, 2026 · 85% relevant

Nvidia Claims MLPerf Inference v6.0 Records with 288-GPU Blackwell Ultra Systems, Highlights 2.7x Software Gains

MLCommons released MLPerf Inference v6.0 results, introducing multimodal and video model tests. Nvidia set records using 288-GPU Blackwell Ultra systems and achieved a 2.7x performance jump on DeepSeek-R1 via software optimizations alone.

Apr 2, 2026 · 95% relevant

KV Cache Offload Makes Storage the New AI Bottleneck

Storage, driven by KV cache offload and rising SSD costs, is now the primary AI bottleneck per Supermicro and SemiAnalysis.

Jul 25, 2026 · 87% relevant

GPT-5.6 Sol on Cerebras Hits 750 Token/s

GPT-5.6 Sol is claimed to run at 750 tokens/s on Cerebras, but no official data or model release exists. The claim is unverified and needs vendor confirmation.

Jul 18, 2026 · 97% relevant

China's 14nm AI Chip Hits 520 TFLOPS Via Architecture, Not Shrink

China's 14nm AI chip claims 520 TFLOPS and 6.4TB/s bandwidth via software-defined and 3D near-memory architecture, bypassing advanced node restrictions.

Jul 14, 2026 · 100% relevant

NVIDIA Blackwell Cuts DeepSeek V4 Token Costs 5x in One Month

NVIDIA claims its Blackwell inference stack cut DeepSeek V4 token costs 5x in one month, per a newly published report shared by @rohanpaul_ai.

Jun 30, 2026 · 100% relevant

Etched Hits $5B Valuation, $1B in Orders for AI Inference Chip

Etched hits $5B valuation with $1B in orders for TSMC-made inference chips, raising $500M from top investors. The startup targets Nvidia's dominance.

Jun 30, 2026 · 100% relevant

NVIDIA Vera Rubin: One Rack Matches TOP500, 35 EU Labs Deploy

NVIDIA's Vera Rubin NVL72 delivers TOP500-class performance in a single rack, with 35 European labs deploying the system for AI and HPC.

Jun 23, 2026 · 95% relevant

JUPITER Exascale Maps Brain at Cellular Scale on 4,096 Grace Hopper Nodes

JUPITER, Europe's first exascale supercomputer, trained the CytoNet brain model on 6.5 PB in 5 days and also runs climate, 6G, and quantum simulations.

Jun 22, 2026 · 85% relevant

Zalando Introduces MLLM-Based Evaluation for Product Retrieval

Zalando presents a multimodal LLM-based evaluation for product retrieval, aiming to enhance search relevance in e-commerce. This matters as it could set a new standard for assessing AI in retail search.

Jun 21, 2026 · 92% relevant

Intel Targets Nvidia, AMD with New AI Chip Launch by End 2026

Intel plans to launch a new AI data center chip by end of 2026, targeting Nvidia and AMD in the AI infrastructure market.

Jun 20, 2026 · 72% relevant

Canada Deploys Grace Blackwell via $220M Bell-Cohere Deal

Canada's $220M Bell-Cohere deal puts Grace Blackwell on domestic soil for sovereign AI, reducing reliance on US cloud providers.

Jun 20, 2026 · 95% relevant

AWS Beats Cloud Rivals to NVIDIA Blackwell with EC2 G7 — 4.6x AI Inference Gain Over G6

AWS launched EC2 G7 instances on June 19, 2026, becoming the first major cloud to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs. The instances claim 4.6x AI inference performance over G6, backed by 700 Gbps EFA networking and 32 GB GDDR7 per GPU. The move arrives the same week AWS confirme

Jun 18, 2026 · 85% relevant

Lansing AI data center petition hits 20,000 signatures

Over 20,000 signatures oppose a Lansing AI data center tied to Nvidia's Vera Rubin build-out, signaling growing local resistance.

Jun 18, 2026 · 85% relevant

Tensordyne Claims 10x Efficiency Gain with Napier Architecture

Tensordyne claims a 10x inference efficiency gain over Nvidia with its Napier architecture, but the claim lacks supporting data or verification.

Jun 18, 2026 · 85% relevant

Amazon, Nvidia, AMD Lead $310M Odyssey ML Round at $1.45B Valuation

Odyssey ML raised $310M at a $1.45B valuation from Amazon, Nvidia, and AMD to build 3D world models that simulate physics beyond LLMs.

Jun 17, 2026 · 96% relevant

Cerebras Claims Performance Parity With Nvidia H100 on AI Training

Cerebras claims wafer-scale chips match Nvidia H100 on AI training performance per watt, challenging Nvidia's dominance.

Jun 13, 2026 · 92% relevant

China Launches Photonics Lab to Bypass US Chip Curbs on AI

China launched a photonics lab to bypass US chip curbs and develop energy-efficient AI computing using light instead of electrons.

Jun 12, 2026 · 95% relevant

mlx-vlm v0.6.2 Adds Gemma 4 QAT Support for Local GPUs

mlx-vlm v0.6.2 adds launch-day support for Google DeepMind's Gemma 4 QAT checkpoints, enabling local inference on consumer GPUs and edge devices with video input for the 12B model.

Jun 5, 2026 · 100% relevant

Google’s Virgo network interconnects 134K TPUv8t chips at 47 Pbps

Google's Virgo network interconnects 134,400 TPUv8t chips at 47 Pbps, targeting large-scale training clusters.

Jun 3, 2026 · 100% relevant

Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup

Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next GPU cloud, validated by an independent third party.

May 23, 2026 · 93% relevant

Perplexity Claims 3x Blackwell Inference Throughput for 70B Models

Perplexity AI claims 3x inference throughput for 70B models on Nvidia Blackwell GPUs via FP4 and custom scheduling. The gain exceeds Nvidia's own 2x marketing claim.

May 12, 2026 · 85% relevant

MLX-Benchmark Suite Launches as First Comprehensive LLM Eval for Apple Silicon

The MLX-Benchmark Suite has been released as the first comprehensive evaluation framework for Large Language Models running on Apple's MLX framework. It provides standardized metrics for models optimized for Apple Silicon hardware.

Apr 18, 2026 · 85% relevant

NVIDIA Spotlights Physical AI Tools for Robotics Week 2026

NVIDIA is highlighting its platforms for robot simulation, synthetic data, and AI-powered learning during National Robotics Week 2026, aiming to accelerate the transition from virtual training to physical deployment.

Apr 4, 2026 · 100% relevant

Beyond the Hype: The New Open Benchmark Putting Every AI Code Review Tool to the Test

A new open benchmarking platform allows developers to test their custom AI code review bots against eight leading commercial tools using real-world data. This transparent approach moves beyond marketing claims to provide objective performance comparisons.

Feb 24, 2026 · 85% relevant
