CoreWeave trained DeepSeek-V3 in approximately 2 minutes on MLPerf Training v6.0, using over 11,000 NVIDIA H100 GPUs across 4 data centers. The result beats the prior record of 3.5 minutes set by AWS in MLPerf v5.1, per Business Wire.
Key facts
- DeepSeek-V3 trained in ~2 minutes on MLPerf v6.0.
- Over 11,000 NVIDIA H100 GPUs used across 4 data centers.
- Beats prior record of 3.5 minutes (AWS, MLPerf v5.1).
- First cloud provider to validate Nvidia Vera Rubin NVL72.
- CoreWeave building 1.2 GW data center campus in Texas.
CoreWeave trained DeepSeek-V3 in approximately 2 minutes on MLPerf Training v6.0, using over 11,000 NVIDIA H100 GPUs across 4 data centers. The result beats the prior record of 3.5 minutes set by AWS in MLPerf v5.1, per Business Wire. CoreWeave is the first cloud provider to validate and deploy Nvidia Vera Rubin NVL72 at rack scale, a milestone announced earlier this month.
The company did not disclose the exact GPU count or total training cost for the MLPerf run. MLPerf Training v6.0 includes new benchmarks for large language models and multimodal systems. CoreWeave's submission used a proprietary orchestration layer that dynamically allocated compute across its network.
Why This Matters More Than the Press Release Suggests
CoreWeave's 2-minute result is not just a speed record — it signals a structural shift in AI training economics. At scale, the marginal cost per training run drops dramatically when you can saturate thousands of GPUs across sites. This puts pressure on hyperscalers like AWS, Google Cloud, and Azure to match both latency and cost efficiency. CoreWeave's edge comes from its purpose-built infrastructure for GPU workloads, not general-purpose cloud services. The company also recently committed to building a 1.2 GW data center campus in Texas, per public filings.
The DeepSeek-V3 model itself is notable: it achieved frontier performance at roughly 1/10th the training cost of GPT-4, according to its original paper. CoreWeave's result validates that the model can be trained on a distributed, multi-datacenter setup without significant overhead — a proof point for open-weight models in production.
What to Watch
Watch for CoreWeave's Q3 2026 revenue disclosure and whether it discloses the actual GPU-hour cost of the MLPerf run. The company's IPO filing, expected later this year, will reveal unit economics. Also track whether AWS or Google Cloud respond with sub-2-minute submissions in MLPerf v6.1.
Source: news.google.com






