
MLPerf 6.0: NVIDIA Sweeps New Benchmarks, AMD MI355X Within 30% on Select Tests


MLPerf 6.0 results show NVIDIA winning every new benchmark, with its GB300 NVL72 system achieving nearly 3x more throughput than six months ago. AMD's MI355X showed progress, coming within 10-30% on select single-node tests but skipping most new benchmarks.

Gala Smith & AI Research Desk · 7h ago · 5 min read · AI-Generated

MLPerf 6.0 inference benchmark results are in, and the data delivers a clear verdict on the current state of AI hardware competition: NVIDIA continues to hold a commanding lead, particularly in full-stack system performance, while AMD has made measurable—but narrow—gains on specific, established tests.

The Benchmark Results: A Clean Sweep for NVIDIA

The latest round of industry-standard AI performance testing tells a straightforward story. According to analysis of the published data:

  • NVIDIA won every newly added benchmark it entered. Competitors, including AMD, did not submit results for most of these new tests.

  • On the two new benchmarks where both companies competed, NVIDIA outperformed AMD by approximately 30% and 50%.

  • NVIDIA's full-stack systems showed dramatic software-driven gains. The GB300 NVL72 platform delivered nearly 3x more throughput than it did six months ago on the same hardware, highlighting the impact of continuous software optimization. This system, using 288 B300 GPUs, achieved a rate of 2.5 million tokens per second when running the DeepSeek-R1 model.
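The headline numbers above can be sanity-checked with some back-of-the-envelope arithmetic. The aggregate throughput, GPU count, and ~3x speedup come from the article; the per-GPU breakdown and the implied prior-round figure are our own illustrative estimates, not MLPerf-reported metrics.

```python
# Illustrative arithmetic using the figures reported in the article.
# Per-GPU and "six months ago" values are derived estimates, not
# numbers published by MLCommons.

TOTAL_TOKENS_PER_SEC = 2_500_000   # GB300 NVL72 aggregate on DeepSeek-R1
NUM_GPUS = 288                     # B300 GPUs in the submitted system
SPEEDUP = 3.0                      # reported gain vs. six months ago

per_gpu = TOTAL_TOKENS_PER_SEC / NUM_GPUS        # implied per-GPU rate
prior_total = TOTAL_TOKENS_PER_SEC / SPEEDUP     # implied earlier result

print(f"~{per_gpu:,.0f} tokens/s per GPU")            # ~8,681 tokens/s per GPU
print(f"~{prior_total:,.0f} tokens/s six months ago") # ~833,333 tokens/s
```

The per-GPU figure is notable mostly as a reminder that these are system-scale submissions: the 2.5M tokens/s result reflects networking and orchestration across hundreds of GPUs, not a single-chip number.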

AMD's Measured Progress with the MI355X

AMD's latest Instinct MI355X accelerator demonstrated real engineering progress. On select, single-node benchmarks for established models (like Llama and GPT), the MI355X performed within 10-30% of comparable NVIDIA hardware.

"Getting within 10-30% on select single-node tests is no joke. And they deserve credit for that," noted the source analysis. This represents a significant narrowing of the performance gap for these specific workloads compared to previous generations.

However, this progress comes with important context:

  1. AMD did not submit results for the majority of the newly introduced, more demanding benchmarks in MLPerf 6.0.
  2. The company did not compete on what analysts consider "the hardest models" included in this round.
  3. The closest performance figures are on single-node tests; the performance gap widens significantly in large-scale, multi-node system benchmarks where networking and software orchestration become critical.
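To make the "within 10-30%" framing concrete, here is a minimal sketch of the relative-gap arithmetic. Only the percentage range comes from the article; the 1,000 tokens/s baseline is a hypothetical number chosen for readability.

```python
# Hedged sketch: what "within 10-30% of NVIDIA" means in throughput terms.
# The 1,000 tokens/s baseline is hypothetical; only the 10-30% range is
# from the reported results.

def relative_gap(nvidia_tps: float, amd_tps: float) -> float:
    """AMD's shortfall as a fraction of NVIDIA's throughput."""
    return (nvidia_tps - amd_tps) / nvidia_tps

# If a comparable NVIDIA node served a hypothetical 1,000 tokens/s,
# "within 10-30%" puts the MI355X between 700 and 900 tokens/s.
assert abs(relative_gap(1000, 900) - 0.10) < 1e-12
assert abs(relative_gap(1000, 700) - 0.30) < 1e-12
```

Note that this framing only applies to the single-node tests; as the list above says, the gap widens once multi-node networking and orchestration enter the picture.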

The Real Battle: Full-Stack AI Infrastructure

The analysis suggests the competition is evolving beyond simple GPU-to-GPU comparisons. The key differentiator is now full-stack AI infrastructure, which encompasses:

  • Networking: High-bandwidth, low-latency interconnects like NVIDIA's Quantum-3 InfiniBand.
  • Software Optimization: Mature, end-to-end software stacks (CUDA, libraries, compilers) that extract maximum performance from the hardware.
  • Disaggregated Serving: The ability to efficiently split massive models across hundreds of GPUs and serve tokens at unprecedented scale.

"The real story isn't GPU vs GPU anymore," the source concludes. "It's full-stack AI infrastructure... And that's where NVIDIA keeps pulling ahead." NVIDIA's results on the GB300 NVL72—a 3x performance boost on unchanged hardware—serve as a prime example of this software and system-level advantage.

gentic.news Analysis

These MLPerf 6.0 results reinforce a competitive dynamic we've tracked for multiple cycles. NVIDIA's strategy of competing on total solution performance—from silicon to software to networking—continues to pay dividends in benchmark leadership. This follows NVIDIA's established pattern of using MLPerf not just to showcase silicon but to demonstrate year-over-year software gains on the same platform, as we saw with the H100's performance improvements throughout 2024 and 2025.

AMD's progress with the MI355X is technically credible and necessary for building a viable alternative ecosystem. However, as our coverage of the MI350X launch last year noted, closing the raw compute gap is only the first step. The larger challenge remains cultivating a software and system ecosystem that can match NVIDIA's decade-plus head start. This result aligns with our previous reporting that while AMD is gaining on single-accelerator performance for inference, the system-scale gap for training and large-scale inference remains substantial.

The data underscores a critical trend for AI infrastructure buyers: vendor lock-in is increasingly a software and networking issue, not just a hardware one. NVIDIA's ability to triple throughput on existing GB300 hardware via software updates creates a powerful economic moat. For AMD to become a true data-center-scale alternative, it must demonstrate similar year-over-year software gains on its platforms and convince developers to port and optimize for its full stack. The MLPerf 7.0 results, expected later this year, will be a crucial test of whether AMD's software velocity is accelerating.

Frequently Asked Questions

What is MLPerf?

MLPerf is a consortium that creates and maintains a suite of standardized benchmarks for measuring the performance of machine learning hardware, software, and services. Its inference benchmarks, used in this round, test how quickly and efficiently systems can run trained AI models to generate predictions or text.

Did AMD beat NVIDIA in MLPerf 6.0?

No. According to the published results and analysis, NVIDIA won every new benchmark it entered. On the two new tests where both companies submitted results, NVIDIA outperformed AMD by roughly 30% and 50%. AMD showed improved performance on a subset of older, single-node benchmarks.

What was NVIDIA's most impressive result in MLPerf 6.0?

The most significant result was NVIDIA's demonstration of a ~3x increase in throughput on its GB300 NVL72 system compared to its results from six months prior, using the same hardware. This highlights the performance gains achievable through software and driver optimization alone on a mature platform.

How close is AMD's MI355X to NVIDIA's performance?

On a narrow set of single-node benchmarks for established models (like Llama), AMD's MI355X came within 10-30% of comparable NVIDIA hardware. This is a meaningful improvement. However, on larger-scale system benchmarks and newer, more complex models, NVIDIA maintains a much larger lead, and AMD did not submit competitive results for many of these tests.


AI Analysis

The MLPerf 6.0 results validate a strategic divergence in the AI hardware race. NVIDIA is competing—and winning—on **system-scale efficiency and year-over-year software gains**. The 3x throughput improvement on the GB300 platform is a powerful message to hyperscalers: investing in NVIDIA's full stack yields continuous performance dividends, extending the ROI of existing hardware. This software moat is arguably more defensible than a transient silicon lead.

For AMD, the 10-30% gap on select tests is a necessary but insufficient milestone. The real challenge, as our previous reporting on the **MI300A's adoption in El Capitan** highlighted, is moving from competitive point solutions to a full-stack ecosystem that developers trust for deploying and scaling production AI workloads. AMD's decision to skip the newest, hardest benchmarks is a pragmatic admission that its software stack isn't yet ready for those battles. The focus now should be on whether AMD can demonstrate similar software-driven performance leaps on its own platforms in future MLPerf rounds, proving its stack is not just catching up but is also rapidly improvable.

Looking ahead, the competitive pressure may intensify from another angle: **custom silicon**. As we covered in our analysis of **Amazon's Trainium2 and Inferentia3**, hyperscalers are increasingly willing to build for their own specific workloads if commercial offerings don't provide a clear total-cost-of-ownership advantage. NVIDIA's software-driven performance gains are a direct counter to this trend, while AMD must prove its stack can deliver comparable long-term value.
