gentic.news — AI News Intelligence Platform
NVIDIA CEO Jensen Huang onstage at Dell Technologies World, gesturing toward a large screen displaying the Vera…
Big Tech · Score: 75

NVIDIA Vera Rubin NVL72 Cuts Agentic AI Cost 10x vs Blackwell

NVIDIA Vera Rubin NVL72 cuts agentic AI inference cost 10x vs Blackwell, per Huang at Dell event. 5,000 enterprises already on Dell factories.

2d ago · 3 min read · AI-Generated
Source: blogs.nvidia.com via nvidia_dc_blog · Single Source
How does NVIDIA Vera Rubin NVL72 reduce cost for agentic AI inference?

NVIDIA's Vera Rubin NVL72, announced at Dell Technologies World, delivers up to 10x lower cost-per-token than Blackwell for agentic AI inference, per CEO Jensen Huang. Dell also introduced PowerEdge XE9812 servers, and agent sandboxes are said to run 50% faster on NVIDIA Vera than on traditional CPUs.

TL;DR

NVIDIA Vera Rubin NVL72 delivers 10x lower cost-per-token · 5,000 enterprises run AI on Dell factories · Huang calls demand 'utterly parabolic'

NVIDIA CEO Jensen Huang declared AI demand 'utterly parabolic' at Dell Technologies World. The Vera Rubin NVL72 platform cuts agentic inference cost-per-token 10x versus Blackwell.

Key facts

  • Vera Rubin NVL72 delivers 10x lower cost-per-token than Blackwell
  • 5,000 enterprises run AI on Dell factories with NVIDIA
  • Token consumption projected to grow 3,400% by 2030
  • AI infrastructure spend could reach $3-4 trillion by 2030
  • Vera CPU offers 1.2 TB/s memory bandwidth

Dell Technologies World opened Monday with a joint keynote from Michael Dell and Jensen Huang, centering on the Dell AI Factory with NVIDIA as the platform for enterprise agentic AI. The headline hardware is the Dell PowerEdge XE9812, built on NVIDIA Vera Rubin NVL72, which delivers up to 10x lower cost-per-token than NVIDIA Blackwell for massive-scale agentic AI inferencing, according to NVIDIA's blog post.

Agent sandboxes run 50% faster on NVIDIA Vera than on traditional CPUs, while enterprise data queries are up to 3x faster with the Vera CPU. Huang framed the shift: 'We've now arrived at the era of useful AI, which is the reason why demand is going parabolic, utterly parabolic.'

The Vera Rubin Lineup and Enterprise Scale

The PowerEdge XE9812 is joined by three new servers — XE9880L, XE9885L, and XE9882L — the first Dell systems built on NVIDIA HGX Rubin NVL8, supporting up to 144 GPUs per rack with 100% direct liquid-cooled compute nodes and up to 10x the performance of HGX B200. Networking gets the Dell PowerSwitch portfolio with NVIDIA Quantum-X800 InfiniBand and Spectrum-6 Ethernet.

Dell also introduced PowerRack, a fully integrated compute, networking, and storage system engineered as one, aiming to eliminate integration overhead for enterprises. On the CPU side, PowerEdge M9822 and R9822 servers bring NVIDIA Vera CPUs with 1.2 TB/s memory bandwidth, purpose-built for agentic AI workloads.

Michael Dell sized the stakes: worldwide AI infrastructure spending could reach $3-4 trillion by 2030, with token consumption projected to grow 3,400% in the same window. Some 5,000 enterprises, including Lilly, Samsung, and Honeywell, are already running AI workloads on Dell AI Factories with NVIDIA.
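
As a rough aid for reading the growth figure, the sketch below converts a percentage increase into a total multiplier and an implied compound annual rate. The function names and the five-year window are illustrative assumptions; the source does not state the baseline year for the 3,400% projection.

```python
def growth_multiplier(percent_growth: float) -> float:
    """Convert a percentage increase into a total multiplier (100% -> 2x)."""
    return 1.0 + percent_growth / 100.0


def implied_annual_rate(multiplier: float, years: int) -> float:
    """Compound annual growth rate that yields `multiplier` after `years` years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return multiplier ** (1.0 / years) - 1.0


# Read literally, 3,400% growth is a 35x increase over the baseline.
token_multiplier = growth_multiplier(3400)

# ASSUMPTION: a five-year window; the article does not give the baseline year.
annual_rate = implied_annual_rate(token_multiplier, years=5)
print(f"{token_multiplier:.0f}x total, roughly {annual_rate:.0%} per year")
```

If the projection is meant loosely as "34 times today's volume" rather than a 3,400% increase, the multiplier would be 34x instead of 35x; the difference does not change the overall picture.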

The Unique Take: Cost Curve Inversion for Agentic Inference

The critical shift here isn't just raw performance; it's the cost-per-token inversion. Prior to Vera Rubin, agentic AI inference was economically prohibitive at scale because each agent loop (plan, tool call, context fetch, response) multiplies token consumption by 5-10x over single-shot generation. A 10x cost reduction roughly cancels that multiplier, which makes the agentic workflow viable where it wasn't before. This is consistent with the 3,400% token growth projection: when cost per token drops 10x, workloads that were previously uneconomical come into reach, so token demand can grow faster than the hardware improvement alone would suggest.
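
The per-task arithmetic behind this argument can be sketched as follows. This is a minimal illustration using only the article's figures (a 5-10x token multiplier per agent loop and an up-to-10x cost-per-token reduction); the function name and the normalization to one single-shot task on prior hardware are assumptions made for the example, not NVIDIA's or Dell's methodology.

```python
def relative_task_cost(token_multiplier: float, cost_reduction: float) -> float:
    """Cost of one agentic task on new hardware, relative to one single-shot
    task on prior hardware (1.0 means break-even).

    token_multiplier: how many times more tokens the agent loop consumes.
    cost_reduction:   factor by which cost-per-token falls (10 = 10x cheaper).
    """
    if cost_reduction <= 0:
        raise ValueError("cost_reduction must be positive")
    return token_multiplier / cost_reduction


# Figures quoted in the article; the 10x reduction is an "up to" claim.
for multiplier in (5, 10):
    ratio = relative_task_cost(multiplier, cost_reduction=10)
    print(f"{multiplier}x tokens, 10x cheaper tokens -> {ratio:.1f}x single-shot cost")
```

Under these assumptions an agentic task costs between half as much and the same as a single-shot task did on Blackwell, so the claimed reduction offsets the extra tokens rather than making agents an order of magnitude cheaper per task.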

Huang's 'parabolic' language echoes his earlier dispute with Anthropic CEO Dario Amodei over $1 trillion AI revenue forecasts — Huang called that figure 'too conservative' at GTC 2026. The Vera Rubin NVL72 is the hardware bet that enterprise agentic AI will consume tokens at a rate that justifies $3-4 trillion in infrastructure spend by 2030.

What to watch

Watch for Vera Rubin NVL72 pricing and availability timelines from Dell and NVIDIA in Q3 2026. Also track whether enterprise adoption of agentic AI on Dell factories crosses 10,000 customers, and whether rival platforms like AMD's MI400 or Cerebras WSE-3 respond with comparable cost-per-token claims.

AI-assisted reporting. Generated by gentic.news from 1 verified source, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

The Vera Rubin NVL72 announcement represents a structural shift in the economics of agentic AI. Prior to this, the industry faced a cost-per-token wall: agentic loops (plan, tool call, context fetch, response) consume 5-10x more tokens than single-shot generation, making them uneconomical for most enterprise use cases. A 10x cost reduction inverts that equation: an agentic workflow now costs no more per task than single-shot generation did on prior hardware, and as little as half as much at the low end of that range. This is the core insight that justifies Huang's 'parabolic' demand claim.

Dell's positioning as the systems integrator is also notable. By offering PowerRack as a fully integrated system with thermal, power, and software optimization, Dell is targeting the enterprise pain point that has slowed AI adoption: integration overhead. The 5,000 customer count (Lilly, Samsung, Honeywell) suggests momentum, but the real test will be whether that number accelerates post-Vera Rubin.

Huang's rhetoric ('utterly parabolic,' 'what took months now takes weeks') should be read as a competitive signal. At GTC 2026, he disputed Amodei's $1 trillion forecast as too conservative. The Vera Rubin NVL72 is the hardware that backs up that bullishness. The question is whether supply can match the demand he's projecting, and whether competitors like AMD, Cerebras, or Google TPU can match the cost-per-token curve.