NVIDIA CEO Jensen Huang declared AI demand 'utterly parabolic' at Dell Technologies World. The Vera Rubin NVL72 platform cuts agentic inference cost-per-token by up to 10x versus Blackwell.
Key facts
- Vera Rubin NVL72 delivers up to 10x lower cost-per-token than Blackwell
- 5,000 enterprises run AI on Dell AI Factories with NVIDIA
- Token consumption projected to grow 3,400% by 2030
- AI infrastructure spend could reach $3-4 trillion by 2030
- Vera CPU offers 1.2 TB/s memory bandwidth
Dell Technologies World opened Monday with a joint keynote from Michael Dell and Jensen Huang, centering on the Dell AI Factory with NVIDIA as the platform for enterprise agentic AI. The headline hardware: the Dell PowerEdge XE9812, built on NVIDIA Vera Rubin NVL72, which, according to NVIDIA's blog post, delivers up to 10x lower cost-per-token than NVIDIA Blackwell for massive-scale agentic AI inferencing.
Agent sandboxes run 50% faster on NVIDIA Vera than on traditional CPUs, while enterprise data queries are up to 3x faster with the Vera CPU. Huang framed the shift: 'We've now arrived at the era of useful AI, which is the reason why demand is going parabolic, utterly parabolic.'
The Vera Rubin Lineup and Enterprise Scale
The PowerEdge XE9812 is joined by three new servers — XE9880L, XE9885L, and XE9882L — the first Dell systems built on NVIDIA HGX Rubin NVL8, supporting up to 144 GPUs per rack with 100% direct liquid-cooled compute nodes and up to 10x the performance of HGX B200. Networking gets the Dell PowerSwitch portfolio with NVIDIA Quantum-X800 InfiniBand and Spectrum-6 Ethernet.
Dell also introduced PowerRack, a fully integrated compute, networking, and storage system engineered as one, aiming to eliminate integration overhead for enterprises. On the CPU side, PowerEdge M9822 and R9822 servers bring NVIDIA Vera CPUs with 1.2 TB/s memory bandwidth, purpose-built for agentic AI workloads.
Michael Dell sized the stakes: worldwide AI infrastructure spending could reach $3-4 trillion by 2030, with token consumption projected to grow 3,400% in the same window. Meanwhile, 5,000 enterprises, including Lilly, Samsung, and Honeywell, are already running AI workloads on Dell AI Factories with NVIDIA.
The Unique Take: Cost Curve Inversion for Agentic Inference
The critical shift here isn't just raw performance; it's the cost-per-token inversion. Prior to Vera Rubin, agentic AI inference was economically prohibitive at scale because each agent loop (plan, tool call, context fetch, response) multiplies token consumption by 5-10x over single-shot generation. If the claimed up-to-10x cost reduction holds, it roughly cancels that multiplier, making the agentic workflow viable where it wasn't before. This aligns with the 3,400% token growth projection, which works out to 35x today's consumption: when the cost per token drops 10x, far more workloads become economical, so demand can expand much faster than hardware improvement alone would suggest.
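
To make the arithmetic behind that argument concrete, here is a minimal back-of-the-envelope sketch in Python. It takes the article's figures at face value (a 5-10x token multiplier per agent loop, an "up to 10x" cost-per-token reduction, 3,400% token growth) and assumes cost scales linearly with token count; the function names are hypothetical and exist only for this illustration.

```python
def agentic_cost_ratio(token_multiplier: float, cost_reduction: float) -> float:
    """Cost of one agentic task on the cheaper platform, relative to one
    single-shot generation on the older platform.

    Assumes cost is linear in tokens (an assumption, not a vendor figure).
    """
    return token_multiplier / cost_reduction


def growth_percent_to_multiple(percent: float) -> float:
    """Convert a percentage growth figure into a multiple of the baseline."""
    return 1 + percent / 100


COST_REDUCTION = 10.0  # "up to 10x" claim, taken at face value

# An agent loop uses 5-10x the tokens of a single-shot generation.
for multiplier in (5.0, 10.0):
    ratio = agentic_cost_ratio(multiplier, COST_REDUCTION)
    print(f"{multiplier:.0f}x tokens -> {ratio:.1f}x the old single-shot cost")

# 3,400% growth means consumption ends at 35x the baseline.
token_multiple = growth_percent_to_multiple(3400)

# If every token were served at one-tenth the cost, total token spend
# would grow by the token multiple divided by the cost reduction.
spend_multiple = token_multiple / COST_REDUCTION
print(f"tokens: {token_multiple:.0f}x, implied token spend: {spend_multiple:.1f}x")
```

Under these assumptions, an agentic task on the newer platform costs between half and the same as a single-shot generation did on the older one, and 35x token growth translates into roughly 3.5x growth in token spend. Real deployments would see a smaller benefit wherever the 10x ceiling is not reached.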

Huang's 'parabolic' language echoes his earlier dispute with Anthropic CEO Dario Amodei over $1 trillion AI revenue forecasts — Huang called that figure 'too conservative' at GTC 2026. The Vera Rubin NVL72 is the hardware bet that enterprise agentic AI will consume tokens at a rate that justifies $3-4 trillion in infrastructure spend by 2030.
What to watch
Watch for Vera Rubin NVL72 pricing and availability timelines from Dell and NVIDIA in Q3 2026. Also track whether enterprise adoption of agentic AI on Dell AI Factories with NVIDIA crosses 10,000 customers, and whether rival platforms like AMD's MI400 or Cerebras WSE-3 respond with comparable cost-per-token claims.










