
SemiAnalysis Calls Jensen Computex Keynote 'F Tier' Over No AI DC News
SemiAnalysis rated Jensen Huang's Computex keynote 'F Tier' for lacking AI datacenter news and revealed a delayed NVIDIA ARM chip with broken video output.
Stargate is racing to 7 GW across five new sites. Meta's 5 GW Hyperion in Louisiana spans 11 buildings. xAI's Colossus 2 is targeting 1 million GPUs. Microsoft Fairwater is projected past $100B. The bottleneck is no longer money or chips; it is power, grid interconnects, and gas-turbine wait times. This hub is everything we know about the build-out: a 12-lesson technical curriculum, a 205-term glossary, hands-on calculators, named-campus case studies, and live news from the people actually building it.
Auto-generated from the Lab’s knowledge graph. Findings are produced every 12 hours by our agentic research pipeline.
Stories that moved the graph
From "what's a rack" to "design a 100MW AI campus" — learn at your own pace.
What a data center actually is, the four-level Tier classification (Uptime Institute), the components inside a single rack, and why AI changed everything.
From the utility substation to the chip: high-voltage interconnects, UPS systems, generators, PDUs, and the 100MW+ scale that AI demands.
Air, liquid, immersion. CRAC vs CDU, direct-liquid cooling for GPUs, PUE/WUE math, and why every modern AI rack is liquid-cooled.
NVIDIA H100/H200/B200/GB200 NVL72, AMD MI300X, Google TPU v5p, AWS Trainium2, Cerebras WSE-3. Real specs, real interconnects.
InfiniBand vs Ethernet (Ultra Ethernet Consortium), NVLink/NVSwitch, optical transceivers, CLOS topology, and rail-optimized layouts.
Parallel filesystems (Lustre, WekaFS, VAST), NVMe-oF, checkpoint strategies, and how 100k-GPU clusters move 100GB/s.
SLURM vs Kubernetes for AI, Run.AI, NVIDIA Base Command, gang scheduling, fault tolerance, and the orchestration stack on top of bare metal.
Site selection, permitting, 18-36 month construction timelines, vendor selection, and the realistic capex of a 100MW AI campus.
DCIM, BMS, capacity planning, incident response, the day-to-day of running a critical facility at 99.99% uptime.
PUE/WUE/CUE, hyperscaler net-zero pledges, geothermal partnerships, heat reuse for district heating, water positivity.
$/MW capex, opex breakdown, neocloud business models (CoreWeave, Lambda, Crusoe), depreciation cycles, and why CapEx is exploding.
Roles, salaries, certifications (EPI CDCP/CDCS/CDCE, BICSI RCDD), training programs, and the career ladder.
Refreshed every 12h by the Living Agent — analyzing fresh DC news for operator velocity, tech trends, and weekly shifts. Last updated 7h ago.
📝 What changed this week
CoreWeave receives first Nvidia Vera Rubin NVL72 rack: Dell ships a production unit ahead of schedule, signaling Nvidia's accelerated next-gen platform transition. Implication: Vera Rubin NVL72 may compress Blackwell's lifespan, altering hyperscale procurement cycles and secondary-market GPU availability.
xAI abandons JAX, builds custom C training framework: the decision was driven by sub-10% Model FLOPs Utilization (MFU) on existing stacks. Implication: A major hyperscaler is now vertically integrating training software, potentially fragmenting the ML compiler ecosystem and pressuring JAX/PyTorch maintainers to close performance gaps.
Blackwell confidential computing slows NVLink, 61% regression: NVLink integrity checks add large overhead when a trusted execution environment (TEE) is enabled. Implication: Multi-tenant GPU clusters for sensitive workloads (finance, healthcare) face severe throughput penalties, possibly delaying Blackwell adoption in regulated verticals.
Huawei achieves 1.5µm bond pitch on Kirin 2026: this beats TSMC's 1.8µm in hybrid bonding. Implication: The Chinese AI chip supply chain may bypass leading-edge lithography constraints via advanced packaging, threatening Nvidia's domestic market share and prompting export control revisions.
ERCOT data center requests exceed grid capacity by 5x: the Texas grid faces an unprecedented interconnection backlog. Implication: Hyperscalers may pivot to on-site generation (gas turbines, small modular reactors) or shift buildout to less constrained regions, reshaping DC site-selection economics.
AI data centers hit a water wall at 2M gallons per day per campus: cooling demand strains municipal water supplies. Implication: Operators face regulatory pushback and rising water costs; adoption of direct-to-chip liquid cooling and closed-loop systems becomes a competitive necessity, not an option.
🚀 Trending hardware/tech
Curated by an autonomous agent reading live RSS + entity mentions. Rankings reflect actual coverage frequency, not editorial choice.
Filtered for substance: hardware specs, topology, MW, MFU. Press fluff demoted.

Dell delivered the first Nvidia Vera Rubin NVL72 rack to CoreWeave. Each rack packs 72 Rubin GPUs, 36 Vera CPUs, 3.6 exaFLOPS FP4 inference, 75 TB memory, and 260 TB/s NVLink bandwidth.

NVIDIA Blackwell confidential computing disables NVLink multicast, causing 61% regression on SGLang Qwen3.5 397B. Hopper had unencrypted NVLink, compounding the issue.

ERCOT datacenter requests far exceed grid underwriting capacity, per @SemiAnalysis_, revealing grid approval as a binding constraint on AI infrastructure buildout.

Water capacity is now a siting gatekeeper for AI data centers. A Virginia campus requested 2M gallons per day; Georgia told a 6 MGD project 'we just don't have the water.'

Google and Blackstone launched a TPU venture, financing AI infrastructure outside the hyperscale cloud model. Enterprise buyers get a standalone alternative to Nvidia-dominated GPU clusters.
Hyperscalers, neoclouds, and colocation providers powering frontier AI.
| Operator | Type | Power (MW) | Notable Site | Specialty |
|---|---|---|---|---|
| Microsoft Azure | Hyperscaler | 5,000 | Fairwater, WI · projected $100B+ build | OpenAI compute partner · GB200 at scale |
| Google Cloud | Hyperscaler | 4,500 | Council Bluffs, IA · The Dalles, OR · Kronstorf, AT | TPU pods + Gemini training · 5GW Anthropic deal |
| Amazon AWS | Hyperscaler | 4,000 | Project Rainier, New Carlisle IN · 2.2 GW | Trainium2/3 powering Anthropic |
| Meta | Hyperscaler | 5,000 | Hyperion, Richland Parish LA · 5 GW · 11 buildings | Llama + MTIA · Prometheus, Ohio coming May 2026 |
| OpenAI / Stargate | Hyperscaler | 7,000 | Abilene, TX · 1.2 GW by mid-2026 · 5 new sites announced | Oracle + SoftBank · 10 GW by 2027 · ~$400B committed |
| xAI / SpaceX | Hyperscaler | 1,600 | Colossus 2, Memphis TN · 550K-1M GPUs · Colossus 1 leased to Anthropic | Grok training · Vera Rubin roadmap |
| Anthropic | Hyperscaler | 5,300 | Project Rainier (AWS) · Colossus 1 (SpaceX) · Fluidstack | Multi-vendor compute · 5 stacked deals |
| CoreWeave | Neocloud | 1,300 | Plano, TX · multiple sites | GPU-as-a-service, NVIDIA partner |
| Equinix | Colocation | 1,500 | Global · 260+ data centers | Interconnection + colo |
| Digital Realty | Colocation | 2,700 | Global · 300+ data centers | Wholesale + hyperscale colo |
| Lambda | Neocloud | 200 | Allen, TX · expanding | GPU cloud, on-demand H100/H200 |
| Crusoe Energy | Neocloud | 1,200 | Abilene, TX (Stargate Phase 1) | Stranded-gas powered AI infra |
MW figures are publicly disclosed AI-dedicated capacity, current or planned. Updated continuously from press releases, permit filings, and infrastructure analysis.
Top 8 DC operators by mention growth this week. Sourced from the Compute Lab brief.
The Data Center Designer simulator. Pick scale (1 MW → 2.5 GW), GPU (H100 / B200 / GB200 NVL72 / MI300X / TPU / Trainium), cooling, location, tier — see the real capex, opex, PUE, training throughput, build timeline, and CO₂ math your design produces. Six presets matched to real projects (Stargate, Hyperion, Project Rainier).
Deep dives on the real gigawatt-scale projects — verified specs, cited sources, strategic analysis.
Every tool is interactive, browser-only, no signup. Built to teach by doing.
🛠️ Pick scale + GPU + cooling + location. See live capex/opex/PUE.
🔧 20 cities. Real climate data. Realistic PUE/WUE/CUE.
⚡ Train Llama-405B in 90 days. Can you fit it?
🚨 3 AM page. Coolant leak. What do you do?
🧮 Power vs cooling vs space: find the bottleneck.
🆚 H100 vs B200 vs MI300X vs Trainium2: real specs.
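The "Train Llama-405B in 90 days" question reduces to a back-of-envelope calculation. The sketch below assumes the common approximation of about 6 FLOPs per parameter per token for dense-transformer training. The token count (15.6T), the H100-class dense BF16 peak (~989 TFLOPS), and the 40% MFU are illustrative assumptions rather than values taken from the calculator, and `gpus_needed` is a hypothetical helper.

```python
import math

SECONDS_PER_DAY = 86_400


def training_flops(params: float, tokens: float) -> float:
    """Approximate dense-transformer training compute (~6 FLOPs per parameter per token)."""
    return 6.0 * params * tokens


def gpus_needed(params: float, tokens: float, days: float,
                peak_flops_per_gpu: float, mfu: float) -> int:
    """GPUs required to finish within `days` at a sustained MFU (hypothetical helper)."""
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    sustained_per_gpu = peak_flops_per_gpu * mfu   # FLOP/s each GPU actually delivers
    budget_seconds = days * SECONDS_PER_DAY        # wall-clock time available
    total = training_flops(params, tokens)
    return math.ceil(total / (sustained_per_gpu * budget_seconds))


if __name__ == "__main__":
    # Assumed inputs: 405B parameters, 15.6T tokens, ~989 TFLOPS peak, 40% MFU.
    n = gpus_needed(params=405e9, tokens=15.6e12, days=90,
                    peak_flops_per_gpu=989e12, mfu=0.40)
    print(f"~{n:,} H100-class GPUs for a 90-day run at 40% MFU")
```

Dropping the assumed MFU from 40% to 10% roughly quadruples the GPU count, which is why utilization matters as much as raw capacity.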
We have tailored reading paths for 4 audiences. Pick yours.
From PUE to NVLink — the vocabulary you need to read any data center paper.
PUE (Power Usage Effectiveness): Total facility power ÷ IT equipment power. 1.0 = perfect, 1.10 = hyperscale, 1.5+ = enterprise.
Direct-to-chip liquid cooling: Coolant flows through cold plates touching the chip. Required for Blackwell-class racks above 70kW.
NVLink: 5th gen on Blackwell: 1.8 TB/s bidirectional per GPU. Connects GPUs into a single-memory domain.
InfiniBand: NDR = 400 Gbps, XDR = 800 Gbps per port. Dominates scale-out networks for AI training.
HBM (High Bandwidth Memory): Stacked DRAM next to the GPU die. H100 = HBM3 (3.35 TB/s), B200 = HBM3e (8 TB/s).
MFU (Model FLOPs Utilization): % of theoretical peak FLOPs your training run actually achieves. 50%+ is great.
CDU (Coolant Distribution Unit): Heat exchanger between the rack's liquid loop and the facility loop. Sits at row or rack level.
Tier IV: Fault-tolerant: every component is redundant + concurrently maintainable. 99.995% uptime target.
GB200 NVL72: 72 B200 GPUs + 36 Grace CPUs in one liquid-cooled rack. ~120 kW. 1.4 EFLOPS FP4.
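Several of the definitions above are simple ratios, and working them through once makes the numbers easier to read. The sketch below is illustrative only: the function names are hypothetical, the 100 MW facility and the 989 TFLOPS peak are assumed inputs, and the rack estimate ignores networking, storage, and other non-GPU IT load.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def pue(total_facility_mw: float, it_mw: float) -> float:
    """Power usage effectiveness: total facility power divided by IT equipment power."""
    return total_facility_mw / it_mw


def mfu(achieved_flops: float, peak_flops: float) -> float:
    """Model FLOPs utilization: achieved throughput as a fraction of theoretical peak."""
    return achieved_flops / peak_flops


def annual_downtime_minutes(uptime_fraction: float) -> float:
    """Minutes of downtime per year permitted by an uptime target such as 0.99995."""
    return (1.0 - uptime_fraction) * MINUTES_PER_YEAR


def racks_supported(facility_mw: float, pue_value: float, rack_kw: float) -> int:
    """Whole racks that fit in the IT power left after facility overhead (rough estimate)."""
    it_kw = facility_mw * 1000.0 / pue_value
    return int(it_kw // rack_kw)


if __name__ == "__main__":
    print(f"PUE: {pue(110, 100):.2f}")                          # 110 MW total, 100 MW IT
    print(f"MFU: {mfu(98.9e12, 989e12):.0%}")                   # assumed achieved vs. peak
    print(f"Downtime at 99.995%: {annual_downtime_minutes(0.99995):.1f} min/yr")
    racks = racks_supported(facility_mw=100, pue_value=1.10, rack_kw=120)
    print(f"Racks: {racks}, GPUs: {racks * 72}")                # 72 GPUs per NVL72 rack
```

Under these assumptions, a 99.995% target allows about 26 minutes of downtime per year, and a 100 MW facility at PUE 1.10 has room for roughly 757 racks at ~120 kW each.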
Real-world courses and certifications used by the people building the largest AI clusters on Earth.
The de-facto entry credential for data center facilities. EXIN-accredited, valid 3 years. 40-question exam (27/40 to pass). Delivered in 50+ countries via partners.
The credential for designing Tier-rated facilities. PE licence (or equivalent) required. What hyperscalers and MEP firms actually look for.
CUDA, multi-node training, NCCL, Base Command. Paid courses include live GPU labs. Maps to NCA-AIIO ($125) and NCP-AII ($400) certifications.
BICSI's data-center–specific design credential. 100 questions, drag-and-drop + multiple choice. Requires RCDD or 3 years DC experience. Pearson VUE delivered.
200+ vendor-neutral modules on power, cooling, racks, design, sustainability. CPD-accredited. Optional DCCA (Data Center Certified Associate) exam.
Official learning platform for Open Compute Project specs. Modules include 'Open Systems for AI' (6-part series), Open Rack ORv3, and OCP-Recognized Equipment.
Compute Lab is the steel. Click through to see what runs on it.
Weekly: only the technical news that matters. New papers, MW build-outs, topology decisions, hardware drops.