Curriculum
From zero to designing 100MW AI campuses
Twelve lessons. About 3 hours of focused reading. By the end, you can read any AI infrastructure paper, follow any data center earnings call, and hold informed opinions on why xAI's Colossus, Stargate, and Project Rainier differ from one another.
Start here. The core concepts every data center engineer knows.
Data Center Fundamentals
What a data center actually is, the four-level Tier classification (Uptime Institute, Tier I through IV), the components inside a single rack, and why AI changed everything.
Power Infrastructure
From the utility substation to the chip: high-voltage interconnects, UPS systems, generators, PDUs, and the 100MW+ scale that AI demands.
Cooling Systems
Air, liquid, immersion. CRAC vs CDU, direct liquid cooling for GPUs, PUE/WUE math, and why the highest-density AI racks are now liquid-cooled.
Compute, networking, storage, software — the technical heart.
Compute & Accelerators
NVIDIA H100/H200/B200/GB200 NVL72, AMD MI300X, Google TPU v5p, AWS Trainium2, Cerebras WSE-3. Real specs, real interconnects.
Network Fabric
InfiniBand vs Ethernet (Ultra Ethernet Consortium), NVLink/NVSwitch, optical transceivers, Clos topology, and rail-optimized layouts.
Storage Architecture
Parallel filesystems (Lustre, WekaFS, VAST), NVMe-oF, checkpoint strategies, and how 100k-GPU clusters move 100GB/s.
Software & Orchestration
Slurm vs Kubernetes for AI, Run:ai, NVIDIA Base Command, gang scheduling, fault tolerance, and the orchestration stack on top of bare metal.
How facilities are designed, built, and run at scale.
How to Build One
Site selection, permitting, 18-to-36-month construction timelines, vendor selection, and the realistic capex of a 100MW AI campus.
Operating a Data Center
DCIM, BMS, capacity planning, incident response, the day-to-day of running a critical facility at 99.99% uptime.
Sustainability
PUE/WUE/CUE, hyperscaler net-zero pledges, geothermal partnerships, heat reuse for district heating, water positivity.
Economics, careers — the why and the path.
Economics & Financing
$/MW capex, opex breakdown, neocloud business models (CoreWeave, Lambda, Crusoe), depreciation cycles, and why capex is exploding.
Careers & How to Become an Expert
Roles, salaries, certifications (EPI CDCP/CDCS/CDCE, Uptime Institute ATD/ATS, BICSI RCDD), training programs, and the career ladder.
Ready?
Start with Lesson 01: Fundamentals. Each lesson builds on the last, but you can also jump to any topic you need.