Lesson 01/12 · Beginner · 12 min read · 4 diagrams

Data Center Fundamentals

What a data center actually is — physical building, power feed, cooling plant, networking gear, computers — and the four fundamental shifts that turned them from boring corporate basements into the most strategically important infrastructure of the AI era.

1 · What a data center actually is

A data center is a building purpose-built to keep computers running 24/7. Strip away the mystique and there are four physical things inside, in roughly equal importance:

  1. Power. A connection to the electric grid (often a dedicated substation), backed up by uninterruptible power supplies (UPS) and diesel or gas generators that can run the facility for days if the grid fails.
  2. Cooling. Servers turn ~100% of the electricity they consume into heat. A chiller plant, cooling towers, and either air handlers (CRAC/CRAH) or liquid loops carry that heat outside.
  3. Networking. Fiber from carriers enters at meet-me rooms, fans out through core/spine/leaf switches, and eventually plugs into each server.
  4. The IT itself. Racks of servers, storage, and switches. This is what the power and cooling exist to serve.
[Diagram: a 42U AI rack — ToR InfiniBand switch (400 Gb), spine uplink switch, four 8× H100 GPU servers (~6 kW each), a 400 V power distribution unit (PDU), and a coolant manifold (CDU return); total ~50 kW vs 5–10 kW for a traditional rack.]
Anatomy of a single AI rack. A 42U cabinet today routinely draws 50–130 kW — about 10× more than the rack of email servers it replaced.
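To internalize point 2 above — servers convert essentially every watt they draw into heat — it helps to put a single rack's thermal load into cooling-plant units. A minimal back-of-envelope sketch; the 50 kW rack figure is illustrative, and the conversion constants are standard engineering values:

```python
# Back-of-envelope: how much heat does one AI rack reject?
# Assumption: the rack draws ~50 kW and converts essentially all of it to heat.
RACK_POWER_KW = 50.0

BTU_PER_HR_PER_KW = 3412.14        # 1 kW ≈ 3,412 BTU/hr
KW_PER_TON_OF_COOLING = 3.517      # 1 "ton" of refrigeration ≈ 3.517 kW

heat_btu_per_hr = RACK_POWER_KW * BTU_PER_HR_PER_KW
cooling_tons = RACK_POWER_KW / KW_PER_TON_OF_COOLING
daily_heat_kwh = RACK_POWER_KW * 24

print(f"Heat rejected: {heat_btu_per_hr:,.0f} BTU/hr (~{cooling_tons:.1f} tons of cooling)")
print(f"Thermal energy dumped per day: {daily_heat_kwh:,.0f} kWh")
```

One 50 kW rack needs roughly 14 tons of cooling capacity on its own — the chiller plant and cooling towers exist because hundreds of these racks sit side by side.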

2 · The Tier classification

The most widely cited reliability framework comes from the Uptime Institute, a consultancy that has graded facility designs since 1995. Tiers I through IV measure what happens when something fails.

Uptime Institute Tier Classification

  • Tier I — uptime 99.671%, downtime ≤ 28.8 h/yr, redundancy: none
  • Tier II — uptime 99.741%, downtime ≤ 22 h/yr, redundancy: partial (N+1 components)
  • Tier III — uptime 99.982%, downtime ≤ 1.6 h/yr, redundancy: concurrently maintainable
  • Tier IV — uptime 99.995%, downtime ≤ 26 min/yr, redundancy: fault tolerant (2N)
The "downtime per year" figures are ceiling targets, not guarantees. Achieving Tier IV is expensive — you need fully redundant power paths, mechanical paths, and the operational discipline to swap any component while live.

For AI workloads, Tier III is the de facto floor — anything less and you lose training runs to unplanned downtime. Most hyperscale AI facilities are designed to Tier III or above, though hyperscalers often skip formal Uptime certification because they've built their own equivalent specifications (for example, Open Compute Project standards).

Source: Uptime Institute, Tier Standard: Topology (current edition). See uptimeinstitute.com/tiers.

3 · The hierarchy: campus → U

When you read about "Stargate" or "Hyperion" or "Project Rainier", the term usually refers to a campus — the largest level of the hierarchy. Knowing the layers below helps you parse any infrastructure announcement.

Hierarchy: Campus → Building → Data Hall → Pod → Rack → U

  • Campus — 100 MW+ site; multiple buildings sharing power and water
  • Building — 20–100 MW shell with dedicated mechanical and electrical plant
  • Data hall — 1–10 MW; the "white space" where IT racks live
  • Pod — a group of 10–30 racks sharing a coolant distribution unit (CDU)
  • Rack — 42U or 48U cabinet, ~50–130 kW
  • U — 1.75 in / 44.45 mm, one server slot

A "Stargate" or "Hyperion" usually refers to the campus level — multiple buildings on one site.
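To see how rack-level numbers roll up into a campus-scale announcement, here is a minimal sketch. The per-level counts and the PUE are illustrative assumptions, not the layout of any real site:

```python
# Roll up the hierarchy: rack -> pod -> data hall -> building -> campus.
# All counts below are illustrative assumptions.
RACK_KW = 120            # one liquid-cooled AI rack
RACKS_PER_POD = 16       # racks sharing one CDU
PODS_PER_HALL = 4
HALLS_PER_BUILDING = 4
BUILDINGS_PER_CAMPUS = 8
PUE = 1.15               # facility overhead on top of the IT load

pod_kw = RACK_KW * RACKS_PER_POD
hall_mw = pod_kw * PODS_PER_HALL / 1000
building_mw = hall_mw * HALLS_PER_BUILDING
campus_it_mw = building_mw * BUILDINGS_PER_CAMPUS
campus_total_mw = campus_it_mw * PUE  # grid draw incl. cooling and losses

print(f"Pod: {pod_kw:.0f} kW | Hall: {hall_mw:.1f} MW | Building: {building_mw:.1f} MW")
print(f"Campus IT load: {campus_it_mw:.0f} MW -> ~{campus_total_mw:.0f} MW from the grid")
```

With these assumed counts a pod lands at ~1.9 MW of racks, a hall at ~7.7 MW, a building at ~31 MW, and the campus at roughly 280 MW of grid draw — consistent with the ranges in the hierarchy above.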

4 · AI changed everything

From roughly 2005 to 2020, data center design was a slow-moving discipline. Power densities crept up, cooling became more efficient, but the basic shape was stable.

Then came the GPU boom. A modern AI rack pulls 10× more power than the email-and-database racks it replaced. That single shift cascades through every other decision — cooling, electrical, networking, even site selection.

Traditional Enterprise DC vs Modern AI DC

  • Total power: 5–15 MW → 100 MW – 2 GW
  • Per rack: 5–10 kW → 70–130 kW (NVL72 ≈ 120 kW)
  • Cooling: air (CRAC + raised floor) → direct liquid cooling (DLC)
  • Network: 10/25/100 GbE Ethernet → 400/800G InfiniBand + NVLink
  • PUE (target): 1.5–2.0 → 1.10–1.20
  • Workload: VM, web, DB, file → training + inference
Side-by-side. Every parameter in the right column is a direct consequence of GPU power density and the bandwidth needs of distributed training.
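PUE (Power Usage Effectiveness) in that comparison is simply total facility power divided by IT power, so it translates directly into megawatts spent on overhead rather than compute. A quick sketch, assuming a 100 MW IT load for illustration:

```python
# PUE = total facility power / IT power.
# Overhead (cooling, UPS losses, lighting) = IT power * (PUE - 1).
IT_LOAD_MW = 100  # assumed IT load for illustration

for label, pue in [("Traditional enterprise", 1.8), ("Modern AI hyperscale", 1.15)]:
    total_mw = IT_LOAD_MW * pue
    overhead_mw = total_mw - IT_LOAD_MW
    print(f"{label}: PUE {pue} -> {total_mw:.0f} MW total, {overhead_mw:.0f} MW of overhead")
```

At the same 100 MW of compute, a 1.8 PUE facility burns 80 MW on overhead while a 1.15 PUE facility burns 15 MW — which is why hyperscalers obsess over this single ratio.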

The four shifts that matter

1 · Power density exploded

A single NVIDIA GB200 NVL72 rack draws ~120 kW — the equivalent of roughly 12–24 traditional 5–10 kW racks. This forces dedicated busways, larger PDUs, and a rethink of the whole electrical room.
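Why the electrical room has to be rethought becomes clearer if you convert that draw into current. A minimal sketch, assuming a 415 V three-phase feed and a 0.95 power factor (both assumptions, not a quoted spec):

```python
import math

# Current on a three-phase feed: I = P / (sqrt(3) * V_line * power_factor)
RACK_POWER_W = 120_000    # GB200 NVL72-class rack, ~120 kW
LINE_VOLTAGE_V = 415      # assumed line-to-line voltage
POWER_FACTOR = 0.95       # assumed

ai_rack_amps = RACK_POWER_W / (math.sqrt(3) * LINE_VOLTAGE_V * POWER_FACTOR)
legacy_rack_amps = 8_000 / (math.sqrt(3) * LINE_VOLTAGE_V * POWER_FACTOR)  # ~8 kW rack

print(f"AI rack: ~{ai_rack_amps:.0f} A | legacy rack: ~{legacy_rack_amps:.0f} A")
```

Roughly 175 A per cabinet instead of ~12 A is what pushes designs toward overhead busways and much larger PDUs.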

2 · Air cooling broke down

Above ~30–40 kW per rack, air physically can't move heat away fast enough. Direct liquid cooling (DLC) — pumping fluid through cold plates touching each chip — became mandatory for Blackwell-class hardware.
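The breakdown of air cooling is plain thermodynamics: the airflow needed scales linearly with rack power for a fixed temperature rise. A rough sketch, assuming a 12 °C inlet-to-outlet air temperature rise (an assumption; real designs vary):

```python
# Airflow required to remove a rack's heat with air alone:
# volumetric_flow = P / (rho_air * cp_air * delta_T)
AIR_DENSITY = 1.2    # kg/m^3, near sea level
AIR_CP = 1.006       # kJ/(kg*K)
DELTA_T = 12.0       # K, assumed temperature rise across the rack

def airflow_m3_per_s(rack_kw: float) -> float:
    return rack_kw / (AIR_DENSITY * AIR_CP * DELTA_T)

for rack_kw in (8, 40, 120):
    m3s = airflow_m3_per_s(rack_kw)
    cfm = m3s * 2118.88  # 1 m^3/s ≈ 2,119 CFM
    print(f"{rack_kw:>3} kW rack -> {m3s:.1f} m^3/s (~{cfm:,.0f} CFM)")
```

Pushing ~18,000 CFM through a single 120 kW cabinet is not practical, which is why cold plates and liquid loops take over at Blackwell-class densities.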

3 · The network became a bottleneck

Training a frontier model means thousands of GPUs exchanging gradients on every training step. Standard 100 GbE doesn't cut it — you need InfiniBand at 400 or 800 Gbps, or NVIDIA's NVLink fabric, with a rail-optimized topology.
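A back-of-envelope gradient-synchronization estimate shows why the fabric matters. Assume a 70B-parameter model with bf16 gradients and a ring all-reduce, which moves roughly 2× the gradient volume per GPU per step (the model size and link speeds are illustrative assumptions):

```python
# Rough time to all-reduce one set of gradients across a data-parallel group.
PARAMS = 70e9             # assumed model size
BYTES_PER_GRADIENT = 2    # bf16
grad_bytes = PARAMS * BYTES_PER_GRADIENT   # ~140 GB of gradients
traffic_bytes = 2 * grad_bytes             # ring all-reduce ≈ 2x per GPU

for label, gbps in [("100 GbE", 100), ("400G InfiniBand", 400), ("800G", 800)]:
    seconds = traffic_bytes * 8 / (gbps * 1e9)
    print(f"{label}: ~{seconds:.1f} s per full gradient exchange")
```

Real runs shard gradients and overlap communication with compute, but the 4–8× difference in link speed shows up almost directly in step time — slow networks leave very expensive GPUs idle.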

4 · Scale moved from MW to GW

Project Rainier (Anthropic on AWS) is announced at 2.2 GW. Stargate Phase 1 is targeting ~1.2 GW. Meta's Hyperion in Louisiana is planned for 2 GW. These numbers were unimaginable five years ago.

Source: Capacity figures: Amazon investor announcement (Nov 2025) for Project Rainier; Meta investor day and Reuters reporting for Hyperion Louisiana; Reuters / WSJ for Stargate Phase 1 (Abilene, TX).

5 · Vocabulary you must know after this lesson

  • Rack U — 44.45 mm (1.75 in); one unit of vertical rack space
  • Power density — kW per rack; AI: 70–130, traditional: 5–10
  • PUE — Power Usage Effectiveness; hyperscale ~1.10, enterprise ~1.5+
  • Data hall — the "white space"; the room where IT racks live
  • Tier I–IV — reliability classes from the Uptime Institute
  • DLC — Direct Liquid Cooling; required above ~70 kW/rack

Lesson 01 — TL;DR

  • A data center has 4 physical components: power, cooling, networking, IT.
  • Uptime Institute Tiers I–IV measure fault tolerance. Tier III ≈ AI floor.
  • Hierarchy: Campus → Building → Hall → Pod → Rack → U.
  • AI made racks ~10× more power-dense, forcing liquid cooling and faster networks.
  • Stargate / Hyperion / Project Rainier are campuses in the 1–2 GW range.

Useful? Share so the next engineer learns this faster.