Lesson 03/12 · Intermediate · 14 min read · 6 diagrams

Cooling Systems

Servers convert nearly 100% of the electricity they consume into heat. Above roughly 30 kW per rack, air can no longer move that heat at practical flow rates, so every modern AI rack is liquid-cooled. This lesson covers the three cooling loops, PUE and WUE, and where the heat actually goes.

1 · Why air cooling broke

Air carries only about 1.2 kJ per cubic meter per °C of temperature rise. Above ~30–40 kW per rack, the required airflow rates and aisle pressures become impractical; the thermal physics simply runs out of headroom.

Liquid water carries about 4.18 kJ per kg per °C — over 3,000× the heat capacity of air per unit volume. That's why every Blackwell-class deployment is liquid-cooled.
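To make that comparison concrete, here is a small back-of-the-envelope script. The 10 K coolant temperature rise is an illustrative assumption, not a design spec:

```python
# Volumetric heat capacity: how much heat one cubic meter of each
# fluid carries per degree of temperature rise.
AIR_KJ_PER_M3_K = 1.2       # ~1.005 kJ/kg·K x ~1.2 kg/m3
WATER_KJ_PER_M3_K = 4180.0  # ~4.18 kJ/kg·K x ~1000 kg/m3

ratio = WATER_KJ_PER_M3_K / AIR_KJ_PER_M3_K
print(f"water carries ~{ratio:,.0f}x more heat per m3 than air")

# Volume flow needed to remove 30 kW at an assumed 10 K rise:
# flow (m3/s) = power (kW) / (volumetric heat capacity * dT)
rack_kw, dt_k = 30.0, 10.0
air_flow = rack_kw / (AIR_KJ_PER_M3_K * dt_k)      # m3/s of air
water_flow = rack_kw / (WATER_KJ_PER_M3_K * dt_k)  # m3/s of water
print(f"air:   {air_flow:.2f} m3/s ({air_flow * 2118.88:,.0f} CFM)")
print(f"water: {water_flow * 60000:.1f} L/min")
```

A 30 kW rack needs on the order of thousands of CFM of air, versus a few tens of liters per minute of water — which is why the crossover lands where it does.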

2 · The three cooling loops

Modern AI cooling is organized as three interconnected loops, each with a different fluid and a different job:

[Diagram: Three Cooling Loops in a Modern AI Data Center. ① Chip loop (closed; dielectric or treated water): GPU cold plates → CDU heat exchanger, ~50°C return. ② Facility loop (treated water, building-wide): CDU primary side → chiller / dry cooler, ~32°C supply. ③ Heat rejection (to outside atmosphere): cooling tower (evaporative), adiabatic dry cooler, or district heat reuse.]
Three loops: chip → facility → environment. The CDU sits at the boundary between loops 1 and 2.

Cold plates and CDUs

A cold plate is a metal block with internal channels that bolts directly onto the chip die (or onto the heat spreader). Coolant flows through those channels, picks up heat, and returns to the CDU (Coolant Distribution Unit). The CDU is essentially a heat exchanger plus pumps that isolates the dirty/expensive chip loop from the building's facility loop.

  • Coolant supply temp: ~25–32°C (ASHRAE W4 envelope)
  • Coolant return temp: ~45–55°C (after absorbing chip heat)
  • Flow per GPU: ~1.5 L/min (typical for a B200 cold plate)
  • CDU capacity: 500 kW–1.5 MW (one CDU serves a row of racks)
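These figures hang together through the basic calorimetry relation Q = ṁ · cp · ΔT. A quick sanity check, assuming a 20 K rise (30°C supply → 50°C return) and a hypothetical 132 kW rack:

```python
# Sanity-check the cold-plate figures with Q = m_dot * cp * dT.
CP_WATER = 4.18      # kJ/kg·K, treated water
RHO_WATER = 1.0      # kg/L

flow_l_min = 1.5     # per-GPU flow from the table above
dt_k = 20.0          # assumed: 30°C supply -> 50°C return
m_dot = flow_l_min * RHO_WATER / 60.0   # kg/s
q_kw = m_dot * CP_WATER * dt_k
print(f"1.5 L/min at dT=20 K removes ~{q_kw:.2f} kW per GPU")

# Rough loop sizing for an illustrative 132 kW rack:
rack_kw = 132.0
rack_flow = rack_kw / (CP_WATER * dt_k) * 60.0  # L/min
print(f"a {rack_kw:.0f} kW rack needs ~{rack_flow:.0f} L/min of coolant")
```

At ~2 kW of removal capacity per GPU loop, the table's 1.5 L/min figure leaves comfortable margin over a ~1.2 kW chip.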

3 · Air vs DLC vs immersion

Air cooling (CRAC / CRAH)

Computer Room Air Conditioner / Handler. Hot-aisle/cold-aisle containment, raised-floor plenums. Ceiling at ~30 kW/rack. Still used for non-AI workloads.

Direct Liquid Cooling (DLC)

Cold plates on chips, manifold per server, CDU per row. Handles 70–130 kW/rack. Standard for NVIDIA HGX H200, B200, GB200 NVL72 reference designs.

Single-phase immersion

Servers submerged in dielectric oil (Submer SmartPodX, GRC ICEraQ). Excellent thermals and eliminates fans entirely. Niche due to maintenance complexity.

Two-phase immersion

Boiling-fluid cooling (3M Novec-class fluorocarbons). Highest density. Hit by 3M's PFAS phase-out announcement (2022) — adoption stalled.

4 · PUE and WUE — measuring efficiency

PUE = Total Facility Power ÷ IT Equipment Power, where facility power includes IT (servers, storage, network) plus cooling, lighting, and losses.

  • 1.00: theoretical perfect (all energy goes to IT)
  • 1.10: hyperscale target (Google, Meta, Microsoft)
  • 1.40: modern enterprise (newer corporate DCs)
  • 1.80+: legacy (old air-cooled DCs)

PUE (Power Usage Effectiveness) = total facility power ÷ IT power. A PUE of 1.10 means that for every watt delivered to IT equipment, the facility consumes another 0.10 W on cooling, lighting, and electrical losses. Lower is better; 1.0 is theoretical perfection.

WUE (Water Usage Effectiveness) = liters of water consumed per kWh of IT energy. Evaporative cooling tower designs can hit 1.8 L/kWh; closed-loop dry cooling uses essentially zero water but trades higher PUE.

Source: The Green Grid / ASHRAE TC 9.9 standards; Uptime Institute Annual Global Data Center Survey.
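Both metrics are simple ratios of metered quantities. A minimal sketch, using made-up monthly numbers for a 10 MW IT load:

```python
# PUE and WUE from metered energy, per the definitions above.
def pue(total_facility_kwh: float, it_kwh: float) -> float:
    """Total facility energy divided by IT energy (>= 1.0)."""
    return total_facility_kwh / it_kwh

def wue(water_liters: float, it_kwh: float) -> float:
    """Liters of water consumed per kWh of IT energy."""
    return water_liters / it_kwh

# Illustrative month for a 10 MW IT load (numbers are made up):
it_kwh = 10_000 * 24 * 30        # 7.2 GWh of IT energy
facility_kwh = it_kwh * 1.10     # hyperscale-grade overhead
water_l = 1.8 * it_kwh           # evaporative-tower design point

print(f"PUE = {pue(facility_kwh, it_kwh):.2f}")
print(f"WUE = {wue(water_l, it_kwh):.2f} L/kWh")
print(f"water consumed: {water_l / 1e6:.1f} million liters/month")
```

Note the trade visible in the numbers: an evaporative design earns its low PUE by consuming roughly 13 million liters of water a month at this scale.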

5 · Heat reuse

All that heat doesn't have to be wasted. In Northern Europe, several large operators sell waste heat to district heating networks:

  • Stockholm — Bahnhof, Multigrid, others feed Stockholm Exergi's district network heating ~10,000 apartments.
  • Helsinki region — Microsoft announced a deal with Fortum in 2022 to supply up to 40% of the surrounding area's district heat from a new DC campus.
  • Frankfurt — Multiple operators feed Mainova's network.

Outside cold-climate cities, heat reuse is less viable: district-heating demand is scarce, and lifting ~50°C coolant to useful network temperatures requires heat pumps. Most US AI campuses simply reject heat to dry coolers or evaporation.
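One way to quantify the lift problem: the ideal (Carnot) coefficient of performance of a heat pump boosting waste heat up to district-network temperature falls as the temperature gap grows. The temperatures below are illustrative, and real heat pumps reach perhaps half of the Carnot limit:

```python
# Ideal heating COP of a heat pump lifting waste heat to a
# district-heating network. Temperatures are illustrative.
def carnot_cop_heating(t_source_c: float, t_sink_c: float) -> float:
    """Carnot COP = T_sink / (T_sink - T_source), in kelvin."""
    t_sink_k = t_sink_c + 273.15
    return t_sink_k / (t_sink_k - (t_source_c + 273.15))

# Warm DLC return water (~50°C) into an 80°C network:
print(f"50 -> 80°C lift: ideal COP {carnot_cop_heating(50, 80):.1f}")
# Cooler air-side exhaust (~35°C) into the same network:
print(f"35 -> 80°C lift: ideal COP {carnot_cop_heating(35, 80):.1f}")
```

Warmer DLC return water shrinks the lift and raises the COP, which is one reason liquid-cooled AI campuses pair naturally with district heating where demand exists.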

Lesson 03 — TL;DR

  • Water carries 3,000× more heat per volume than air. Above 30–40 kW/rack, liquid is mandatory.
  • Three loops: chip (cold plate) → facility (CDU) → environment (tower or dry cooler).
  • PUE 1.10 is hyperscale-grade; 1.5+ is legacy.
  • WUE measures water consumption; evap cooling trades water for power efficiency.
  • Heat reuse is real in Northern Europe; impractical in most US sites.

Useful? Share so the next engineer learns this faster.
