Game · Power Budget
Constraints first. Excess optional.
You're a chief infrastructure officer. You have power. You have budget. You have a deadline. Pick a frontier-model training scenario and design a cluster that meets all three constraints. Score = win/lose with explanations.
🎯 Pick a scenario
Mission constraints
200 MW
Power available
$2.0B
Capex budget
90 days
Time limit
💡 A 405B-parameter model takes ~10²⁵ FLOPs to train. Finishing in 90 days requires ~1.3 EFLOPS sustained, which is ~3.2 EFLOPS of peak compute at 40% MFU.
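The arithmetic behind this hint can be checked with a minimal sketch. The function name `required_throughput` is hypothetical and not part of the game. The inputs are the scenario's 10²⁵ FLOPs, 90-day limit, and 40% MFU.

```python
SECONDS_PER_DAY = 86_400

def required_throughput(total_flops: float, days: float, mfu: float) -> tuple[float, float]:
    """Return (sustained, peak) FLOP/s needed to finish total_flops within days."""
    # Sustained throughput is the work divided by the wall-clock time available.
    sustained = total_flops / (days * SECONDS_PER_DAY)
    # Peak hardware throughput must be higher, because only a fraction (MFU) is realized.
    peak = sustained / mfu
    return sustained, peak

sustained, peak = required_throughput(1e25, 90, 0.40)
print(f"sustained: {sustained / 1e18:.2f} EFLOPS")  # sustained: 1.29 EFLOPS
print(f"peak:      {peak / 1e18:.2f} EFLOPS")       # peak:      3.22 EFLOPS
```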
🏆 Mission accomplished
86% headroom
Power used
28.8 MW
Total capex
$943M
Training time
4.3 d
Sustained EFLOPS
27.00
✓ Your 15,000 GB200 GPUs (NVL72, 72 GPUs per rack) in 209 racks deliver 67,500 PFLOPS peak. At 40% MFU that is 27 EFLOPS sustained, which finishes the run with 85.7 days to spare.
The trick: training time is workload size ÷ sustained FLOPS, where sustained FLOPS = peak FLOPS × MFU. An MFU (Model FLOPs Utilization) of 40% is realistic: code inefficiency, communication, and restarts eat the other 60% of theoretical peak. Power is GPU count × per-GPU TDP × PUE (the facility overhead for cooling and power delivery). Capex is dominated by silicon, so cost grows roughly in proportion to GPU count. Winning usually requires balance: not just maximizing GPU count, but choosing efficient GPUs (B200 beats H100 per watt).
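The scoring logic described here can be sketched as a small evaluator. This is an illustration under stated assumptions, not the game's actual code. The names `Mission`, `GpuType`, and `evaluate` are hypothetical. The per-GPU figures are back-derived from the result panel by dividing by 15,000 GPUs: 4.5 PFLOPS peak, 1.92 kW of facility power (PUE already included), and about $62,867 of capex.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400

@dataclass
class Mission:
    workload_flops: float
    power_mw: float
    capex_usd: float
    deadline_days: float

@dataclass
class GpuType:
    peak_pflops: float   # peak PFLOPS per GPU
    facility_kw: float   # kW per GPU at the facility level (assumed to include PUE)
    cost_usd: float      # all-in capex per GPU (assumption)
    gpus_per_rack: int

def evaluate(mission: Mission, gpu: GpuType, count: int, mfu: float = 0.40) -> dict:
    """Score a cluster design against the power, capex, and deadline constraints."""
    sustained = count * gpu.peak_pflops * 1e15 * mfu            # FLOP/s actually achieved
    days = mission.workload_flops / sustained / SECONDS_PER_DAY
    power_mw = count * gpu.facility_kw / 1000
    capex = count * gpu.cost_usd                                # linear in GPU count
    return {
        "racks": math.ceil(count / gpu.gpus_per_rack),
        "sustained_eflops": sustained / 1e18,
        "days": days,
        "power_mw": power_mw,
        "capex_usd": capex,
        "win": (days <= mission.deadline_days
                and power_mw <= mission.power_mw
                and capex <= mission.capex_usd),
    }

mission = Mission(workload_flops=1e25, power_mw=200, capex_usd=2.0e9, deadline_days=90)
gb200 = GpuType(peak_pflops=4.5, facility_kw=1.92, cost_usd=62_867, gpus_per_rack=72)
result = evaluate(mission, gb200, count=15_000)
print(f"{result['racks']} racks, {result['sustained_eflops']:.2f} EFLOPS, "
      f"{result['days']:.1f} d, {result['power_mw']:.1f} MW, "
      f"${result['capex_usd'] / 1e6:.0f}M, win={result['win']}")
# 209 racks, 27.00 EFLOPS, 4.3 d, 28.8 MW, $943M, win=True
```

Because both power and capex scale with GPU count in this model, a smaller cluster of the same GPUs still wins as long as the training time stays under the deadline.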