gentic.news — AI News Intelligence Platform

Case Study · AWS × Anthropic

Project Rainier — Anthropic's 2.2 GW Trainium Campus

The world's largest AI cluster built without NVIDIA silicon. AWS spent ~$11B on 1,200 acres of former farmland in New Carlisle, Indiana to build Anthropic a supercomputer of ~500,000 Trainium2 chips, scaling to 1M+ chips and 2.2 GW. Custom UltraServers stand in for NVLink. Anthropic engineers now write kernels against the AWS Neuron SDK. The bet: frontier AI doesn't require NVIDIA.

Quick facts

Operator
AWS
Built for Anthropic (exclusive)
Location
New Carlisle, Indiana
St. Joseph County, near Lake Michigan
Site area
1,200 acres
Former farmland
Investment
~$11B
Largest capex in Indiana state history
Groundbreaking
September 2024
First 7 buildings live Oct 2025
Buildings (initial → full)
7 → 30
Each >200,000 sq ft
Chips live
~500K Trainium2
Targeting 1M+ by end 2026
Power target
2.2 GW
Current: several hundred MW
Cooling
Outside-air economized
WUE 0.15 L/kWh — excellent

1 · The bet: frontier AI without NVIDIA

In late 2024, Anthropic and Amazon signed a compute deal worth tens of billions of dollars. The novel part: it would run primarily on AWS Trainium chips, not NVIDIA GPUs. Every other frontier lab had locked itself into NVIDIA's stack (silicon + NVLink + Mellanox + CUDA + NCCL). Anthropic bet that AWS's Annapurna Labs could close the gap.

That bet is now deployed. Project Rainier's 500,000 Trainium2 chips, split across 7 buildings connected by AWS's Neuron fabric, deliver 5× the compute Anthropic used to train any previous Claude model (per AWS/Anthropic joint statements). Trainium3 is already being prepared for deployment at the same campus.

2 · The UltraServer — AWS's NVLink substitute

The hardest problem for a non-NVIDIA AI cluster is scale-up interconnect. NVIDIA has NVLink 5 (1.8 TB/s per GPU bidirectional). Without an equivalent, chips can't share memory and gradient sync becomes the bottleneck.

AWS's answer is the Trainium2 UltraServer: 4 physical servers × 16 Trainium2 chips each = 64 chips per UltraServer, stitched together by high-speed NeuronLinks. Each Trainium2 chip contains 2 compute tiles + 4 stacks of HBM3 memory on a base interposer.
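The UltraServer arithmetic can be checked directly. In the sketch below, the 96 GB per-chip HBM figure is taken from AWS's public Trn2 instance specs and is an assumption here, not a number from this article:

```python
# Trainium2 UltraServer topology, per the figures above.
SERVERS_PER_ULTRASERVER = 4
CHIPS_PER_SERVER = 16
HBM_GB_PER_CHIP = 96  # assumed, from AWS's public Trn2 instance specs

chips_per_ultraserver = SERVERS_PER_ULTRASERVER * CHIPS_PER_SERVER
hbm_tb_per_ultraserver = chips_per_ultraserver * HBM_GB_PER_CHIP / 1024

print(chips_per_ultraserver)             # 64 chips in one NeuronLink domain
print(hbm_tb_per_ultraserver)            # 6.0 TB of pooled HBM
```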

UltraServers are then grouped into EC2 UltraClusters: multiple UltraServers on a single non-blocking network fabric. AWS hasn't disclosed the specific fabric details, but it's a custom interconnect in the same conceptual space as NVIDIA's NVL72: many chips stitched into one memory domain.
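Scaling the 64-chip unit up to the deployed fleet gives a sense of the campus's size. This is simple arithmetic on the figures quoted above:

```python
import math

CHIPS_PER_ULTRASERVER = 64  # 4 servers x 16 chips, per the text

# UltraServers implied by the ~500K chips live today and the 1M+ target.
ultraservers_now = math.ceil(500_000 / CHIPS_PER_ULTRASERVER)
ultraservers_target = math.ceil(1_000_000 / CHIPS_PER_ULTRASERVER)

print(ultraservers_now)     # 7813 UltraServers for today's fleet
print(ultraservers_target)  # 15625 UltraServers at the 1M-chip target
```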

3 · The multi-site architecture

Rainier-Indiana isn't a single cluster. AWS has connected multiple US data centers via ultra-high-bandwidth dedicated fiber to present Anthropic with one distributed supercomputer. This is a defensive move against the power + cooling bottleneck at any single campus.

Separately, Anthropic has a parallel deal with Google for up to 1 million TPUs, bringing an additional 1+ GW of compute online by end-2026 across Google's distributed DC network. Anthropic is now a genuinely multi-cloud AI factory — a first among frontier labs.

4 · Cooling — the Indiana advantage

New Carlisle sits on the southern shore of Lake Michigan — cold winters, mild summers, ample humidity. AWS maximized outside-air cooling and reports WUE of 0.15 L/kWh — about 10× better than the industry average and 40% better than AWS's own 2021 baseline.
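The WUE claim can be sanity-checked with a back-of-envelope calculation. The industry-average benchmark of ~1.8 L/kWh below is a commonly cited figure assumed here, not from this article; it yields roughly the "about 10×" advantage the article reports:

```python
# Water usage effectiveness (WUE) comparison.
WUE_RAINIER = 0.15   # L/kWh, per AWS
WUE_INDUSTRY = 1.8   # L/kWh, assumed industry-average benchmark

# Water consumed per hour if the full 2.2 GW build-out ran at load.
energy_kwh_per_hour = 2.2e6  # 2.2 GW sustained for one hour
litres_per_hour = WUE_RAINIER * energy_kwh_per_hour

print(round(litres_per_hour))              # 330000 L per hour at full load
print(round(WUE_INDUSTRY / WUE_RAINIER))   # ~12x better than the benchmark
```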

The Trainium2 chip's thermal envelope (not publicly specified, but estimated at ~500 W TDP) is more forgiving than the B200's ~1,000 W. This lets AWS use a less aggressive cooling stack than an equivalent NVIDIA deployment would require.
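A rough chip-level power budget shows why the estimated TDP matters. Both the ~500 W TDP (the article's estimate, not an official figure) and the PUE of ~1.2 (a typical value for an air-economized site, assumed here) are hedged inputs:

```python
# Back-of-envelope facility power for today's fleet.
CHIPS = 500_000
TDP_W = 500   # estimated per-chip TDP, per the text (not official)
PUE = 1.2     # assumed, plausible for outside-air-economized cooling

chip_power_mw = CHIPS * TDP_W / 1e6
facility_power_mw = chip_power_mw * PUE

print(chip_power_mw)      # 250.0 MW of silicon alone
print(facility_power_mw)  # ~300 MW, consistent with "several hundred MW"
```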

5 · Power — natural gas the compromise

The 2.2 GW target requires Indiana Michigan Power (part of AEP) to expand capacity significantly. The utility is acquiring a natural gas plant in Oregon, Ohio that will provide ~15% of its generation by end-2026, specifically to serve Rainier.

AWS secured a 50-year sales tax exemption on capital investment and an 85% personal property tax exemption from Indiana. Even so, state projections show AWS paying $722M+ in taxes over 35 years. Indiana's governor has characterized the project as the largest single private-sector investment in state history.

6 · Lessons from Project Rainier

  • Vertical integration beats the NVIDIA tax — if you can afford it. AWS is at once chipmaker, cloud operator, and DC builder. Only hyperscalers at Amazon/Google/Microsoft scale can pull this off. Anthropic gets meaningfully cheaper FLOPs in exchange for software co-development.
  • Interconnect is the moat. Trainium2 silicon is arguably inferior per-chip to H100/H200 on some benchmarks. The UltraServer + NeuronLink system creates scale-up parity. The silicon is the commodity; the system is the differentiator.
  • Multi-site is the future. 2.2 GW at one campus is already hitting power-interconnect and local-opposition limits. Distributing across multiple sites + fiber is how frontier compute scales in 2027+.
  • Anthropic is multi-cloud. AWS + Google TPU + (reportedly) some NVIDIA usage. No one else has committed this hard to hardware diversity among frontier labs.

Source: AWS / Anthropic joint press releases (Oct 29 2025) · Amazon 10-K · AWS Neuron SDK documentation · Data Center Frontier analysis of UltraServer architecture · Indiana Economic Development Corporation filings.