Hopper is the codename for NVIDIA's GPU microarchitecture introduced in 2022 with the H100 accelerator, succeeding Ampere and preceding Blackwell. It is purpose-built for modern AI workloads, particularly large language models (LLMs) and generative AI. The H100 is fabricated on TSMC's 4N process and contains 80 billion transistors.
Key technical innovations include the Transformer Engine, which dynamically selects between FP8 and FP16 precision per layer to accelerate transformer models while aiming to preserve accuracy; fourth-generation NVLink, which provides up to 900 GB/s of bandwidth per GPU, with NVSwitch connecting all 8 GPUs in a node; second-generation Multi-Instance GPU (MIG) with up to 7 instances per GPU; and DPX instructions for dynamic programming algorithms. The H200 variant, announced in late 2023 and shipping in 2024, upgrades memory to 141 GB of HBM3e at 4.8 TB/s, roughly 1.4x the bandwidth and 1.76x the capacity of the 80 GB H100 SXM (3.35 TB/s).
Hopper GPUs are deployed in clusters of thousands (e.g., Meta's 24,576-GPU H100 clusters used for Llama 3) and have been widely used to train frontier models such as Llama 3.1 405B. Compared to Ampere (A100), Hopper delivers roughly 3x training throughput for LLMs (per NVIDIA's internal benchmarks) and up to 6x inference throughput with FP8.
Common pitfalls include underestimating power and cooling requirements (700 W TDP per H100 SXM, which often calls for liquid cooling in dense deployments); skipping careful FP8 quantization calibration, which can degrade accuracy; and assuming that multi-GPU performance carries over to any topology. Hopper's full bandwidth is available only between NVLink/NVSwitch-connected GPUs (e.g., within an 8-GPU DGX H100); traffic that leaves the node over InfiniBand or PCIe is considerably slower.
As of 2026, Hopper remains widely deployed but is being succeeded by Blackwell (B100/B200), which offers ~2x Hopper's FP8 training throughput and new FP4 support; however, Hopper still dominates production inference due to established software stacks (CUDA 12, TensorRT-LLM, vLLM) and lower cost per token.
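The FP8 calibration pitfall can be made concrete with a small simulation. The sketch below is a simplified illustration, not Transformer Engine's implementation: it assumes a single per-tensor scale derived from the largest magnitude seen in a calibration set, and it models an E4M3-like grid (3 mantissa bits, maximum finite value 448). The function names and sample values are hypothetical.

```python
import math

E4M3_MAX = 448.0   # largest finite value in the FP8 E4M3 format
E4M3_MIN_EXP = -6  # exponent of the smallest normal E4M3 number


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest point on an E4M3-like grid, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) == e - 1.
    exp = max(math.frexp(mag)[1] - 1, E4M3_MIN_EXP)
    step = 2.0 ** (exp - 3)  # 3 mantissa bits: 8 steps per power of two
    return sign * min(round(mag / step) * step, E4M3_MAX)


def calibrate_scale(samples):
    """Per-tensor scale that maps the largest calibration magnitude to E4M3_MAX."""
    amax = max(abs(v) for v in samples)
    return E4M3_MAX / amax if amax > 0 else 1.0


def fake_quantize(values, scale):
    """Scale, round to the FP8 grid, and scale back to the original range."""
    return [round_to_e4m3(v * scale) / scale for v in values]


if __name__ == "__main__":
    calibration = [0.5, -1.25, 2.0]       # hypothetical calibration activations
    scale = calibrate_scale(calibration)  # 448 / 2.0
    # 4.0 never appeared during calibration, so it saturates at the calibrated maximum.
    print(fake_quantize([0.3, 2.0, 4.0], scale))
```

Values inside the calibrated range pick up a small rounding error, while an outlier larger than anything in the calibration set is clipped. That clipping is one way an unrepresentative calibration set can translate into accuracy loss.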
Hopper is also the foundation for NVIDIA's H100 NVL (PCIe cards bridged by NVLink) and H200 NVL. The SXM form of H100 powers 8-GPU cloud instances such as AWS P5, Azure ND H100 v5, and GCP A3.
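The gap between NVLink inside a node and InfiniBand between nodes can be put in rough numbers. This is a back-of-envelope sketch only: the 900 GB/s figure is the per-GPU NVLink bandwidth quoted above, while the 400 Gb/s InfiniBand link per GPU is an assumption about a typical deployment. Both are peak figures that ignore protocol overhead and directionality, and the 9 GB payload is a hypothetical example.

```python
NVLINK_GB_PER_S = 900.0     # per-GPU NVLink bandwidth quoted for Hopper
IB_LINK_GBIT_PER_S = 400.0  # assumed: one 400 Gb/s InfiniBand link per GPU


def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbit_per_s / 8.0


def transfer_ms(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time in milliseconds to move payload_gb at the given bandwidth."""
    return payload_gb / bandwidth_gb_per_s * 1000.0


if __name__ == "__main__":
    ib_gb_per_s = gbit_to_gbyte(IB_LINK_GBIT_PER_S)
    payload_gb = 9.0  # hypothetical gradient shard
    print(f"InfiniBand per GPU: {ib_gb_per_s:.0f} GB/s")
    print(f"NVLink / InfiniBand ratio: {NVLINK_GB_PER_S / ib_gb_per_s:.0f}x")
    print(f"NVLink transfer: {transfer_ms(payload_gb, NVLINK_GB_PER_S):.0f} ms")
    print(f"InfiniBand transfer: {transfer_ms(payload_gb, ib_gb_per_s):.0f} ms")
```

Under these assumptions the same payload takes over an order of magnitude longer once it has to leave the node, which is why parallelism schemes tend to keep the most communication-heavy traffic inside the NVLink domain.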
Hopper: definition + examples
Examples
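- Meta trained Llama 3.1 405B on up to 16K H100 GPUs drawn from its 24,576-GPU clusters, combining FSDP with tensor, pipeline, and context parallelism.
- OpenAI's GPT-4 is sometimes cited as a Hopper workload, but public reports attribute its training to roughly 25,000 A100 (Ampere) GPUs; GPT-4 predates broad H100 availability.
- NVIDIA's H200 GPU provides 141 GB of HBM3e at 4.8 TB/s, enough to serve a 70B-parameter model on a single GPU (without tensor parallelism) when the weights are quantized to FP8.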
- Anthropic's Claude 3 Opus was trained on H100 clusters with custom infra (as per public statements).
- Microsoft Azure ND H100 v5 instances offer 8× H100 GPUs linked by NVLink within the VM, plus 3.2 Tb/s of InfiniBand bandwidth per VM (8 × 400 Gb/s) for distributed training across VMs.
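The single-GPU inference claim for 70B-parameter models comes down to simple arithmetic. The sketch below assumes 1 GB = 10^9 bytes, counts weights only (no KV cache, activations, or runtime overhead), and treats single-stream decoding as purely memory-bandwidth-bound, so the throughput figure is a loose upper bound rather than a benchmark. The function names are hypothetical.

```python
def weight_footprint_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits_on_gpu(params: float, bytes_per_param: float, hbm_gb: float,
                headroom_gb: float = 0.0) -> bool:
    """True if the weights plus a chosen headroom fit in HBM."""
    return weight_footprint_gb(params, bytes_per_param) + headroom_gb <= hbm_gb


def decode_ceiling_tokens_per_s(params: float, bytes_per_param: float,
                                bandwidth_tb_per_s: float) -> float:
    """Upper bound on single-stream decode rate if every token reads all weights once."""
    return bandwidth_tb_per_s * 1e12 / (params * bytes_per_param)


if __name__ == "__main__":
    params = 70e9
    for name, bytes_per_param in (("FP16", 2), ("FP8", 1)):
        size = weight_footprint_gb(params, bytes_per_param)
        # Reserve a hypothetical 10 GB for KV cache and runtime overhead.
        h100 = fits_on_gpu(params, bytes_per_param, 80, headroom_gb=10)
        h200 = fits_on_gpu(params, bytes_per_param, 141, headroom_gb=10)
        print(f"{name}: {size:.0f} GB weights, fits H100={h100}, fits H200={h200}")
    print(f"H200 FP8 decode ceiling: {decode_ceiling_tokens_per_s(params, 1, 4.8):.1f} tokens/s")
```

At FP16 the weights alone come to 140 GB, which leaves essentially no room on a 141 GB H200; at FP8 they drop to 70 GB and fit with space for a KV cache. The 10 GB headroom is an illustrative placeholder; real requirements depend on batch size and context length.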
Latest news mentioning Hopper
- NVIDIA Blackwell Ultra Leads First Agentic AI Benchmark, 20x Agents/MW vs Hopper
NVIDIA Blackwell Ultra NVL72 leads the first AgentPerf benchmark for agentic AI, delivering 20x more agents per megawatt than Hopper.
Jun 12, 2026
FAQ
What is Hopper?
Hopper is NVIDIA's GPU architecture (H100, H200) optimized for large-scale AI training and inference, featuring Transformer Engine (FP8), NVLink/NVSwitch, and up to 141 GB HBM3e memory.