AMD launched the MI350P PCIe AI accelerator at CES 2026, claiming 39% higher theoretical FP8 compute than Nvidia's H200 NVL. The 600W dual-slot card packs 144GB of HBM3E and targets drop-in upgrades for existing air-cooled servers.
Key facts
- MI350P: 8,192 cores, 128 CUs, 144GB HBM3E, 4TB/s bandwidth.
- Claims 39% higher theoretical FP8 compute than Nvidia's H200 NVL.
- 600W TDP, fanless dual-slot, drop-in for air-cooled servers.
- Built on TSMC 3nm/6nm, CDNA4 architecture.
- Nvidia has not announced a PCIe Blackwell GPU.
AMD's new Instinct MI350P is a PCIe Gen5 AI accelerator built on the CDNA4 architecture and fabricated on TSMC's 3nm and 6nm FinFET processes. It packs 8,192 cores organized into 128 Compute Units, plus 512 Matrix Cores, and clocks up to 2.2 GHz. Memory consists of 144GB of HBM3E delivering 4TB/s of bandwidth, backed by a 128MB last-level cache. The card supports native MXFP6 and MXFP4 precision to accelerate LLM inference [per Tom's Hardware].
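MXFP4 and MXFP6 are OCP Microscaling formats: a block of 32 elements shares one power-of-two scale, and each element is a tiny float (E2M1 in the MXFP4 case). Here is a minimal NumPy emulation of the quantization step, offered as a sketch of how the format works rather than AMD's actual hardware path:

```python
import numpy as np

# Representable magnitudes of FP4 (E2M1), the element type MXFP4 uses.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4_block(block: np.ndarray):
    """Emulate MXFP4 for one 32-element block: a shared power-of-two
    scale plus FP4 elements. Out-of-range values saturate to +/-6."""
    assert block.size == 32, "MX formats use 32-element blocks"
    max_abs = np.abs(block).max()
    # Shared scale: align the block's largest exponent with FP4's
    # largest exponent (6.0 = 1.5 * 2^2, so emax = 2).
    scale = 2.0 ** (np.floor(np.log2(max_abs)) - 2) if max_abs > 0 else 1.0
    scaled = block / scale
    # Round each element to the nearest representable FP4 magnitude.
    idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
    return scale, np.sign(scaled) * FP4_GRID[idx]

rng = np.random.default_rng(0)
weights = rng.normal(size=32).astype(np.float32)
scale, q = quantize_mxfp4_block(weights)
print(f"scale = 2^{int(np.log2(scale))}, "
      f"max abs error = {np.abs(weights - scale * q).max():.3f}")
```

Halving the bits per weight roughly doubles the parameters that fit in the 144GB of HBM3E, which is the practical point of native MXFP4 support for LLM inference.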
The Drop-In Advantage
The MI350P is explicitly designed as a drop-in upgrade for existing air-cooled servers. The 10.5-inch, dual-slot card uses a fanless design that relies on chassis airflow, with a TDP configurable between 450W and 600W. Up to eight cards can be combined in a single system, targeting small-to-large inference and RAG workloads [according to the source].
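To illustrate the drop-in angle, here is a minimal sketch that enumerates accelerators and sums their memory on a ROCm build of PyTorch, where the torch.cuda namespace maps onto HIP/AMD devices; the same script runs unchanged against Nvidia cards:

```python
import torch

# On a ROCm build of PyTorch, torch.cuda.* addresses HIP/AMD GPUs,
# so this enumeration works identically on AMD and Nvidia systems.
if not torch.cuda.is_available():
    raise SystemExit("No supported GPU runtime found")

total_mem = 0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_mem += props.total_memory
    print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.0f} GB")

# An eight-card MI350P system should report roughly 8 x 144GB = 1,152GB.
print(f"Aggregate memory: {total_mem / 1e12:.2f} TB")
```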
Benchmark Positioning
AMD claims the MI350P delivers 2,299 TFLOPS of FP16 and 4,600 peak TFLOPS of MXFP4 compute. Compared to Nvidia's H200 NVL, AMD says the MI350P offers 20% better FP64, 43% better FP16, and 39% better FP8 theoretical compute. The card's specs are exactly half those of AMD's flagship MI355X, which has 256 CUs and 288GB of HBM3E [per the company's blog post].
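Those percentages imply specific H200 NVL baselines. A quick back-of-envelope check using only the figures above; the derived numbers are inferences from AMD's claims, not Nvidia's published specs:

```python
# AMD's published MI350P figures (peak TFLOPS) and claimed advantages.
mi350p = {"FP16": 2299, "MXFP4": 4600}
claimed_advantage = {"FP64": 0.20, "FP16": 0.43, "FP8": 0.39}

# Working backward: a 43% FP16 advantage implies this H200 NVL baseline.
implied_h200_fp16 = mi350p["FP16"] / (1 + claimed_advantage["FP16"])
print(f"Implied H200 NVL FP16: {implied_h200_fp16:,.0f} TFLOPS")

# Sanity check on the "half an MI355X" framing: doubling the MI350P's
# FP16 figure should approximate the 256-CU flagship.
print(f"Implied MI355X FP16: {2 * mi350p['FP16']:,} TFLOPS")
```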
The Unique Take: PCIe as a Strategic Wedge
Nvidia has not announced a PCIe version of its B200 Blackwell GPUs, leaving the PCIe AI accelerator segment open. AMD's MI350P fills that gap with a newer architecture and competitive memory capacity. The strategic play is not just raw FLOPs but compatibility: data centers can swap H200 NVL cards for MI350Ps without re-engineering server racks or cooling. If ROCm software maturity has improved enough — as AMD claimed at CES — this could be the first credible enterprise PCIe alternative to Nvidia in years.
Competitive Landscape
Nvidia's H200 NVL remains the incumbent PCIe AI accelerator, but it uses the older Hopper architecture. Nvidia's Blackwell generation (B200) ships only in SXM modules and rack-scale NVL systems, not as a standalone PCIe card. AMD's MI350P thus claims the "fastest enterprise PCIe card" title by default in its segment, though Nvidia could respond with a Blackwell PCIe variant [per Tom's Hardware]. AMD competes with Nvidia across the broader AI accelerator market, as noted in our coverage of the inference shift opening doors for chip startups.
What to watch
Watch for Nvidia's response: a potential Blackwell PCIe variant (e.g., B200 PCIe) could undercut AMD's advantage. Also track ROCm adoption benchmarks in enterprise inference deployments over the next two quarters — software maturity will determine whether MI350P gains real market share.