AWS Beats Cloud Rivals to NVIDIA Blackwell with EC2 G7 — 4.6x AI Inference Gain Over G6

Source: aws.amazon.com via aws_infra, dcd_news, hpcwire, gn_gpu_cluster, gn_ai_data_center · Widely Reported
What are the performance improvements of AWS EC2 G7 instances over G6 instances?

AWS launched EC2 G7 instances with NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, claiming up to 4.6x AI inference performance and 2.1x graphics performance over G6 instances, with 700 Gbps networking.

TL;DR

Amazon Web Services became the first major cloud to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, launching EC2 G7 instances on June 19 with up to 4.6x AI inference throughput, 700 Gbps networking, and 32 GB GDDR7 memory per chip.

Amazon Web Services on June 19, 2026 became the first major cloud provider to offer NVIDIA's latest-generation server GPUs, launching EC2 G7 instances powered by the RTX PRO 4500 Blackwell Server Edition. The instances claim up to 4.6x AI inference performance over the incumbent G6 family. That gap is wide enough to make workload migration economically compelling for customers running inference at scale, provided G7 pricing does not rise in step with performance.

What the Hardware Actually Is

The RTX PRO 4500 Blackwell Server Edition is a single-slot, passively cooled card with 32 GB of GDDR7 ECC memory, 800 GB/s memory bandwidth, and 51 TFLOPS of FP32 compute. At 165 W, it fits dense rack configurations that a dual-slot, actively cooled card cannot reach. NVIDIA positions it as the successor to the L4, the workhorse of the G6 generation, with approximately 41% more CUDA cores (10,496 vs. 7,424) and 5th-generation Tensor Cores, which underpin the AI inference gains AWS is claiming.

By embedding the chip in its own instances before Google Cloud or Microsoft Azure, AWS captures the early-adopter window for customers who cannot wait for rival offerings.

Key Facts

  • 4.6x AI inference throughput over G6 instances (AWS claim; workload-specific)
  • 2.1x graphics performance over G6 for rendering and VDI
  • 32 GB GDDR7 per GPU, 1.33x G6 capacity; 2.45x memory bandwidth
  • 700 Gbps EFA networking, 7x that of G6; critical for multi-node inference serving
  • Up to 8 GPUs per instance: 256 GB total GPU memory, 192 vCPUs, 768 GiB system RAM
  • 7.6 TB local NVMe SSD, 7 instance sizes from single-GPU to 8-GPU
  • Available now in US East (Ohio) and US West (Oregon); On-Demand, Savings Plans, and Spot purchasing
  • AWS did not disclose per-hour pricing at launch
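
The ratios in the list above can be cross-checked against the absolute figures quoted in this article. The following is a minimal sketch in Python that uses only numbers stated here. The "implied G6" values are simply what those ratios work out to, not independently verified G6 specifications, and all variable names are illustrative.

```python
# Figures quoted in the article for G7 (RTX PRO 4500 Blackwell Server Edition).
G7_MEMORY_PER_GPU_GB = 32
G7_MAX_GPUS = 8
G7_MEMORY_BANDWIDTH_GBPS = 800   # GB/s, from the hardware section
G7_EFA_GBPS = 700                # Gbps
G7_CUDA_CORES = 10_496
L4_CUDA_CORES = 7_424            # L4 is the GPU used in G6

# Ratios quoted in the article (G7 relative to G6).
CAPACITY_RATIO = 1.33
BANDWIDTH_RATIO = 2.45
EFA_RATIO = 7

# Total GPU memory on the largest instance size.
total_gpu_memory_gb = G7_MEMORY_PER_GPU_GB * G7_MAX_GPUS

# G6 values implied by the article's ratios (assumption: the ratios are exact).
implied_g6_memory_gb = G7_MEMORY_PER_GPU_GB / CAPACITY_RATIO
implied_g6_bandwidth = G7_MEMORY_BANDWIDTH_GBPS / BANDWIDTH_RATIO
implied_g6_efa_gbps = G7_EFA_GBPS / EFA_RATIO

# Fractional increase in CUDA cores over the L4.
cuda_core_uplift = G7_CUDA_CORES / L4_CUDA_CORES - 1

if __name__ == "__main__":
    print(f"Total GPU memory (8 GPUs): {total_gpu_memory_gb} GB")
    print(f"Implied G6 memory per GPU: {implied_g6_memory_gb:.1f} GB")
    print(f"Implied G6 memory bandwidth: {implied_g6_bandwidth:.1f} GB/s")
    print(f"Implied G6 EFA bandwidth: {implied_g6_efa_gbps:.0f} Gbps")
    print(f"CUDA core uplift over L4: {cuda_core_uplift:.1%}")
```

The 8-GPU total matches the 256 GB stated in the list, and the CUDA core uplift is consistent with the "approximately 41%" figure in the hardware section.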

Why the 700 Gbps Networking Number Matters

The raw GPU specs are expected given the Blackwell architecture. The more consequential figure may be the 700 Gbps Elastic Fabric Adapter (EFA) throughput, a 7x jump over G6. Large-model inference serving often shards a model and its context across many GPUs, and the bottleneck is then frequently data transfer between GPUs rather than raw compute. Because EFA carries traffic between instances, sevenfold more bandwidth raises the ceiling on the model sizes G7 can serve efficiently when a deployment spans multiple instances, which can reduce both latency and cost per token.
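
To give a rough sense of scale for the bandwidth difference, the sketch below computes the ideal time to move a payload over a network link. It is a back-of-the-envelope estimate only. It assumes the full line rate is achieved with no protocol overhead, takes the G6 figure as 100 Gbps (the value implied by the article's 7x ratio), and uses a hypothetical 10 GB payload.

```python
def transfer_seconds(payload_gigabytes: float, link_gbps: float) -> float:
    """Ideal time to send a payload over a link, ignoring all overhead.

    payload_gigabytes: payload size in decimal gigabytes (1 GB = 8e9 bits).
    link_gbps: link speed in gigabits per second.
    """
    if payload_gigabytes < 0 or link_gbps <= 0:
        raise ValueError("payload must be >= 0 and link speed must be > 0")
    payload_bits = payload_gigabytes * 8e9
    return payload_bits / (link_gbps * 1e9)


G7_EFA_GBPS = 700            # stated in the article
G6_EFA_GBPS = 700 / 7        # assumption: implied by the article's 7x ratio

if __name__ == "__main__":
    payload_gb = 10.0        # hypothetical payload, chosen for illustration
    g6_time = transfer_seconds(payload_gb, G6_EFA_GBPS)
    g7_time = transfer_seconds(payload_gb, G7_EFA_GBPS)
    print(f"G6: {g6_time:.3f} s, G7: {g7_time:.3f} s")
```

Real transfers will be slower than these ideal figures, but the ratio between the two links stays at 7x under the same assumptions.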

The instances also support NVIDIA GPUDirect RDMA with EFA for Amazon FSx for Lustre, enabling GPU memory to exchange data with distributed storage without routing through the CPU, a useful capability for retrieval-augmented inference pipelines.

Industry Context: Blackwell Momentum Is Real

The G7 launch lands four days after MLCommons published MLPerf Training 6.0 results on June 16, in which NVIDIA Blackwell systems swept every benchmark, including a record 8,192-GPU scale-out run. The Blackwell GB300 NVL72 posted up to 60% faster training than the GB200 in the same rack configuration, and NVIDIA was the sole entrant on two new mixture-of-experts tests using DeepSeek-V3 (671 billion parameters) and GPT-OSS-20B. That benchmark validation gives enterprise buyers confidence the Blackwell generation is not merely a paper spec.

AWS's Two-Track Chip Strategy

The G7 launch cannot be read in isolation. One day before, reporting emerged that AWS is in active discussions to sell its own Trainium chips to external data centers — a significant strategic pivot confirmed by Amazon AI chief Peter DeSantis. Andy Jassy's April 2026 shareholder letter valued Amazon's semiconductor business at $50 billion in annualised revenue potential if sold externally, and noted commitments from OpenAI (approximately 2 gigawatts of Trainium capacity) and Anthropic (up to 5 gigawatts).

The juxtaposition is deliberate. AWS wants to be indispensable whether customers choose NVIDIA silicon or AWS's own alternative. Offering Blackwell first cements the NVIDIA relationship; developing and potentially externalising Trainium creates a credible second source that pressures NVIDIA pricing. Amazon separately confirmed it will deploy more than one million NVIDIA GPUs starting in 2026, a figure that suggests the AI infrastructure market is large enough for both strategies to coexist.

Who Is Affected

The clearest beneficiaries are workloads that today saturate G6 memory or bandwidth: large multimodal inference, real-time video transcoding at 4K/8K, GPU-accelerated analytics on Amazon EMR and EKS, and virtual desktop infrastructure at enterprise scale. The 9th-generation NVENC engine with 4:2:2 H.264 and HEVC support makes G7 particularly relevant for media companies with broadcast-grade encoding requirements.

For enterprises currently on Reserved G6 Instances, the migration calculus depends on undisclosed G7 pricing. A 4.6x performance ratio only lowers cost at the workload level if G7's per-hour price is less than 4.6 times G6's, and AWS has not yet published that price.
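
The break-even condition can be written down directly. Below is a minimal sketch, assuming the claimed 4.6x speedup applies in full to the workload in question (AWS describes it as an "up to" figure) and using placeholder hourly prices, since no G7 rate has been published. The function names are illustrative.

```python
def cost_per_unit(hourly_price: float, relative_throughput: float) -> float:
    """Cost of one unit of work, with throughput normalised so that G6 = 1.0."""
    if hourly_price < 0 or relative_throughput <= 0:
        raise ValueError("price must be >= 0 and throughput must be > 0")
    return hourly_price / relative_throughput


def break_even_price(g6_hourly_price: float, speedup: float = 4.6) -> float:
    """Highest G7 hourly price at which G7 costs no more per unit than G6."""
    return g6_hourly_price * speedup


def g7_is_cheaper(g6_hourly_price: float, g7_hourly_price: float,
                  speedup: float = 4.6) -> bool:
    """True if G7 does the same work for less money than G6."""
    g6_cost = cost_per_unit(g6_hourly_price, 1.0)
    g7_cost = cost_per_unit(g7_hourly_price, speedup)
    return g7_cost < g6_cost


if __name__ == "__main__":
    g6_price = 1.00  # placeholder price in dollars per hour, not a real rate
    print(f"Break-even G7 price: ${break_even_price(g6_price):.2f}/hour")
    for g7_price in (3.00, 5.00):  # placeholder G7 prices
        print(g7_price, g7_is_cheaper(g6_price, g7_price))
```

If the realised speedup for a given workload is lower than 4.6x, the break-even price drops proportionally, which is why the workload-specific caveat in the key facts matters.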

What to Watch

Pricing disclosure and G7 Reserved Instance availability are the near-term catalysts; without a public on-demand rate, the 4.6x performance claim cannot be converted into a cost-per-inference comparison against G6 or against Azure and Google Cloud once they respond with their own Blackwell offerings.



AI-assisted reporting. Generated by gentic.news from 1 verified source, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

AI Analysis

AWS's G7 launch is a tactical move to keep NVIDIA's latest hardware on its platform while it develops Trainium as a long-term alternative. The 4.6x inference claim is aggressive but plausible given Blackwell's architectural improvements over Ada Lovelace (used in G6). The 7x networking jump to 700 Gbps is the more structural upgrade: it directly addresses the communication bottleneck for distributed inference and multi-node rendering workloads. The real test will be pricing. If AWS prices G7 competitively against on-prem Blackwell deployments, it could accelerate cloud migration for VFX and CAD workloads. The lack of disclosed pricing in the announcement is a red flag and likely means AWS is still calibrating against NVIDIA's own DGX Cloud and against competing cloud providers such as Google Cloud and Azure, which are expected to offer Blackwell instances of their own.