GPU infrastructure

30 articles about GPU infrastructure in AI news

Nvidia Vera Rubin Shifts AI Strategy Beyond Raw GPU Speed

Nvidia's Vera Rubin architecture pivots from raw GPU FLOPS to system-level AI infrastructure, targeting memory bandwidth and interconnect bottlenecks that constrain large-scale model training.

Jul 28, 2026 · 91% relevant

The 'pytorch_no_powerplant_blowup' env var keeping frontier GPUs alive

A tweet reveals a PyTorch env var that throttles GPU power to prevent grid failures, exposing the real-world constraints of frontier AI infrastructure.

Jul 27, 2026 · 75% relevant

95% of Announced Nvidia Blackwell GPUs Yet to Deploy

95% of announced Nvidia Blackwell GPUs remain undeployed per Air Street Capital, signaling a gap between orders and infrastructure.

Jul 9, 2026 · 98% relevant

Upscale AI Raises $190M for AI Networking Infrastructure

Upscale AI raised $190M to expand AI networking infrastructure, addressing the bottleneck of 100K+ GPU clusters.

Jun 22, 2026 · 95% relevant

Nvidia Networking Revenue Hits $14.8B, Up 199% as AI Spending Shifts Beyond GPUs

Nvidia's Q1 FY2027 networking revenue surged 199% to $14.8B, signaling that AI infrastructure spending is moving beyond GPUs into full-system networking. The new reporting split into Hyperscale and ACIE segments reflects a customer base broadening beyond hyperscalers.

May 21, 2026 · 100% relevant

CoreWeave, Nebius Earnings Show AI Race Shifts From GPUs to Power

CoreWeave and Nebius Q1 earnings show the AI infrastructure race shifting from GPU supply to power and scale, with combined capex guidance exceeding $55B.

May 15, 2026 · 90% relevant

Moore Threads Q1 Revenue Up, Building 100K-GPU AI Cluster

Moore Threads reports Q1 2026 revenue growth and confirms progress building a 100,000-GPU cluster for AI training, signaling growing domestic AI infrastructure in China despite US export controls.

Apr 27, 2026 · 74% relevant

DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program

DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.

Apr 22, 2026 · 86% relevant

CoreWeave & Google Raise $6.7B in Junk Bonds for AI Infrastructure

Google and GPU cloud provider CoreWeave have jointly raised $6.7 billion through a junk bond offering, with Google taking $5.7 billion. The capital is earmarked for a significant build-out of AI data center infrastructure.

Apr 20, 2026 · 95% relevant

Google's 5M H100-Equivalent GPU Fleet Powers Anthropic's AI Expansion

An analyst estimates Google's compute capacity at ~5 million Nvidia H100-equivalent GPUs, providing the infrastructure backbone for Anthropic's model deployment and growth. This highlights the strategic shift where foundational AI labs rely on hyperscaler scale for distribution.

Apr 7, 2026 · 85% relevant

Fine-Tuning an LLM on a 4GB GPU: A Practical Guide for Resource-Constrained Engineers

A Medium article provides a practical, constraint-driven guide for fine-tuning LLMs on a 4GB GPU, covering model selection, quantization, and parameter-efficient methods. This makes bespoke AI model development more accessible without high-end cloud infrastructure.

Apr 2, 2026 · 100% relevant

Mistral Secures $830M Debt to Build Paris Data Center with 14,000 Nvidia GB300 GPUs

French AI startup Mistral has raised $830 million in debt financing to build and operate a sovereign AI data center near Paris, set to host nearly 14,000 Nvidia GB300 GPUs. The move signals a strategic European push for bespoke AI infrastructure, distinct from the gigawatt-scale builds of US hyperscalers.

Mar 30, 2026 · 90% relevant

Sparton: A New GPU Kernel Dramatically Speeds Up Learned Sparse Retrieval

Researchers propose Sparton, a fused Triton GPU kernel for Learned Sparse Retrieval models like Splade. It avoids materializing a massive vocabulary-sized matrix, achieving up to 4.8x speedups and 26x larger batch sizes. This is a core infrastructure breakthrough for efficient AI-powered search.

Mar 27, 2026 · 72% relevant

Yotta Data Services Seeks $4B Valuation in Pre-IPO Round, Expands India's Largest Nvidia GPU Cluster

Indian data center operator Yotta is raising $500-600M at a ~$4B valuation ahead of an IPO. The firm is scaling its Nvidia H100 and Blackwell (B200/B300) GPU fleet to position itself as a domestic AI infrastructure alternative.

Mar 20, 2026 · 79% relevant

Meta's GCM: The Unseen Infrastructure Revolution Powering Next-Gen AI

Meta AI has open-sourced GCM, a GPU cluster monitoring system that standardizes telemetry for massive AI training clusters. This infrastructure tool addresses the critical reliability challenges of trillion-parameter models by providing granular hardware insights.

Feb 25, 2026 · 75% relevant

Moonshot AI Pauses K3 Subscriptions as Demand Exceeds GPU Capacity

Moonshot AI paused Kimi K3 subscriptions due to GPU capacity limits. The open-weight release by July 27 aims to offload compute demand.

Jul 20, 2026 · 100% relevant

CEO of Top AI Company Admits Begging for GPUs: Shortage Is Structural

The CEO of the most valuable AI company admitted begging for GPUs, signaling a structural shortage. The confession, reported by @TheGeorgePu, contradicts vendor narratives of ample supply.

Jul 18, 2026 · 75% relevant

Crusoe Launches Serverless Fine-Tuning, Targets AI Lifecycle Beyond GPUs

Crusoe launched serverless fine-tuning and inference, targeting enterprise AI teams. IDC says GPU access is no longer the differentiator; portability is now a procurement requirement.

Jul 10, 2026 · 75% relevant

DeepSeek, Zhipu AI Build Custom Inference Chips to Cut GPU Dependency

DeepSeek and Zhipu AI are developing custom inference chips to cut GPU costs. China's domestic chip budget share hit 46% in July 2026.

Jul 8, 2026 · 100% relevant

Nvidia Renting Back GPU Capacity from Neoclouds Signals Demand Softening

Nvidia renting back GPU capacity from neoclouds signals demand softening. Analyst @edzitron claims the market cannot absorb current supply.

Jul 2, 2026 · 95% relevant

NHN Cloud Tops Korean TOP500 with FactoryX GPU Clusters

NHN Cloud tops the Korean TOP500 with FactoryX GPU clusters delivering 1.2 exaflops, making it the first domestic cloud provider to lead the list.

Jun 26, 2026 · 93% relevant

Colossus 2: xAI's Memphis Cluster Hits 300,000 GPUs

xAI's Colossus 2 hits 300,000 GPUs, targeting 1M by year-end. Training Grok-3, the $6B cluster challenges OpenAI and Google.

Jun 24, 2026 · 98% relevant

AI Infrastructure Hit $300B in 2025, Forecast to Exceed $520B by 2030

AI infrastructure spending hit $300B in 2025, up 60.1% YoY, and is forecast to exceed $520B by 2030, with sovereign AI emerging as the fastest-growing segment.

Jun 17, 2026 · 90% relevant

KKR Launches $10B Helix to Solve AI Infrastructure's Power-Land-Compute Bottleneck

KKR and three heavyweight partners — Nvidia, the Kuwait Investment Authority, and Vistra — launched Helix Digital Infrastructure on June 10, 2026, with more than $10 billion in committed capital. Former AWS CEO Adam Selipsky, who stepped down from Amazon in June 2024 and joined KKR as a senior advis

Jun 17, 2026 · 95% relevant

TensorWave Raises $350M Series B for AMD-Powered GPU Clusters

TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.

Jun 11, 2026 · 78% relevant

NHN Deploys 7,656-GPU AI Cluster in Seoul

NHN launched a 7,656-GPU cluster in Seoul, South Korea, for domestic enterprise AI workloads. The cluster targets inference and training, competing with Naver and Kakao.

May 13, 2026 · 90% relevant

Detecting AI Images: Metadata Exposes Generators, No GPU Needed

AI image detection via metadata analysis exposes generators like Google's Gemini and Meta's Llama without GPU clusters, highlighting a simple but effective method.

May 10, 2026 · 75% relevant

AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base

AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides an immediate boost without new data center buildouts.

May 8, 2026 · 90% relevant

Kunluncore Files STAR Market IPO, Claims 32K GPU Cluster First

Kunluncore filed for a STAR Market IPO, claiming an industry first with its 32K-GPU cluster and testing investor appetite for domestic AI chips.

May 8, 2026 · 85% relevant

NVIDIA, DOE Build 100K-GPU Supercomputer for Science

DOE and NVIDIA announced Solstice, a 100K-GPU Vera Rubin supercomputer delivering 5,000 exaflops, and Equinox with 10K Blackwell GPUs.

May 7, 2026 · 80% relevant
