gpu infrastructure
30 articles about gpu infrastructure in AI news
Nvidia Networking Revenue Hits $14.8B, Up 199% as AI Spending Shifts Beyond GPUs
Nvidia's Q1 FY2027 networking revenue surged 199% to $14.8B, signaling AI infrastructure spending is moving beyond GPUs into full-system networking. New reporting splits into Hyperscale and ACIE segments reflect a broadening customer base beyond hyperscalers.
CoreWeave, Nebius Earnings Show AI Race Shifts From GPUs to Power
CoreWeave and Nebius Q1 earnings show AI infrastructure race shifting from GPU supply to power and scale, with combined capex guidance exceeding $55B.
Moore Threads Q1 Revenue Up, Building 100K-GPU AI Cluster
Moore Threads reports Q1 2026 revenue growth and confirms progress building a 100,000-GPU cluster for AI training, signaling growing domestic AI infrastructure in China despite US export controls.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
CoreWeave & Google Raise $6.7B in Junk Bonds for AI Infrastructure
Google and GPU cloud provider CoreWeave have jointly raised $6.7 billion through a junk bond offering, with Google taking $5.7 billion. The capital is earmarked for a significant build-out of AI data center infrastructure.
Google's 5M H100-Equivalent GPU Fleet Powers Anthropic's AI Expansion
An analyst estimates Google's compute capacity at ~5 million Nvidia H100-equivalent GPUs, providing the infrastructure backbone for Anthropic's model deployment and growth. This highlights the strategic shift where foundational AI labs rely on hyperscaler scale for distribution.
Fine-Tuning an LLM on a 4GB GPU: A Practical Guide for Resource-Constrained Engineers
A Medium article provides a practical, constraint-driven guide for fine-tuning LLMs on a 4GB GPU, covering model selection, quantization, and parameter-efficient methods. This makes bespoke AI model development more accessible without high-end cloud infrastructure.
Mistral Secures $830M Debt to Build Paris Data Center with 14,000 Nvidia GB300 GPUs
French AI startup Mistral has raised $830 million in debt financing to build and operate a sovereign AI data center near Paris, set to host nearly 14,000 Nvidia GB300 GPUs. The move signals a strategic European push for bespoke AI infrastructure, distinct from the gigawatt-scale builds of US hyperscalers.
Sparton: A New GPU Kernel Dramatically Speeds Up Learned Sparse Retrieval
Researchers propose Sparton, a fused Triton GPU kernel for Learned Sparse Retrieval models like Splade. It avoids materializing a massive vocabulary-sized matrix, achieving up to 4.8x speedups and 26x larger batch sizes. This is a core infrastructure breakthrough for efficient AI-powered search.
Yotta Data Services Seeks $4B Valuation in Pre-IPO Round, Expands India's Largest Nvidia GPU Cluster
Indian data center operator Yotta is raising $500-600M at a ~$4B valuation ahead of an IPO. The firm is scaling its Nvidia H100 and Blackwell (B200/B300) GPU fleet to position itself as a domestic AI infrastructure alternative.
Meta's GCM: The Unseen Infrastructure Revolution Powering Next-Gen AI
Meta AI has open-sourced GCM, a GPU cluster monitoring system that standardizes telemetry for massive AI training clusters. This infrastructure tool addresses the critical reliability challenges of trillion-parameter models by providing granular hardware insights.
TensorWave Raises $350M Series B for AMD-Powered GPU Clusters
TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.
NHN Deploys 7,656-GPU AI Cluster in Seoul
NHN launched a 7,656-GPU cluster in Seoul, South Korea, for domestic enterprise AI workloads. The cluster targets inference and training, competing with Naver and Kakao.
Detecting AI Images: Metadata Exposes Generators, No GPU Needed
AI image detection via metadata analysis exposes generators like Google's Gemini and Meta's Llama without GPU clusters, highlighting a simple but effective method.
AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base
AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides immediate boost without new data center buildouts.
Kunluncore Files STAR Market IPO, Claims 32K GPU Cluster First
Kunluncore filed for a STAR Market IPO, claiming a 32K GPU cluster first, testing investor appetite for domestic AI chips.
NVIDIA, DOE Build 100K-GPU Supercomputer for Science
DOE and NVIDIA announced Solstice, a 100K-GPU Vera Rubin supercomputer delivering 5,000 exaflops, and Equinox with 10K Blackwell GPUs.
Anthropic's 220K GPU Cluster: $5B Compute Bet Revealed
Anthropic reportedly has 220K NVIDIA GPUs and 310MW, implying a >$5B compute cluster, 3x OpenAI's largest.
OpenAI's MRC Protocol Sprays Packets Across 100+ Paths to Fix GPU Stragglers
OpenAI open-sourced MRC, a networking protocol that sprays packets across hundreds of paths to reduce GPU idle time from congestion and failures, contributed to OCP.
Nscale to Deploy 66K+ Rubin GPUs for Microsoft in Portugal
Nscale will deploy 66,000+ NVIDIA Rubin GPUs for Microsoft at Portugal's Start Campus. The deal is a first for Rubin and signals Microsoft's geographic diversification.
NVIDIA Feynman GPU Power Semi Content Hits $191K, 17× Blackwell
NVIDIA Feynman GPUs require $191K in power semiconductors per system, 17× Blackwell, driven by 800V DC architecture shift.
ODMs Evolve from Manufacturers to AI Infrastructure Partners
ODMs shift from manufacturing to design/integration partners for AI racks, driven by GPU/ASIC complexity and liquid cooling.
Cisco Reveals Scale-Across GPU Networking Needs 14x DCI Bandwidth
Cisco's chief architect detailed the massive bandwidth requirements for connecting AI clusters via 'scale-across' GPU networking, which needs 14x the capacity of traditional data center interconnects. This shift is creating a multi-billion dollar market for 800G coherent pluggables and deep-buffered switches.
Bull Delivers HPC Infrastructure to Power Mimer AI Factory
Bull, a subsidiary of Atos, has supplied the core HPC infrastructure for Mimer's new AI factory. This facility is dedicated to training and developing large language models for the European market.
LeWorldModel Solves JEPA Collapse with 15M Params, Trains on Single GPU
Researchers published LeWorldModel, solving the representation collapse problem in Yann LeCun's JEPA architecture. The 15M-parameter model trains on a single GPU and demonstrates intrinsic physics understanding.
Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'
Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.
Meta to Cut 8,000 Jobs in May, Redirecting Capital to AI Infrastructure
Meta is reportedly planning to lay off 8,000 employees in May, the first round of major cuts this year. The move signals a capital shift from general operations to concentrated investment in AI infrastructure like chips and data centers.
DOE Seeks Input on AI Infrastructure for Federal Lands
The U.S. Department of Energy has published a Request for Information (RFI) to solicit input on developing AI and high-performance computing infrastructure on DOE-owned lands. This marks a significant step in the federal government's strategy to directly address the national AI compute shortage.
Meta Deploys Unified AI Agents to Manage Hyperscale Infrastructure
Meta's engineering team has built and deployed a system of unified AI agents to autonomously manage capacity and performance across its hyperscale infrastructure. This represents a significant shift from rule-based automation to AI-driven orchestration for one of the world's largest computing fleets.
Claude MCP GPU Debugging: AI Agent Identifies PyTorch Bottleneck in Kernel
A developer used an AI agent powered by Claude Code and the Model Context Protocol (MCP) to diagnose a severe GPU performance bottleneck. The agent analyzed system kernel traces, pinpointing excessive CPU context switches as the culprit, demonstrating a practical application of agentic AI for complex technical debugging.