GPU shortage
30 articles about GPU shortage in AI news
AWS Never Retired an A100 Server, CEO Says Amid Chip Shortage
AWS CEO Matt Garman stated that the company has never retired an A100 server and that its A100 capacity is completely sold out, as demand for older chips outpaces supply. This underscores the prolonged GPU shortage and the value of legacy hardware in cloud AI.
The Great GPU Scramble: How Hardware Shortages Are Defining the AI Arms Race
Oracle founder Larry Ellison identifies GPU acquisition as the primary bottleneck in AI development, with companies racing to secure limited hardware for breakthroughs in medicine, video generation, and autonomous systems.
Mac Studio AI Hardware Shortage Signals Shift to Cloud Rentals
Developers report a global shortage of high-memory Apple Silicon Macs, with 128GB Mac Studios unavailable worldwide. This pushes practitioners toward renting cloud H100 GPUs at ~$3/hr, marking a shift from the recent local AI trend.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
AI Compute Crisis: GPU Prices Up 48%, Anthropic API at 98.95% Uptime
The AI industry faces a severe compute capacity crisis, with GPU prices up 48%, Anthropic API uptime falling to 98.95%, and OpenAI shutting down Sora to reallocate resources. Demand for agentic AI is outstripping supply, forcing rationing and product cancellations.
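To put the 98.95% uptime figure in concrete terms, the short sketch below converts an uptime percentage into hours of downtime. The measurement window is not given in the summary, so the 30-day month and 365-day year used here are assumptions, and `downtime_hours` is a hypothetical helper name.

```python
def downtime_hours(uptime_percent: float, period_hours: float) -> float:
    """Return the hours of downtime implied by an uptime percentage."""
    return (1.0 - uptime_percent / 100.0) * period_hours

# Assumed windows: a 30-day month and a 365-day year.
monthly = downtime_hours(98.95, 30 * 24)
yearly = downtime_hours(98.95, 365 * 24)

print(f"Per 30-day month: {monthly:.2f} hours of downtime")
print(f"Per 365-day year: {yearly:.2f} hours of downtime")
```

Under those assumptions, 98.95% uptime works out to roughly seven and a half hours of downtime in a month.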
Google's 5M H100-Equivalent GPU Fleet Powers Anthropic's AI Expansion
An analyst estimates Google's compute capacity at ~5 million Nvidia H100-equivalent GPUs, providing the infrastructure backbone for Anthropic's model deployment and growth. This highlights a strategic shift in which foundational AI labs rely on hyperscaler scale for distribution.
Jensen Huang Counters Musk's 'One Robot Per Person' Vision, Argues for Multiples to Address Labor Shortages
NVIDIA CEO Jensen Huang responded to Elon Musk's expectation of one robot per person, stating the need for 'more than 1' per person to address severe labor shortages and accelerate corporate growth.
AI Data Center HBM Shortage Intensifies as Samsung, SK Hynix, and Micron Struggle with Supply
AI data centers are aggressively stockpiling high-bandwidth memory (HBM), creating a supply crunch. Only three manufacturers—Samsung, SK Hynix, and Micron—can produce this critical component for AI servers.
AirTrain Enables Distributed ML Training on MacBooks Over Wi-Fi
Developer @AlexanderCodes_ open-sourced AirTrain, a tool that enables distributed ML training across Apple Silicon MacBooks using Wi-Fi by syncing gradients every 500 steps instead of every step. This makes personal device training feasible for models up to 70B parameters without cloud GPU costs.
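The idea of synchronizing every 500 steps rather than every step can be illustrated with a toy simulation. This is a minimal sketch, not AirTrain's actual code: the summary does not describe the implementation, so the sketch assumes one common approach (often called local SGD) in which each worker trains independently and the workers average their weights at each sync point. The function name, the toy task (fitting y = 3x), and all parameters are hypothetical.

```python
import random

def local_sgd(num_workers=4, total_steps=2000, sync_every=500, lr=0.01, seed=0):
    """Fit y = 3x on simulated workers that average weights every `sync_every` steps."""
    rng = random.Random(seed)
    weights = [0.0] * num_workers  # one scalar weight per simulated worker
    sync_rounds = 0
    for step in range(1, total_steps + 1):
        for w in range(num_workers):
            x = rng.uniform(-1.0, 1.0)         # each worker draws its own sample
            error = weights[w] * x - 3.0 * x   # prediction error against y = 3x
            grad = 2.0 * error * x             # gradient of the squared error
            weights[w] -= lr * grad            # purely local update, no network traffic
        if step % sync_every == 0:
            # The only communication step: replace every weight with the average.
            avg = sum(weights) / num_workers
            weights = [avg] * num_workers
            sync_rounds += 1
    return weights[0], sync_rounds

weight, sync_rounds = local_sgd()
print(f"learned weight: {weight:.3f}")
print(f"sync rounds: {sync_rounds} (per-step sync would need 2000)")
```

In this toy setup the workers communicate 4 times instead of 2,000, which is the kind of reduction that would make a slow link such as Wi-Fi tolerable. Real models may lose some accuracy when workers drift apart between syncs.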
DOE Seeks Input on AI Infrastructure for Federal Lands
The U.S. Department of Energy has published a Request for Information (RFI) to solicit input on developing AI and high-performance computing infrastructure on DOE-owned lands. This marks a significant step in the federal government's strategy to directly address the national AI compute shortage.
InCoder-32B-Thinking Hits 81.3% on LiveCodeBench, Trained on Chip & Kernel Traces
InCoder-32B-Thinking, a 32B-parameter model trained on execution traces from chip design, GPU kernels, and embedded systems, scores 81.3% on LiveCodeBench V5 and achieves an 84% compile pass rate on CAD-Coder.
Memory Market Squeeze Threatens iPhone Price Hikes as AI Demands Strain Supply
A global RAM shortage and price increases could force Apple to raise iPhone prices by up to $250, according to industry analysis. The tech giant is reportedly unwilling to absorb the cost, passing it directly to consumers amid surging memory demands from AI applications.
AI Gold Rush Strains Apple Hardware: High-Memory Macs Sell Out as Local AI Agents Go Mainstream
A surge in demand for local AI development has created severe inventory shortages for high-memory Apple hardware. Mac Studio orders with 128GB or 512GB RAM face 6+ week delays as consumers buy up every available unit to run powerful AI agents like OpenClaw.
AI Chip Capacity Crisis: 10GW Left Through 2030, Prices Up Double Digits
The AI accelerator market has only 10 gigawatts of capacity left to contract through 2030, with 100GW already under contract. Prices are rising by double-digit percentages as one competitor has stopped taking orders entirely.
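As a quick check on how tight this is, the sketch below computes the uncontracted share of capacity. It assumes the 100GW and 10GW figures together make up all capacity available through 2030, which the summary implies but does not state outright. `remaining_share` is a hypothetical helper name.

```python
def remaining_share(contracted_gw: float, remaining_gw: float) -> float:
    """Return the fraction of total capacity that is still uncontracted."""
    total_gw = contracted_gw + remaining_gw  # assumes these two figures cover everything
    return remaining_gw / total_gw

share = remaining_share(contracted_gw=100.0, remaining_gw=10.0)
print(f"Uncontracted share of capacity through 2030: {share:.1%}")
```

Under that assumption, roughly one-eleventh of the capacity through 2030 is still available.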
Microsoft's Fairwater AI Data Center Launches Early, Boosts Azure Capacity
Microsoft has launched its Fairwater AI data center ahead of schedule. The facility adds significant high-performance computing capacity to Azure's AI infrastructure, crucial for training and running large models.
Anthropic Hiring Data Center Leasing Principals in Europe & Australia
Anthropic is actively hiring for data center leasing roles in Europe and Australia, revealing a strategic push to build out its own compute infrastructure as it scales its AI models.
Anthropic's Adaptive Thinking: A Compute-Constrained Efficiency Play
Analysis suggests Anthropic's new 'adaptive thinking' feature is a direct response to compute constraints and competitive pressure from OpenAI, aiming to optimize token usage for enterprise clients at the potential cost of consumer experience.
Satellite Data Shows 40% of 2026 AI Data Centers at Risk of Delay
Geospatial analytics firm SynMax reports that at least 40% of AI data centers scheduled for 2026 completion are at risk of delays exceeding three months, based on satellite imagery analysis of construction progress at sites for OpenAI, Microsoft, and Oracle.
Canada's AI Compute Gap: Google Cloud Montreal Offers 2017-Era Chips
A developer's attempt to rent modern AI compute in Canada revealed a stark infrastructure gap, with major providers offering chips dating back to 2017, undermining national AI ambitions.
Compute Constraints Create Double Bind for AI Growth: Ethan Mollick
Ethan Mollick highlights a critical industry bottleneck: compute scarcity forces a trade-off between raising prices/rationing current models and limiting future model training, creating a growth double bind.
AI Models Dumber as Compute Shifts to Enterprise, Users Report
Users report noticeable performance degradation in major AI models this month. Analysts suggest providers are shifting computational resources to prioritize enterprise clients over general subscribers.
Houthi Threat to Bab el-Mandeb Strains AI Chip Supply Chain
Escalating Middle East conflict threatens two key maritime chokepoints, Bab el-Mandeb and Hormuz, jeopardizing the helium and energy supplies that underpin global advanced AI chip manufacturing at TSMC and SK Hynix.
VMLOps Publishes 2026 AI Engineer Roadmap for Software Engineers
VMLOps published a comprehensive 2026 roadmap detailing the skills and knowledge software engineers need to transition into AI engineering. The guide reflects the current industry demand for engineers who can build and deploy production AI systems.
Altimeter's Gerstner: AI Economics Shift to Owned Compute for Fixed Costs
Altimeter Capital's Brad Gerstner states that the fundamental economics of AI have flipped: companies that own their compute infrastructure lock in fixed costs while AI-driven revenue scales, creating a powerful advantage.
Anthropic Considers Custom AI Chips, Following Google & OpenAI
Anthropic is reportedly considering developing custom AI chips, a strategic move to gain control over its compute infrastructure and reduce costs. This follows similar initiatives by Google, Amazon, and OpenAI.
Samsung Projects Record $14.6B Q1 Profit on 300% DRAM Price Surge
Samsung Electronics expects a record Q1 operating profit of 20 trillion won (~$14.6B), nearly triple YoY, fueled by soaring AI-driven demand and a 300% price increase for DRAM chips.
Anthropic Secures Multi-Gigawatt Google TPU Deal for Frontier Claude Models
Anthropic announced a multi-gigawatt agreement with Google and Broadcom for next-generation TPU capacity, coming online in 2027, to train and serve frontier Claude models.
TSMC 2nm Capacity Constraints Create Opening for Samsung in AI Chip Foundry Race
TSMC has reportedly hit a 'hard capacity wall' at its 2nm node, creating a strategic opportunity for Samsung Foundry to capture AI accelerator business from major clients like Nvidia and OpenAI. This bottleneck could reshape the competitive landscape for advanced semiconductor manufacturing.
Elon Musk Predicts 'Vast Majority' of AI Compute Will Be for Real-Time Video
Elon Musk states that real-time video consumption and generation will account for most AI compute, highlighting a shift from text to video as the primary medium for AI processing.
Data Center Construction Boom Drives Electrician Salaries to $260k, Fueled by AI Infrastructure Demand
Mike Rowe reports data center electricians earning $260,000/year without degrees as 25.3 GW of capacity is under construction in the Americas, with 89% pre-committed. The AI infrastructure buildout is creating a high-wage, skilled trades bottleneck.