gpu clusters
30 articles about gpu clusters in AI news
NHN Cloud Tops Korean TOP500 with FactoryX GPU Clusters
NHN Cloud tops Korean TOP500 with FactoryX GPU clusters delivering 1.2 exaflops, marking first domestic cloud provider to lead the list.
TensorWave Raises $350M Series B for AMD-Powered GPU Clusters
TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.
Detecting AI Images: Metadata Exposes Generators, No GPU Needed
AI image detection via metadata analysis exposes generators like Google's Gemini and Meta's Llama without GPU clusters, highlighting a simple but effective method.
Upscale AI Raises $190M for AI Networking Infrastructure
Upscale AI raised $190M to expand AI networking infrastructure, addressing the bottleneck of 100K+ GPU clusters.
Google and Blackstone Launch TPU Venture, Challenging Nvidia Dominance
Google and Blackstone launched a TPU venture, financing AI infrastructure outside the hyperscale cloud model. Enterprise buyers get a standalone alternative to Nvidia-dominated GPU clusters.
Cerebras' Strategic Partnership Yields Breakthrough AI Training Results
Cerebras Systems' partnership with Abu Dhabi's G42 has produced remarkable AI training benchmarks, achieving results 100x faster than traditional GPU clusters. The collaboration demonstrates the viability of wafer-scale computing for large language model development.
Cisco Reveals Scale-Across GPU Networking Needs 14x DCI Bandwidth
Cisco's chief architect detailed the massive bandwidth requirements for connecting AI clusters via 'scale-across' GPU networking, which needs 14x the capacity of traditional data center interconnects. This shift is creating a multi-billion dollar market for 800G coherent pluggables and deep-buffered switches.
Colossus 2: xAI's Memphis Cluster Hits 300,000 GPUs
xAI's Colossus 2 hits 300,000 GPUs, targeting 1M by year-end. Training Grok-3, the $6B cluster challenges OpenAI and Google.
Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup
Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next GPU cloud, validated by an independent third party.
Nvidia Networking Revenue Hits $14.8B, Up 199% as AI Spending Shifts Beyond GPUs
Nvidia's Q1 FY2027 networking revenue surged 199% to $14.8B, signaling AI infrastructure spending is moving beyond GPUs into full-system networking. New reporting splits into Hyperscale and ACIE segments reflect a broadening customer base beyond hyperscalers.
vLLM Optimizations Cut Voice AI Latency by 40% on 6-GPU Cluster
vLLM optimizations on a 6-GPU cluster reduced voice AI latency by 40% for a Qwen-based system, enabling 500 concurrent sessions per node without hardware upgrades.
CoreWeave, Nebius Earnings Show AI Race Shifts From GPUs to Power
CoreWeave and Nebius Q1 earnings show AI infrastructure race shifting from GPU supply to power and scale, with combined capex guidance exceeding $55B.
Cerebras IPO Challenges GPU Scaling Orthodoxy
Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.
MLX CUDA Backend Passes All Tests, Closing Apple GPU Gap
MLX CUDA backend passes all tests, enabling NVIDIA GPU support. Milestone bridges Apple Silicon and CUDA ecosystems for ML workloads.
NHN Deploys 7,656-GPU AI Cluster in Seoul
NHN launched a 7,656-GPU cluster in Seoul, South Korea, for domestic enterprise AI workloads. The cluster targets inference and training, competing with Naver and Kakao.
AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base
AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides immediate boost without new data center buildouts.
Kunluncore Files STAR Market IPO, Claims 32K GPU Cluster First
Kunluncore filed for a STAR Market IPO, claiming a 32K GPU cluster first, testing investor appetite for domestic AI chips.
Anthropic's 220K GPU Cluster: $5B Compute Bet Revealed
Anthropic reportedly has 220K NVIDIA GPUs and 310MW, implying a >$5B compute cluster, 3x OpenAI's largest.
OpenAI's MRC Protocol Sprays Packets Across 100+ Paths to Fix GPU Stragglers
OpenAI open-sourced MRC, a networking protocol that sprays packets across hundreds of paths to reduce GPU idle time from congestion and failures, contributed to OCP.
NVIDIA Open-Sources MRC, the RDMA Protocol Powering OpenAI's Blackwell Clusters
NVIDIA open-sourced MRC, a multi-path RDMA protocol used by OpenAI on Blackwell clusters, enabling microsecond rerouting across 64 paths.
NVIDIA Feynman GPU Power Semi Content Hits $191K, 17× Blackwell
NVIDIA Feynman GPUs require $191K in power semiconductors per system, 17× Blackwell, driven by 800V DC architecture shift.
Open-Weight 1T Model Inference Margins Hit 88% on Rented GPUs
Renting a 128 GPU cluster to serve a 1T open model yields ~88% margin on tokens sold at $0.002/1K, exposing a structural arbitrage over proprietary APIs.
Moore Threads Q1 Revenue Up, Building 100K-GPU AI Cluster
Moore Threads reports Q1 2026 revenue growth and confirms progress building a 100,000-GPU cluster for AI training, signaling growing domestic AI infrastructure in China despite US export controls.
SemiAnalysis: NVIDIA's Customer Data Drives Disaggregated Inference, LPU Surpasses GPU
SemiAnalysis states NVIDIA's direct customer feedback is leading the industry toward disaggregated inference architectures. In this model, specialized LPUs can outperform GPUs for specific pipeline tasks.
DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program
DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.
Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'
Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.
A Practical Guide to Fine-Tuning an LLM on RunPod H100 GPUs with QLoRA
The source is a technical tutorial on using QLoRA for parameter-efficient fine-tuning of an LLM, leveraging RunPod's cloud H100 GPUs. It focuses on the practical setup and execution steps for engineers.
Intel, SambaNova Blueprint Pairs GPUs for AI Prefill, RDUs for Decoding
Intel and SambaNova Systems have outlined a new inference architecture for agentic AI workloads. It splits tasks between GPUs for 'prefill' and SambaNova's Reconfigurable Dataflow Units (RDUs) for high-throughput token generation.
Google's 5M H100-Equivalent GPU Fleet Powers Anthropic's AI Expansion
An analyst estimates Google's compute capacity at ~5 million Nvidia H100-equivalent GPUs, providing the infrastructure backbone for Anthropic's model deployment and growth. This highlights the strategic shift where foundational AI labs rely on hyperscaler scale for distribution.
Fine-Tuning an LLM on a 4GB GPU: A Practical Guide for Resource-Constrained Engineers
A Medium article provides a practical, constraint-driven guide for fine-tuning LLMs on a 4GB GPU, covering model selection, quantization, and parameter-efficient methods. This makes bespoke AI model development more accessible without high-end cloud infrastructure.