gpu clusters

30 articles about gpu clusters in AI news

NHN Cloud Tops Korean TOP500 with FactoryX GPU Clusters

NHN Cloud tops Korean TOP500 with FactoryX GPU clusters delivering 1.2 exaflops, marking first domestic cloud provider to lead the list.

Jun 26, 202689% relevant

TensorWave Raises $350M Series B for AMD-Powered GPU Clusters

TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.

Jun 11, 202678% relevant

Detecting AI Images: Metadata Exposes Generators, No GPU Needed

AI image detection via metadata analysis exposes generators like Google's Gemini and Meta's Llama without GPU clusters, highlighting a simple but effective method.

May 10, 202675% relevant

Upscale AI Raises $190M for AI Networking Infrastructure

Upscale AI raised $190M to expand AI networking infrastructure, addressing the bottleneck of 100K+ GPU clusters.

Jun 22, 202695% relevant

Google and Blackstone Launch TPU Venture, Challenging Nvidia Dominance

Google and Blackstone launched a TPU venture, financing AI infrastructure outside the hyperscale cloud model. Enterprise buyers get a standalone alternative to Nvidia-dominated GPU clusters.

May 21, 202685% relevant

Cerebras' Strategic Partnership Yields Breakthrough AI Training Results

Cerebras Systems' partnership with Abu Dhabi's G42 has produced remarkable AI training benchmarks, achieving results 100x faster than traditional GPU clusters. The collaboration demonstrates the viability of wafer-scale computing for large language model development.

Feb 20, 202685% relevant

Cisco Reveals Scale-Across GPU Networking Needs 14x DCI Bandwidth

Cisco's chief architect detailed the massive bandwidth requirements for connecting AI clusters via 'scale-across' GPU networking, which needs 14x the capacity of traditional data center interconnects. This shift is creating a multi-billion dollar market for 800G coherent pluggables and deep-buffered switches.

Apr 21, 202685% relevant

Colossus 2: xAI's Memphis Cluster Hits 300,000 GPUs

xAI's Colossus 2 hits 300,000 GPUs, targeting 1M by year-end. Training Grok-3, the $6B cluster challenges OpenAI and Google.

Jun 24, 202698% relevant

Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup

Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next GPU cloud, validated by an independent third party.

May 23, 202693% relevant

Nvidia Networking Revenue Hits $14.8B, Up 199% as AI Spending Shifts Beyond GPUs

Nvidia's Q1 FY2027 networking revenue surged 199% to $14.8B, signaling AI infrastructure spending is moving beyond GPUs into full-system networking. New reporting splits into Hyperscale and ACIE segments reflect a broadening customer base beyond hyperscalers.

May 21, 2026100% relevant

vLLM Optimizations Cut Voice AI Latency by 40% on 6-GPU Cluster

vLLM optimizations on a 6-GPU cluster reduced voice AI latency by 40% for a Qwen-based system, enabling 500 concurrent sessions per node without hardware upgrades.

May 16, 202682% relevant

CoreWeave, Nebius Earnings Show AI Race Shifts From GPUs to Power

CoreWeave and Nebius Q1 earnings show AI infrastructure race shifting from GPU supply to power and scale, with combined capex guidance exceeding $55B.

May 15, 202690% relevant

Cerebras IPO Challenges GPU Scaling Orthodoxy

Cerebras filed for IPO on April 21, betting wafer-scale chips can disrupt Nvidia's GPU cluster model for AI workloads.

May 14, 202698% relevant

MLX CUDA Backend Passes All Tests, Closing Apple GPU Gap

MLX CUDA backend passes all tests, enabling NVIDIA GPU support. Milestone bridges Apple Silicon and CUDA ecosystems for ML workloads.

May 13, 202677% relevant

NHN Deploys 7,656-GPU AI Cluster in Seoul

NHN launched a 7,656-GPU cluster in Seoul, South Korea, for domestic enterprise AI workloads. The cluster targets inference and training, competing with Naver and Kakao.

May 13, 202690% relevant

AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base

AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides immediate boost without new data center buildouts.

May 8, 202690% relevant

Kunluncore Files STAR Market IPO, Claims 32K GPU Cluster First

Kunluncore filed for a STAR Market IPO, claiming a 32K GPU cluster first, testing investor appetite for domestic AI chips.

May 8, 202685% relevant

Anthropic's 220K GPU Cluster: $5B Compute Bet Revealed

Anthropic reportedly has 220K NVIDIA GPUs and 310MW, implying a >$5B compute cluster, 3x OpenAI's largest.

May 6, 2026100% relevant

OpenAI's MRC Protocol Sprays Packets Across 100+ Paths to Fix GPU Stragglers

OpenAI open-sourced MRC, a networking protocol that sprays packets across hundreds of paths to reduce GPU idle time from congestion and failures, contributed to OCP.

May 6, 202688% relevant

NVIDIA Open-Sources MRC, the RDMA Protocol Powering OpenAI's Blackwell Clusters

NVIDIA open-sourced MRC, a multi-path RDMA protocol used by OpenAI on Blackwell clusters, enabling microsecond rerouting across 64 paths.

May 6, 2026100% relevant

NVIDIA Feynman GPU Power Semi Content Hits $191K, 17× Blackwell

NVIDIA Feynman GPUs require $191K in power semiconductors per system, 17× Blackwell, driven by 800V DC architecture shift.

May 4, 202695% relevant

Open-Weight 1T Model Inference Margins Hit 88% on Rented GPUs

Renting a 128 GPU cluster to serve a 1T open model yields ~88% margin on tokens sold at $0.002/1K, exposing a structural arbitrage over proprietary APIs.

Apr 29, 202685% relevant

Moore Threads Q1 Revenue Up, Building 100K-GPU AI Cluster

Moore Threads reports Q1 2026 revenue growth and confirms progress building a 100,000-GPU cluster for AI training, signaling growing domestic AI infrastructure in China despite US export controls.

Apr 27, 202674% relevant

SemiAnalysis: NVIDIA's Customer Data Drives Disaggregated Inference, LPU Surpasses GPU

SemiAnalysis states NVIDIA's direct customer feedback is leading the industry toward disaggregated inference architectures. In this model, specialized LPUs can outperform GPUs for specific pipeline tasks.

Apr 22, 202685% relevant

DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program

DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.

Apr 22, 202686% relevant

Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'

Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.

Apr 18, 202677% relevant

A Practical Guide to Fine-Tuning an LLM on RunPod H100 GPUs with QLoRA

The source is a technical tutorial on using QLoRA for parameter-efficient fine-tuning of an LLM, leveraging RunPod's cloud H100 GPUs. It focuses on the practical setup and execution steps for engineers.

Apr 11, 202676% relevant

Intel, SambaNova Blueprint Pairs GPUs for AI Prefill, RDUs for Decoding

Intel and SambaNova Systems have outlined a new inference architecture for agentic AI workloads. It splits tasks between GPUs for 'prefill' and SambaNova's Reconfigurable Dataflow Units (RDUs) for high-throughput token generation.

Apr 8, 202685% relevant

Google's 5M H100-Equivalent GPU Fleet Powers Anthropic's AI Expansion

An analyst estimates Google's compute capacity at ~5 million Nvidia H100-equivalent GPUs, providing the infrastructure backbone for Anthropic's model deployment and growth. This highlights the strategic shift where foundational AI labs rely on hyperscaler scale for distribution.

Apr 7, 202685% relevant

Fine-Tuning an LLM on a 4GB GPU: A Practical Guide for Resource-Constrained Engineers

A Medium article provides a practical, constraint-driven guide for fine-tuning LLMs on a 4GB GPU, covering model selection, quantization, and parameter-efficient methods. This makes bespoke AI model development more accessible without high-end cloud infrastructure.

Apr 2, 2026100% relevant

Explore More

AI Agents Large Language Models Claude Code OpenAI RAG MCP Fine-tuning Benchmarks Open Source AI AI Safety