Skip to content
gentic.news — AI News Intelligence Platform
Connecting to the Living Graph…

cloud gpu

30 articles about cloud gpu in AI news

Gur Singh Claims 7 M4 MacBooks Match A100, Calls Cloud GPU Training a 'Scam'

Developer Gur Singh posted that seven M4 MacBooks (2.9 TFLOPS each) match an NVIDIA A100's performance, calling cloud GPU training a 'scam' and advocating for distributed, consumer-hardware approaches.

77% relevant

Cloud GPU vs. Colocation: H100 Costs $8k/Month on Google Cloud vs. $1k Colo

A technical founder highlights the stark economics: renting one H100 on Google Cloud costs ~$8,000/month, while the retail hardware is ~$30,000. At that rate, 4 months of cloud rental equals the cost of outright ownership, making colocation at ~$1k/month a compelling alternative for sustained AI workloads.

85% relevant

VS Code Now Connects Directly to Google Colab With Free T4 GPU

Google Colab integrates with VS Code, offering a free T4 GPU inside the editor, bypassing cloud GPU providers.

91% relevant

AirTrain Enables Distributed ML Training on MacBooks Over Wi-Fi

Developer @AlexanderCodes_ open-sourced AirTrain, a tool that enables distributed ML training across Apple Silicon MacBooks using Wi-Fi by syncing gradients every 500 steps instead of every step. This makes personal device training feasible for models up to 70B parameters without cloud GPU costs.

95% relevant

Cerebras Hits 981 Tokens/sec on 1T-Parameter Kimi K2.6, Claims 6.7× GPU Cloud Speedup

Cerebras reported 981 tokens/sec on the 1T-parameter Kimi K2.6 model, a 6.7× speedup over the next GPU cloud, validated by an independent third party.

93% relevant

IOWN Forum Pushes All-Photonic WAN for AI Neocloud Interconnects

The IOWN Global Forum is focusing its optical networking tech on datacenter interconnects, aiming to let GPU 'neoclouds' and financial firms use cheaper, remote facilities without latency penalties for AI workloads.

78% relevant

Mac Studio AI Hardware Shortage Signals Shift to Cloud Rentals

Developers report a global shortage of high-memory Apple Silicon Macs, with 128GB Mac Studios unavailable worldwide. This pushes practitioners toward renting cloud H100 GPUs at ~$3/hr, marking a shift from the recent local AI trend.

85% relevant

A Practical Guide to Fine-Tuning an LLM on RunPod H100 GPUs with QLoRA

The source is a technical tutorial on using QLoRA for parameter-efficient fine-tuning of an LLM, leveraging RunPod's cloud H100 GPUs. It focuses on the practical setup and execution steps for engineers.

76% relevant

Fine-Tuning an LLM on a 4GB GPU: A Practical Guide for Resource-Constrained Engineers

A Medium article provides a practical, constraint-driven guide for fine-tuning LLMs on a 4GB GPU, covering model selection, quantization, and parameter-efficient methods. This makes bespoke AI model development more accessible without high-end cloud infrastructure.

100% relevant

Open-source AI system running on $500 GPU reportedly outperforms Claude Sonnet

An open-source AI system running on consumer-grade $500 GPU hardware claims to outperform Anthropic's Claude Sonnet model while costing only $0.004 per task, eliminating cloud dependencies and API costs.

85% relevant

IREN Buys Nostrum Group, Adds 490MW Spanish Grid Power for AI Cloud

IREN acquired Nostrum Group, adding 490MW grid power in Spain for AI cloud expansion.

70% relevant

TensorWave Raises $350M Series B for AMD-Powered GPU Clusters

TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.

78% relevant

Apple Ditches Apple Silicon Pledge, Routes AI Queries to Google Cloud

Apple routes AI queries to Google Cloud, breaking 2024 Apple silicon pledge. Distilled Gemini runs locally; heavier queries use Nvidia tech in Google Cloud.

94% relevant

Nvidia Networking Revenue Hits $14.8B, Up 199% as AI Spending Shifts Beyond GPUs

Nvidia's Q1 FY2027 networking revenue surged 199% to $14.8B, signaling AI infrastructure spending is moving beyond GPUs into full-system networking. New reporting splits into Hyperscale and ACIE segments reflect a broadening customer base beyond hyperscalers.

100% relevant

vLLM Optimizations Cut Voice AI Latency by 40% on 6-GPU Cluster

vLLM optimizations on a 6-GPU cluster reduced voice AI latency by 40% for a Qwen-based system, enabling 500 concurrent sessions per node without hardware upgrades.

82% relevant

CoreWeave, Nebius Earnings Show AI Race Shifts From GPUs to Power

CoreWeave and Nebius Q1 earnings show AI infrastructure race shifting from GPU supply to power and scale, with combined capex guidance exceeding $55B.

90% relevant

MLX CUDA Backend Passes All Tests, Closing Apple GPU Gap

MLX CUDA backend passes all tests, enabling NVIDIA GPU support. Milestone bridges Apple Silicon and CUDA ecosystems for ML workloads.

77% relevant

NHN Deploys 7,656-GPU AI Cluster in Seoul

NHN launched a 7,656-GPU cluster in Seoul, South Korea, for domestic enterprise AI workloads. The cluster targets inference and training, competing with Naver and Kakao.

90% relevant

AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base

AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides immediate boost without new data center buildouts.

90% relevant

Anthropic's 220K GPU Cluster: $5B Compute Bet Revealed

Anthropic reportedly has 220K NVIDIA GPUs and 310MW, implying a >$5B compute cluster, 3x OpenAI's largest.

100% relevant

Nscale to Deploy 66K+ Rubin GPUs for Microsoft in Portugal

Nscale will deploy 66,000+ NVIDIA Rubin GPUs for Microsoft at Portugal's Start Campus. The deal is a first for Rubin and signals Microsoft's geographic diversification.

80% relevant

Open-Weight 1T Model Inference Margins Hit 88% on Rented GPUs

Renting a 128 GPU cluster to serve a 1T open model yields ~88% margin on tokens sold at $0.002/1K, exposing a structural arbitrage over proprietary APIs.

85% relevant

Nebius Claims First NVIDIA GB300 Exemplar Cloud for Training

Nebius becomes first cloud provider validated as NVIDIA Exemplar Cloud on GB300 for training, targeting hyperscale AI workloads.

94% relevant

Moore Threads Q1 Revenue Up, Building 100K-GPU AI Cluster

Moore Threads reports Q1 2026 revenue growth and confirms progress building a 100,000-GPU cluster for AI training, signaling growing domestic AI infrastructure in China despite US export controls.

74% relevant

Oracle Nabs $16B for Michigan AI Data Center, Rivaling Google Cloud

Oracle has secured $16 billion in funding for a massive AI data center in rural Michigan, a move that pits it directly against Google Cloud and other hyperscalers in the race to build AI infrastructure.

76% relevant

SemiAnalysis: NVIDIA's Customer Data Drives Disaggregated Inference, LPU Surpasses GPU

SemiAnalysis states NVIDIA's direct customer feedback is leading the industry toward disaggregated inference architectures. In this model, specialized LPUs can outperform GPUs for specific pipeline tasks.

85% relevant

DARPA Leases 50 Nvidia H100 GPUs for Biological AI Program

DARPA's Biological Technologies Office is procuring 50 Nvidia HGX H100 GPU systems for its NODES program, with hardware delivery required within one month. This represents a significant government investment in AI infrastructure for biological research applications.

86% relevant

NVIDIA, Google Cloud Expand AI Partnership for Agentic & Physical AI

NVIDIA and Google Cloud announced an expanded partnership to advance agentic and physical AI, focusing on new infrastructure and software integrations. This builds on their existing collaboration to provide optimized AI training and inference platforms.

100% relevant

Cisco Reveals Scale-Across GPU Networking Needs 14x DCI Bandwidth

Cisco's chief architect detailed the massive bandwidth requirements for connecting AI clusters via 'scale-across' GPU networking, which needs 14x the capacity of traditional data center interconnects. This shift is creating a multi-billion dollar market for 800G coherent pluggables and deep-buffered switches.

85% relevant

AI Compute Crisis: GPU Prices Up 48%, Anthropic API at 98.95% Uptime

The AI industry faces a severe compute capacity crisis, with GPU prices up 48%, Anthropic API uptime falling to 98.95%, and OpenAI shutting down Sora to reallocate resources. Demand for agentic AI is outstripping supply, forcing rationing and product cancellations.

100% relevant