gentic.news — AI News Intelligence Platform

rocm

14 articles about rocm in AI news

AMD ROCm Performance Jumps 75x in 14 Days Post-DeepSeek v4

AMD's ROCm stack improved 75x in the 14 days after DeepSeek v4, via fused operations. It still needs a further 5x to match B200 performance.

100% relevant

Intel Targets Nvidia, AMD with New AI Chip Launch by End 2026

Intel plans to launch a new AI data center chip by end of 2026, targeting Nvidia and AMD in the AI infrastructure market.

72% relevant

AMD's Lemonade v10.8 Adds MCP Support, Letting Claude Desktop and Cursor Route Tasks to Local AMD GPUs

AMD-backed Lemonade v10.8, released June 17, now exposes a Model Context Protocol server, letting Claude Desktop, Cursor, and GitHub Copilot route inference tasks to local AMD Ryzen AI NPUs, Radeon GPUs, or plain CPUs — no cloud API required. The update also adds Moonshine speech-to-text, expanded R

70% relevant

OpenAI DeploymentSim predicts GPT-5 errors 92% of the time pre-launch

OpenAI's Deployment Simulation predicted GPT-5 errors with 92% accuracy using 1.3M real conversations, outperforming standard safety tests.

90% relevant

TensorWave Raises $350M Series B for AMD-Powered GPU Clusters

TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.

78% relevant

vLLM Optimizations Cut Voice AI Latency by 40% on 6-GPU Cluster

vLLM optimizations on a 6-GPU cluster reduced voice AI latency by 40% for a Qwen-based system, enabling 500 concurrent sessions per node without hardware upgrades.

82% relevant

AMD Gives OSS Maintainers $3.6M MI355X Cluster Access

AMD is giving vLLM and SGLang maintainers $3.6M of MI355X cluster access, ending NVIDIA's monopoly on OSS inference hardware access.

75% relevant

AMD Launches PCIe GPU for AI Workloads, Targets Existing Server Install Base

AMD launched a PCIe-based GPU for AI workloads, targeting existing servers. The card provides an immediate boost without new data center buildouts.

90% relevant

AMD MI350P PCIe Card Claims 40% FP8 Lead Over Nvidia H200 NVL

AMD launched the MI350P PCIe AI card with 144GB HBM3E, claiming a 39% FP8 lead over Nvidia's H200 NVL. It targets drop-in, air-cooled server upgrades.

98% relevant

AMD Backs UALink Open Interconnect to Challenge NVIDIA NVLink in AI

AMD is supporting the newly formed UALink Consortium, which aims to create an open standard for connecting AI accelerators. This move challenges NVIDIA's control over the critical NVLink technology that underpins its AI data center systems.

84% relevant

Hugging Face Launches 'Kernels' Hub for GPU Code, Like GitHub for AI Hardware

Hugging Face has launched 'Kernels,' a new section on its Hub for sharing and discovering optimized GPU kernels. This treats performance-critical code as a first-class artifact, similar to AI models.

85% relevant

Developer Ranks NPU Model Compilation Ease: Apple 1st, AMD Last

Developer @mweinbach ranked the ease of using AI coding agents to compile ML models for NPUs. Apple's ecosystem was rated easiest, while AMD's tooling was ranked most difficult.

75% relevant

Ollama Now Supports Apple MLX Backend for Local LLM Inference on macOS

Ollama, the popular framework for running large language models locally, has added support for Apple's MLX framework as a backend. This enables more efficient execution of models like Llama 3.2 and Mistral on Apple Silicon Macs.

85% relevant

98× Faster LLM Routing Without a Dedicated GPU: Technical Breakthrough for vLLM Semantic Router

New research presents a three-stage optimization pipeline for the vLLM Semantic Router, achieving 98× speedup and enabling long-context classification on shared GPUs. This solves critical memory and latency bottlenecks for system-level LLM routing.

80% relevant