gentic.news — AI News Intelligence Platform

cuda

30 articles about cuda in AI news

NanoEuler: GPT-2-Scale 116M Model Built in Pure C/CUDA From Scratch

NanoEuler is a 116M-parameter GPT-2-scale model built in pure C/CUDA from scratch. It provides a complete educational training pipeline for understanding LLMs at the lowest level.

75% relevant

MLX CUDA Backend Passes All Tests, Closing Apple GPU Gap

MLX CUDA backend passes all tests, enabling NVIDIA GPU support. Milestone bridges Apple Silicon and CUDA ecosystems for ML workloads.

77% relevant

OpenAI Codex Now Translates C++, CUDA, and Python to Swift and Python for CoreML Model Conversion

OpenAI's Codex AI code generator is now being used to automatically rewrite C++, CUDA, and Python code into Swift and Python specifically for CoreML model conversion, a previously manual and error-prone process for Apple ecosystem deployment.

89% relevant

ByteDance's CUDA Agent: The AI System Outperforming Human Experts in GPU Code Generation

ByteDance has unveiled CUDA Agent, a large-scale reinforcement learning system that generates high-performance CUDA kernels. The system achieves state-of-the-art results, outperforming torch.compile by up to 100% and beating leading AI models like Claude Opus 4.5 and Gemini 3 Pro by approximately 40% on the most challenging tasks.

95% relevant

WSL 3 Preview: Cut Claude Code's Local Inference Latency on Windows

WSL 3 preview delivers near-native GPU/NPU for Claude Code + Ollama on Copilot+ laptops, but WSL 2 still handles NVIDIA CUDA fine for desktop users.

98% relevant

LlamaFactory Enables No-Code Fine-Tuning for 100+ LLMs Including Llama 4, Qwen, and DeepSeek

The LlamaFactory project replaces much of the traditional fine-tuning setup with a point-and-click interface that supports over 100 models. This reduces setup from hours of boilerplate code and CUDA debugging to a visual workflow.

87% relevant

Nvidia's Open-Source Gambit: NeMoClaw Aims to Tame Enterprise AI Agents

Nvidia is preparing to launch NeMoClaw, an open-source platform designed for building secure, autonomous AI agents for enterprise workflows. Breaking from its proprietary CUDA tradition, the move targets software ecosystem dominance regardless of hardware.

97% relevant

Jim Keller: Tenstorrent IPO Looms as BlackHole Chip Scales

Jim Keller confirmed Tenstorrent's IPO plans as BlackHole chip scales for AI inference, competing with Nvidia. No revenue disclosed.

98% relevant

OpenAI-Broadcom Chip Hints at Token Price Collapse

OpenAI and Broadcom are co-developing a custom AI inference chip that could cut token prices by an order of magnitude, per @mweinbach. The chip targets inference workloads, not training, and aims to reduce dependency on Nvidia.

75% relevant

NVIDIA Vera Rubin: One Rack Matches TOP500, 35 EU Labs Deploy

NVIDIA's Vera Rubin NVL72 delivers TOP500-class performance in a single rack, with 35 European labs deploying the system for AI and HPC.

95% relevant

How Simon Willison Ported a 0.2B Image Model to the Browser with Claude

Simon Willison used Claude Code to port a 0.2B image inpainting model to WebGPU, running it as a parallel side project while his main agent worked on Datasette. The technique? Research with Claude.ai, then hand off to Claude Code with research.md.

70% relevant

Qualcomm in Talks to Acquire Modular for $4B, Landing Lattner

Qualcomm nears $4B acquisition of Modular, Chris Lattner's AI infra startup. Deal targets inference software for edge and data center AI chips.

82% relevant

NVFP4 GEMM on RTX Pro Blackwell: SM12x Breaks from B200 Programming Model

NVIDIA's SM12x architecture drops tcgen05.mma for mma.sync, breaking B200 kernel compatibility. SM8x kernels port easily; developers must maintain separate codebases.

86% relevant

Intel Targets Nvidia, AMD with New AI Chip Launch by End 2026

Intel plans to launch a new AI data center chip by end of 2026, targeting Nvidia and AMD in the AI infrastructure market.

72% relevant

AWS Beats Cloud Rivals to NVIDIA Blackwell with EC2 G7 — 4.6x AI Inference Gain Over G6

AWS launched EC2 G7 instances on June 19, 2026, becoming the first major cloud to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs. The instances claim 4.6x AI inference performance over G6, backed by 700 Gbps EFA networking and 32 GB GDDR7 per GPU. The move arrives the same week AWS confirme

85% relevant

NVIDIA, GENCI Launch AI Factory France Compute Access for Startups

NVIDIA and GENCI launched AI Factory France at VivaTech, giving European startups free access to AI supercomputers. The program includes compute, tools, and expert support for NVIDIA Inception members.

90% relevant

Tensordyne Claims 10x Efficiency Gain with Napier Architecture

Tensordyne claims a 10x inference efficiency gain over Nvidia with its Napier architecture, but has published no supporting data or independent verification.

85% relevant

AMD's Lemonade v10.8 Adds MCP Support, Letting Claude Desktop and Cursor Route Tasks to Local AMD GPUs

AMD-backed Lemonade v10.8, released June 17, now exposes a Model Context Protocol server, letting Claude Desktop, Cursor, and GitHub Copilot route inference tasks to local AMD Ryzen AI NPUs, Radeon GPUs, or plain CPUs — no cloud API required. The update also adds Moonshine speech-to-text, expanded R

70% relevant

Qualcomm Launches AI Data Center Program With Hyperscaler Customer

Qualcomm launched an AI data center program with a major hyperscaler customer, targeting inference workloads. Financial terms and partner identity undisclosed.

85% relevant

Intel Omni-Path Resurfaces as InfiniBand Rival for DoE Supercomputers

Intel's Omni-Path interconnect, revived by Cornelis Networks, will connect DoE supercomputers at 400Gbps as an InfiniBand alternative.

90% relevant

Cerebras Claims Performance Parity With Nvidia H100 on AI Training

Cerebras claims wafer-scale chips match Nvidia H100 on AI training performance per watt, challenging Nvidia's dominance.

92% relevant

NVIDIA Blackwell Ultra Leads First Agentic AI Benchmark, 20x Agents/MW vs Hopper

NVIDIA Blackwell Ultra NVL72 leads the first AgentPerf benchmark for agentic AI, delivering 20x more agents per megawatt than Hopper.

92% relevant

TensorWave Raises $350M Series B for AMD-Powered GPU Clusters

TensorWave raised $350M Series B for AMD-powered GPU clusters in North America, challenging Nvidia's dominance.

78% relevant

Nvidia Buys Kumo AI for $400M to Predict from Business Data

Nvidia acquired Kumo AI for $400M+ to bring foundation model predictions to enterprise relational data, filling a gap left by LLMs.

88% relevant

Foxconn and Intel Partner on AI Data Center Rack Systems

Foxconn and Intel partner on AI rack systems, integrating Intel components into Foxconn manufacturing for hyperscale customers. No financial terms disclosed.

90% relevant

Nvidia, Unitree, Sharpa unveil H2+ humanoid robot reference design

Nvidia, Unitree, and Sharpa released H2+, a humanoid robot reference design, at Computex 2026 to standardize physical AI development workflows.

90% relevant

Nvidia Unveils New Windows SoC, Targeting AI PCs

Nvidia announced a Windows SoC for AI PCs, per @mweinbach. Chip targets on-device inference, competing with Qualcomm and Intel.

100% relevant

NVIDIA Nemotron 3 Ultra: 550B Open-Weight Model Challenges GLM, Kimi

NVIDIA released Nemotron 3 Ultra, a 550B open-weight model claiming near-SOTA performance, competing with GLM-5.1 and Kimi K2.6. No benchmarks yet.

87% relevant

xAI Drops JAX, Builds Custom C Training Framework After <10% MFU

xAI dropped JAX for GPU training after <10% MFU, building a custom C framework with Grok Build. NVIDIA's JAX team loses its biggest customer.

91% relevant
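
For readers unfamiliar with the metric in the item above: MFU (model FLOPs utilization) is commonly defined as the FLOPs a training run actually achieves divided by the hardware's theoretical peak. The sketch below uses the widely cited approximation of about 6 FLOPs per parameter per training token. The function name and every number in it are hypothetical and chosen only for illustration; none of them come from the xAI report.

```python
def model_flops_utilization(n_params, tokens_per_second, peak_flops_per_second):
    """Estimate MFU as achieved training FLOP/s over peak hardware FLOP/s.

    Assumes the common rule of thumb of ~6 FLOPs per parameter per token
    (forward plus backward pass) for dense transformer training.
    """
    achieved_flops_per_second = 6.0 * n_params * tokens_per_second
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical values, for illustration only (not from the article).
n_params = 1e9               # 1B-parameter model
tokens_per_second = 10_000   # measured training throughput
peak_flops_per_second = 1e15 # aggregate peak of the accelerators in use

mfu = model_flops_utilization(n_params, tokens_per_second, peak_flops_per_second)
print(f"MFU: {mfu:.1%}")
```

With these made-up inputs the run would sit below the 10% mark the headline refers to, meaning more than 90% of the hardware's peak arithmetic throughput goes unused.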

Jensen Huang Wants Zero Coding at NVIDIA — 'Purpose vs Task'

Jensen Huang wants zero coding by NVIDIA engineers, framing it as a task to minimize. The bet is AI-generated code will match human output for performance-critical software.

77% relevant