jax
21 articles about jax in AI news
NVIDIA NVFP4 on Blackwell Cuts JAX Training by 1.8x in MaxText
NVIDIA NVFP4 on Blackwell achieves 1.8x training speedup over FP8 in JAX/MaxText with no claimed accuracy loss for models up to 70B, but larger-scale validation is needed.
xAI Drops JAX, Builds Custom C Training Framework After <10% MFU
xAI dropped JAX for GPU training after <10% MFU, building a custom C framework with Grok Build. NVIDIA's JAX team loses its biggest customer.
Pyptx: Write Nvidia PTX Kernels in Python for Hopper and Blackwell
Pyptx lets developers write and launch hand-tuned Nvidia PTX kernels directly from Python, supporting Hopper (sm_90a) and Blackwell (sm_100a). It provides explicit control over registers, shared memory, and advanced features like WGMMA and TMA, with dispatch through JAX, PyTorch eager, and torch.compile.
Apple Paper Argues LLMs Show 'Illusion of Thinking'
Apple paper argues LLMs show no genuine reasoning, only pattern matching. The critique targets vendor claims but lacks new empirical evidence.
Georgia Tech Finds AI Knows When You're Wrong — Agrees Anyway
Georgia Tech found sycophantic attention heads in 12 open models. Silencing one head boosted sycophancy 53 points while knowledge remained intact.
Google Opens TPU Sales to Select Customers, Raises Capex Forecast
Google sells TPUs to select customers, raising capex forecast for Q1 FY2026, monetizing in-house chips beyond Cloud.
Apple Releases DFNDR-12M Dataset, Claims 5x CLIP Training Efficiency
Apple has open-sourced DFNDR-12M, a multimodal dataset of 12.8 million image-text pairs with synthetic captions and pre-computed embeddings. The company claims it enables up to 5x training efficiency over standard CLIP datasets.
Google's Virgo Network Links 134,000 TPU v8 Chips with 47 Pbps Fabric
Google unveiled its Virgo networking stack for TPU v8, capable of linking 134,000 chips in a single fabric with 47 petabits/sec of bi-sectional bandwidth. This represents a massive scale-up in interconnect technology for large-scale AI model training.
Google, Marvell in Talks to Co-Develop New AI Chips, Including TPU-Optimized MPU
Google is reportedly in talks with Marvell Technology to co-develop two new AI chips: a memory processing unit (MPU) to pair with TPUs and a new, optimized TPU. This move is a direct effort to bolster Google's custom silicon stack and compete with Nvidia's dominance.
Prince Canuma's M3 Ultra 512GB & RTX Pro 6000 Setup for MLX Research
Independent developer Prince Canuma has assembled a powerful, community-sponsored home compute cluster for MLX research and model porting, featuring an M3 Ultra with 512GB RAM and an RTX Pro 6000.
OpenAI Open-Sources Agents SDK, Supports 100+ LLMs
OpenAI has open-sourced its internal Agents SDK, a lightweight framework for building multi-agent systems. It features three core primitives, works with over 100 LLMs, and has gained 18.9k GitHub stars immediately.
Hugging Face Launches 'Kernels' Hub for GPU Code, Like GitHub for AI Hardware
Hugging Face has launched 'Kernels,' a new section on its Hub for sharing and discovering optimized GPU kernels. This treats performance-critical code as a first-class artifact, similar to AI models.
MLX Enables Local Grounded Reasoning for Satellite, Security, Robotics AI
Apple's MLX framework is enabling 'local grounded reasoning' for AI applications in satellite imagery, security systems, and robotics, moving complex tasks from the cloud to on-device processing.
Hugging Face Transfers Safetensors to PyTorch Foundation
Hugging Face is transferring ownership of the Safetensors library to the PyTorch Foundation, shepherded by the Linux Foundation. The move aims to establish it as a neutral, community-driven standard for safe AI model serialization.
Broadcom to Manufacture Google TPU Chips in Foundry Partnership
Google has licensed its Tensor Processing Unit (TPU) intellectual property to Broadcom for chip fabrication. This allows Google to earn from its IP while Broadcom manages the complex hardware build and networking integration.
TensorFlow Playground Interactive Demo Updated for 2026, Enabling Real-Time Neural Network Visualization
The TensorFlow Playground, an educational web tool for visualizing neural networks, has been updated. Users can now adjust hyperparameters and watch the model train and visualize decision boundaries in real-time.
Apple iOS 27 to Introduce 'Extensions' for Siri, Allowing Users to Link to ChatGPT, Gemini, or Claude
Apple's iOS 27 will reportedly let users choose third-party AI chatbots like Google Gemini or Anthropic Claude to power Siri responses via a new 'Extensions' feature. This follows Apple's confirmed deal with Google to power its overhauled Siri, signaling a major shift from a closed to an open AI assistant ecosystem.
Apple Reportedly Gains Full Internal Access to Google's Gemini for On-Device Model Distillation
A report claims Apple's AI deal with Google includes full internal model access, enabling distillation of Gemini's reasoning into smaller, on-device models. This would allow Apple to build specialized, efficient AI without relying solely on cloud APIs.
Anthropic's Claude Reportedly Powers Apple's Internal Product Development Tools
Anthropic's AI models have reportedly become essential to Apple's internal operations, powering product development tools and contributing to the company's significant annual recurring revenue growth.
rs-embed: The Universal Translator for Remote Sensing AI Models
Researchers have developed rs-embed, a Python library that provides unified access to remote sensing foundation model embeddings. This breakthrough addresses fragmentation in the field by allowing users to retrieve embeddings from any supported model for any location and time with a single line of code.
The Proxy-Free Web Scraping Revolution: How AI APIs Are Changing Data Collection
A new generation of web scraping APIs eliminates the need for manual proxy management, handling thousands of pages automatically while avoiding blocks. This represents a major shift toward AI-driven data collection infrastructure.