silicon
30 articles about silicon in AI news
CNAS Report: AI Hits Silicon Wall as Chip Supply Trails $700B CapEx
CNAS report warns semiconductor manufacturing cannot keep pace with AI demand as hyperscalers plan $700B+ CapEx in 2026. Silicon replaces power as the near-term constraint.
mlx-vlm v0.5.0 Adds Continuous Batching, Distributed Inference for Apple Silicon
mlx-vlm v0.5.0 adds continuous batching, speculative decoding, and distributed inference for Apple Silicon. The release supports Qwen3.5, Kimi K2.5, Gemma 4 video, and other new models, with contributions from 21 developers.
GUC, Wiwynn Partner on Silicon-to-System AI Infrastructure for Hyperscalers
GUC and Wiwynn partner on silicon-to-system AI infrastructure, integrating SoC design, optical I/O, and liquid cooling for hyperscalers.
Qualcomm Ships Hyperscaler Custom Silicon by December 2026
Qualcomm is developing custom silicon for an unnamed hyperscaler, with shipments expected December 2026, marking its most concrete data-center comeback move.
Qualcomm Builds Dedicated CPU for Agentic AI, Enters Hyperscale Silicon Market
Qualcomm's CEO revealed a dedicated CPU for agentic AI, a custom-silicon deal with a hyperscaler shipping in December 2026, and plans for agentic smartphones. The pivot challenges the GPU-centric consensus in AI infrastructure.
DeepSeek-V4 Ported to MLX for Apple Silicon Inference
A developer has ported DeepSeek-V4 to Apple's MLX framework, allowing the large language model to run on Apple Silicon Macs. Early results show functional inference with room for optimization.
Nvidia's Silicon Photonics Roadmap Targets AI Data Center Bottlenecks
Nvidia is developing its own silicon photonics-based interconnects to address the growing data transfer bottleneck within AI data centers and supercomputers. This move is critical as AI model size and cluster scale continue to grow exponentially.
MLX-Benchmark Suite Launches as First Comprehensive LLM Eval for Apple Silicon
The MLX-Benchmark Suite has been released as the first comprehensive evaluation framework for Large Language Models running on Apple's MLX framework. It provides standardized metrics for models optimized for Apple Silicon hardware.
MLX-VLM Adds Continuous Batching, OpenAI API, and Vision Cache for Apple Silicon
The next release of MLX-VLM will introduce continuous batching, an OpenAI-compatible API, and vision feature caching for multimodal models running locally on Apple Silicon. These optimizations promise up to 228x speedups on cache hits for models like Gemma 4.
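The caching idea behind those speedups is simple: the vision encoder is the expensive part, so its output can be memoized per image. A minimal sketch of that pattern, assuming a generic memoization scheme keyed by a content hash (this is illustrative, not MLX-VLM's actual implementation):

```python
# Illustrative vision-feature cache: encode each image once, key the result
# by a content hash, and reuse it on repeat requests. Generic memoization,
# not MLX-VLM's real API.
import hashlib

class VisionFeatureCache:
    def __init__(self, encoder):
        self.encoder = encoder   # expensive image-bytes -> features function
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def features(self, image_bytes):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self.cache:
            self.hits += 1       # cache hit: the vision encoder is skipped
        else:
            self.misses += 1
            self.cache[key] = self.encoder(image_bytes)
        return self.cache[key]
```

On a cache hit only a hash is computed, which is why repeated images in a chat session can see large end-to-end speedups.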
DFlash Brings Speculative Decoding to Apple Silicon via MLX
DFlash, a new open-source project, implements speculative decoding for large language models on Apple Silicon using the MLX framework, reportedly delivering up to 2.5x speedup on an M5 Max.
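Speculative decoding gets its speedup by letting a small draft model propose several tokens that the large target model then verifies in one pass. A toy sketch of the greedy-decoding variant of that loop (the stand-in "models" and function names are illustrative, not DFlash's API):

```python
# Toy speculative-decoding loop: the draft model proposes k tokens cheaply,
# the target keeps the longest agreeing prefix. With greedy decoding the
# output is identical to running the target alone, just in fewer target calls.

def greedy(model, prefix):
    """One greedy step: model maps a token sequence to the next token."""
    return model(tuple(prefix))

def speculative_decode(target, draft, prompt, n_tokens, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < n_tokens:
        # 1. Draft model proposes k tokens autoregressively.
        proposed, ctx = [], list(out)
        for _ in range(k):
            t = greedy(draft, ctx)
            proposed.append(t)
            ctx.append(t)
        # 2. Target verifies each proposal; accept until the first mismatch.
        for i, t in enumerate(proposed):
            expected = greedy(target, out + proposed[:i])
            if expected != t:
                out.append(expected)  # target's own token replaces the mismatch
                break
            out.append(t)
    return out[len(prompt):][:n_tokens]
```

The reported ~2.5x speedup depends on how often the draft's proposals are accepted; a draft that disagrees constantly degrades to ordinary decoding.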
MLX-LM v0.9.0 Adds Better Batching, Supports Gemma 4 on Apple Silicon
Apple's MLX-LM framework released version 0.9.0 with enhanced server batching and support for Google's Gemma 4 model, improving local LLM inference efficiency on Apple Silicon. This update addresses a key performance bottleneck for developers running models locally on Mac hardware.
Gemma 4 Ported to MLX-Swift, Runs Locally on Apple Silicon
Google's Gemma 4 language model has been ported to the MLX-Swift framework by a community developer, making it available for local inference on Apple Silicon Macs and iOS devices through the LocallyAI app.
Apple Silicon Achieves Near-Lossless LLM Compression at 3.5 Bits-Per-Weight, Claims Independent Tester
Independent AI researcher Matthew Weinbach reports achieving near-lossless compression of large language models on Apple Silicon, storing models at 3.5 bits-per-weight while maintaining within 1-2% quality of bf16 precision.
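To put the bits-per-weight figure in perspective, weight storage scales linearly with precision. A back-of-envelope sketch, using an illustrative 70B-parameter model (the claim above does not specify a model size):

```python
# Memory needed to store N parameters at a given precision.
def model_size_gb(n_params, bits_per_weight):
    """Raw weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

params = 70e9                       # 70B parameters, for illustration
bf16 = model_size_gb(params, 16)    # 140.0 GB at bf16
q35 = model_size_gb(params, 3.5)    # 30.625 GB at 3.5 bits-per-weight
```

At 3.5 bits-per-weight the same model needs under a quarter of the bf16 footprint, which is what makes large models fit in unified memory on high-end Macs.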
mlx-vlm v0.4.2 Adds SAM3, DOTS-MOCR Models and Critical Fixes for Vision-Language Inference on Apple Silicon
mlx-vlm v0.4.2 released with support for Meta's SAM3 segmentation model and DOTS-MOCR document OCR, plus fixes for Qwen3.5, LFM2-VL, and Magistral models. Enables efficient vision-language inference on Apple Silicon via MLX framework.
Qwen3-TTS Added to mlx-tune, Enabling Full Qwen Model Fine-Tuning on Apple Silicon Macs
The mlx-tune library now supports Qwen3-TTS, making the entire Qwen model stack—including the new text-to-speech model—fine-tunable on Apple Silicon Macs. This expands local AI development options for researchers and developers.
Silicon Photonics Breakthrough Enters Mass Production, Paving Way for Next-Generation AI Infrastructure
STMicroelectronics has begun mass production of its PIC100 silicon photonics platform, enabling 800G and 1.6T data rates critical for AI data centers. This breakthrough technology replaces copper with light for faster, more efficient data transmission between AI accelerators.
RunAnywhere's MetalRT Engine Delivers Breakthrough AI Performance on Apple Silicon
RunAnywhere has launched MetalRT, a proprietary GPU inference engine that dramatically accelerates on-device AI workloads on Apple Silicon. Their open-source RCLI tool demonstrates sub-200ms voice AI pipelines, outperforming existing solutions like llama.cpp and Apple's MLX.
Silicon Valley AI Startup Targets Japan's Industrial Robotics Crown
Former Google AI researchers have launched Integral AI in Tokyo, aiming to transform Japan's massive industrial robotics sector with AI models that teach robots through observation and language prompts. The startup is already in talks with Toyota, Sony, and other manufacturing giants.
Silicon Valley Titan Declares AI Race with China a 'Techno-Economic War'
Billionaire venture capitalist Vinod Khosla frames the U.S.-China AI competition as an existential battle for global economic and geopolitical dominance, warning against underestimating its stakes.
The Silicon Shift: How AI Offloading is Redefining Professional Competence
A paradigm shift is underway where professional competence increasingly depends on effectively leveraging AI tools rather than raw cognitive ability. This transformation is collapsing traditional seniority hierarchies and commoditizing intelligence across industries.
Apple's M5 Pro and Max: Fusion Architecture Redefines AI Computing on Silicon
Apple unveils M5 Pro and M5 Max chips with groundbreaking Fusion Architecture, merging two 3nm dies into a single SoC. The chips deliver up to 30% faster CPU performance and over 4x peak GPU compute for AI workloads compared to previous generations.
mlx-audio v0.4.3 Ships 6 New TTS Models, Slimmer Deps
mlx-audio v0.4.3 adds 6 TTS models, server concurrency, and slims dependencies, targeting Apple Silicon developers.
Nvidia Invests $2B in Marvell to Deepen NVLink Fusion Tie-Up
Nvidia invested $2B in Marvell to deepen NVLink Fusion partnership, integrating Marvell custom silicon into AI interconnect fabric.
Google Splits TPU Line: 8t for Training, 8i for Inference
At Cloud Next 2026, Google introduced two new AI chips — TPU 8t for training and TPU 8i for inference — splitting its custom silicon for the first time. OpenAI, Anthropic, and Meta are buying multi-gigawatt TPU capacity, signaling a crack in NVIDIA's 81% market share.
Developer Achieves 395x RTFx on M5 Max with Fastest Parakeet v3 for Apple ANE
Developer @mweinbach has optimized the Parakeet v3 speech recognition model for Apple's Neural Engine, achieving a 395x real-time factor on an M5 Max chip. This represents a significant performance leap for on-device AI inference on Apple Silicon.
Google, Marvell in Talks to Co-Develop New AI Chips, Including TPU-Optimized MPU
Google is reportedly in talks with Marvell Technology to co-develop two new AI chips: a memory processing unit (MPU) to pair with TPUs and a new, optimized TPU. This move is a direct effort to bolster Google's custom silicon stack and compete with Nvidia's dominance.
AirTrain Enables Distributed ML Training on MacBooks Over Wi-Fi
Developer @AlexanderCodes_ open-sourced AirTrain, a tool that enables distributed ML training across Apple Silicon MacBooks using Wi-Fi by syncing gradients every 500 steps instead of every step. This makes personal device training feasible for models up to 70B parameters without cloud GPU costs.
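The periodic-sync idea can be sketched with a toy local-SGD-style loop: each worker trains on its own shard and the copies are averaged every `sync_every` steps, standing in for an all-reduce over Wi-Fi. The function names and the toy quadratic loss are illustrative, not AirTrain's actual API:

```python
# Each worker holds its own weight copy; copies are averaged every
# `sync_every` steps instead of exchanging gradients each step.

def local_grad(w, x):
    """Gradient of the toy loss (w - x)^2 for one sample."""
    return 2.0 * (w - x)

def train_with_periodic_sync(workers_data, steps, lr=0.05, sync_every=500):
    weights = [0.0 for _ in workers_data]
    for step in range(1, steps + 1):
        for i, data in enumerate(workers_data):
            x = data[step % len(data)]
            weights[i] -= lr * local_grad(weights[i], x)
        if step % sync_every == 0:
            avg = sum(weights) / len(weights)  # the cheap, infrequent "sync"
            weights = [avg] * len(weights)
    return weights
```

Syncing 500x less often trades some convergence fidelity for drastically lower network traffic, which is what makes slow links like Wi-Fi viable at all.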
Qwen2.5-7B-Instruct 4-bit DWQ Model Released for Apple MLX
A developer has ported a 4-bit quantized Qwen2.5-7B-Instruct model to Apple's MLX framework. This makes the capable 7B model more efficient to run on Apple Silicon Macs.
Meta Expands Broadcom Partnership for Next-Gen AI Infrastructure
Meta is expanding its partnership with semiconductor giant Broadcom to co-develop its next-generation AI infrastructure. This move signals a continued, long-term commitment to custom silicon for AI training and inference.
Mac Studio AI Hardware Shortage Signals Shift to Cloud Rentals
Developers report a global shortage of high-memory Apple Silicon Macs, with 128GB Mac Studios unavailable worldwide. This pushes practitioners toward renting cloud H100 GPUs at ~$3/hr, marking a shift from the recent local AI trend.