gentic.news — AI News Intelligence Platform

silicon

30 articles about silicon in AI news

CNAS Report: AI Hits Silicon Wall as Chip Supply Trails $700B CapEx

CNAS report warns semiconductor manufacturing cannot keep pace with AI demand as hyperscalers plan $700B+ CapEx in 2026. Silicon replaces power as the near-term constraint.

90% relevant

mlx-vlm v0.5.0 Adds Continuous Batching, Distributed Inference for Apple Silicon

mlx-vlm v0.5.0 adds continuous batching, speculative decoding, and distributed inference for Apple Silicon. The release, credited to 21 contributors, also brings support for Qwen3.5, Kimi K2.5, Gemma 4 video, and other new models.

87% relevant
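
For readers new to continuous batching, the sketch below illustrates the scheduling idea in plain Python. It is not mlx-vlm's implementation: the request type, the decode step, and the batch limit are all stand-ins.

```python
# Illustrative sketch of continuous batching (not mlx-vlm's scheduler):
# new requests join the running batch between decode steps instead of
# waiting for the whole batch to finish.
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    prompt: str
    max_tokens: int
    tokens: list = field(default_factory=list)  # generated so far

def decode_step(batch):
    """Stand-in for one batched forward pass: one new token per request."""
    for req in batch:
        req.tokens.append("<tok>")

def serve(incoming, max_batch=8):
    queue, active, done = deque(incoming), [], []
    while queue or active:
        # Admit waiting requests into any free batch slots.
        while queue and len(active) < max_batch:
            active.append(queue.popleft())
        decode_step(active)
        # Retire finished requests immediately, freeing slots mid-flight.
        done += [r for r in active if len(r.tokens) >= r.max_tokens]
        active = [r for r in active if len(r.tokens) < r.max_tokens]
    return done

if __name__ == "__main__":
    reqs = [Request(f"prompt {i}", max_tokens=3 + i % 4) for i in range(20)]
    print(len(serve(reqs)), "requests completed")
```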

GUC, Wiwynn Partner on Silicon-to-System AI Infrastructure for Hyperscalers

GUC and Wiwynn partner on silicon-to-system AI infrastructure, integrating SoC design, optical I/O, and liquid cooling for hyperscalers.

83% relevant

Qualcomm to Ship Hyperscaler Custom Silicon by December 2026

Qualcomm is developing custom silicon for an unnamed hyperscaler, with shipments expected December 2026, marking its most concrete data-center comeback move.

76% relevant

Qualcomm Builds Dedicated CPU for Agentic AI, Enters Hyperscale Silicon Market

Qualcomm's CEO revealed a dedicated CPU for agentic AI, a custom silicon deal with a hyperscaler shipping in December 2026, and agentic smartphones. The pivot challenges the GPU-centric consensus on AI infrastructure.

100% relevant

DeepSeek-V4 Ported to MLX for Apple Silicon Inference

A developer has ported DeepSeek-V4 to Apple's MLX framework, allowing the large language model to run on Apple Silicon Macs. Early results show functional inference with room for optimization.

100% relevant
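
If the port follows mlx-lm's usual load/generate conventions, local inference would look roughly like the sketch below. The Hub repo ID is a placeholder, not the actual upload.

```python
# Hedged sketch: assumes the port exposes mlx-lm's standard load/generate
# entry points; the repo ID below is hypothetical.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/DeepSeek-V4-4bit")  # placeholder ID
text = generate(model, tokenizer,
                prompt="Explain MoE routing briefly.",
                max_tokens=128)
print(text)
```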

Nvidia's Silicon Photonics Roadmap Targets AI Data Center Bottlenecks

Nvidia is developing its own silicon photonics-based interconnects to address the growing data transfer bottleneck within AI data centers and supercomputers. This move is critical as AI model size and cluster scale continue to grow exponentially.

86% relevant

MLX-Benchmark Suite Launches as First Comprehensive LLM Eval for Apple Silicon

The MLX-Benchmark Suite has been released as the first comprehensive evaluation framework for Large Language Models running on Apple's MLX framework. It provides standardized metrics for models optimized for Apple Silicon hardware.

85% relevant
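
The suite's own API is not shown in the article; the snippet below is the generic kind of throughput measurement such a harness standardizes, assuming mlx-lm's load/generate entry points and a placeholder model ID.

```python
# Generic tokens/sec measurement of the kind a benchmark harness reports;
# mlx-lm's load/generate are assumed, and the model ID is a placeholder.
import time
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/some-model-4bit")  # placeholder

start = time.perf_counter()
out = generate(model, tokenizer, prompt="Benchmark prompt.", max_tokens=256)
elapsed = time.perf_counter() - start

generated = len(tokenizer.encode(out))
print(f"{generated / elapsed:.1f} tokens/sec over {elapsed:.2f}s")
```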

MLX-VLM Adds Continuous Batching, OpenAI API, and Vision Cache for Apple Silicon

The next release of MLX-VLM will introduce continuous batching, an OpenAI-compatible API, and vision feature caching for multimodal models running locally on Apple Silicon. These optimizations promise up to 228x speedups on cache hits for models like Gemma 4.

95% relevant
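
The headline cache-hit speedup presumably comes from skipping the vision encoder when the same image recurs. A minimal sketch of that caching pattern, with a hypothetical stand-in for the encoder:

```python
# Minimal vision-feature cache: hash the image bytes and reuse the
# encoder output on a hit. `encode_image` is a hypothetical stand-in
# for the expensive vision-tower forward pass being skipped.
import hashlib

_cache = {}

def encode_image(image_bytes: bytes) -> list:
    # Stand-in for the real vision encoder (the costly step).
    return [float(b) for b in image_bytes[:16]]

def cached_features(image_bytes: bytes) -> list:
    key = hashlib.sha256(image_bytes).hexdigest()
    if key not in _cache:      # miss: pay the full encoder cost once
        _cache[key] = encode_image(image_bytes)
    return _cache[key]         # hit: near-free, hence the large speedup

feats1 = cached_features(b"same image payload")
feats2 = cached_features(b"same image payload")  # served from cache
assert feats1 is feats2
```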

DFlash Brings Speculative Decoding to Apple Silicon via MLX

DFlash, a new open-source project, implements speculative decoding for large language models on Apple Silicon using the MLX framework, reportedly delivering up to 2.5x speedup on an M5 Max.

85% relevant
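
A toy version of the general technique (not DFlash's code): a cheap draft model proposes a few tokens and the expensive target model checks them, accepting the matched prefix.

```python
# Toy speculative decoding. Verification is shown token-by-token for
# clarity; in a real system it is a single batched target pass over all
# k proposals, which is where the speedup comes from.
def draft_propose(ctx, k):
    # Stand-in cheap model: blindly propose the next k tokens.
    return [f"d{len(ctx) + i}" for i in range(k)]

def target_next(ctx):
    # Stand-in expensive model: the "true" next token for a context.
    return f"d{len(ctx)}" if len(ctx) % 3 else f"t{len(ctx)}"

def speculative_decode(ctx, total=12, k=4):
    while len(ctx) < total:
        accepted = []
        for tok in draft_propose(ctx, k):
            if tok == target_next(ctx + accepted):   # draft was right
                accepted.append(tok)
            else:                                    # take the correction
                accepted.append(target_next(ctx + accepted))
                break
        ctx = ctx + accepted  # several tokens accepted per target "pass"
    return ctx

print(speculative_decode([]))
```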

MLX-LM v0.9.0 Adds Better Batching, Supports Gemma 4 on Apple Silicon

Apple's MLX-LM framework released version 0.9.0 with enhanced server batching and support for Google's Gemma 4 model, improving local LLM inference efficiency on Apple Silicon. This update addresses a key performance bottleneck for developers running models locally on Mac hardware.

75% relevant
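
mlx-lm's server speaks an OpenAI-compatible protocol, so a client call looks like the sketch below. It assumes a server already running on localhost:8080; check the mlx-lm docs for the exact launch flags.

```python
# Querying a local mlx-lm server over its OpenAI-compatible chat
# endpoint. Host, port, and payload values are illustrative.
import json
import urllib.request

payload = {
    "messages": [{"role": "user", "content": "Hello from Apple Silicon"}],
    "max_tokens": 64,
}
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])
```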

Gemma 4 Ported to MLX-Swift, Runs Locally on Apple Silicon

Google's Gemma 4 language model has been ported to the MLX-Swift framework by a community developer, making it available for local inference on Apple Silicon Macs and iOS devices through the LocallyAI app.

87% relevant

Apple Silicon Achieves Near-Lossless LLM Compression at 3.5 Bits-Per-Weight, Claims Independent Tester

Independent AI researcher Matthew Weinbach reports achieving near-lossless compression of large language models on Apple Silicon, storing models at 3.5 bits-per-weight while maintaining within 1-2% quality of bf16 precision.

87% relevant
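
The back-of-envelope memory math behind a bits-per-weight figure, using an illustrative 70B-parameter model (not a number from the report):

```python
# Weight-storage footprint at a given bits-per-weight (bpw) rate.
def weights_gb(params: float, bits_per_weight: float) -> float:
    return params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB

params = 70e9  # illustrative model size
print(f"bf16 (16 bpw): {weights_gb(params, 16):.1f} GB")   # ~140.0 GB
print(f"3.5 bpw:       {weights_gb(params, 3.5):.1f} GB")  # ~30.6 GB
```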

mlx-vlm v0.4.2 Adds SAM3, DOTS-MOCR Models and Critical Fixes for Vision-Language Inference on Apple Silicon

mlx-vlm v0.4.2 released with support for Meta's SAM3 segmentation model and DOTS-MOCR document OCR, plus fixes for Qwen3.5, LFM2-VL, and Magistral models. Enables efficient vision-language inference on Apple Silicon via MLX framework.

89% relevant

Qwen3-TTS Added to mlx-tune, Enabling Full Qwen Model Fine-Tuning on Apple Silicon Macs

The mlx-tune library now supports Qwen3-TTS, making the entire Qwen model stack—including the new text-to-speech model—fine-tunable on Apple Silicon Macs. This expands local AI development options for researchers and developers.

85% relevant

Silicon Photonics Breakthrough Enters Mass Production, Paving Way for Next-Generation AI Infrastructure

STMicroelectronics has begun mass production of its PIC100 silicon photonics platform, enabling 800G and 1.6T data rates critical for AI data centers. This breakthrough technology replaces copper with light for faster, more efficient data transmission between AI accelerators.

85% relevant

RunAnywhere's MetalRT Engine Delivers Breakthrough AI Performance on Apple Silicon

RunAnywhere has launched MetalRT, a proprietary GPU inference engine that dramatically accelerates on-device AI workloads on Apple Silicon. Its open-source RCLI tool demonstrates sub-200ms voice AI pipelines, outperforming existing solutions like llama.cpp and Apple's MLX.

80% relevant

Silicon Valley AI Startup Targets Japan's Industrial Robotics Crown

Former Google AI researchers have launched Integral AI in Tokyo, aiming to transform Japan's massive industrial robotics sector with AI models that teach robots through observation and language prompts. The startup is already in talks with Toyota, Sony, and other manufacturing giants.

75% relevant

Silicon Valley Titan Declares AI Race with China a 'Techno-Economic War'

Billionaire venture capitalist Vinod Khosla frames the U.S.-China AI competition as an existential battle for global economic and geopolitical dominance, warning against underestimating its stakes.

85% relevant

The Silicon Shift: How AI Offloading is Redefining Professional Competence

A paradigm shift is underway where professional competence increasingly depends on effectively leveraging AI tools rather than raw cognitive ability. This transformation is collapsing traditional seniority hierarchies and commoditizing intelligence across industries.

85% relevant

Apple's M5 Pro and Max: Fusion Architecture Redefines AI Computing on Silicon

Apple unveils M5 Pro and M5 Max chips with groundbreaking Fusion Architecture, merging two 3nm dies into a single SoC. The chips deliver up to 30% faster CPU performance and over 4x peak GPU compute for AI workloads compared to previous generations.

95% relevant

mlx-audio v0.4.3 Ships 6 New TTS Models, Slimmer Deps

mlx-audio v0.4.3 adds six TTS models and server concurrency while slimming its dependencies, targeting Apple Silicon developers.

85% relevant

Nvidia Invests $2B in Marvell to Deepen NVLink Fusion Tie-Up

Nvidia invested $2B in Marvell to deepen NVLink Fusion partnership, integrating Marvell custom silicon into AI interconnect fabric.

87% relevant

Google Splits TPU Line: 8t for Training, 8i for Inference

At Cloud Next 2026, Google introduced two new AI chips, TPU 8t for training and TPU 8i for inference, splitting its custom silicon line for the first time. OpenAI, Anthropic, and Meta are buying multi-gigawatt TPU capacity, signaling a crack in Nvidia's 81% market share.

100% relevant

Developer Achieves 395x RTFx on M5 Max with Fastest Parakeet v3 for Apple ANE

Developer @mweinbach has optimized the Parakeet v3 speech recognition model for Apple's Neural Engine, achieving a 395x real-time factor on an M5 Max chip. This represents a significant performance leap for on-device AI inference on Apple Silicon.

87% relevant
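
What a 395x real-time factor means in practice: processing time is audio duration divided by RTFx. The durations below are illustrative.

```python
# Converting an RTFx figure into wall-clock transcription time.
def processing_time_s(audio_s: float, rtfx: float) -> float:
    return audio_s / rtfx

for audio_s in (60, 3600):  # one minute, one hour of audio
    t = processing_time_s(audio_s, 395)
    print(f"{audio_s:>5}s of audio -> {t:.2f}s to transcribe")
```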

Google, Marvell in Talks to Co-Develop New AI Chips, Including TPU-Optimized MPU

Google is reportedly in talks with Marvell Technology to co-develop two new AI chips: a memory processing unit (MPU) to pair with TPUs and a new, optimized TPU. This move is a direct effort to bolster Google's custom silicon stack and compete with Nvidia's dominance.

95% relevant

AirTrain Enables Distributed ML Training on MacBooks Over Wi-Fi

Developer @AlexanderCodes_ open-sourced AirTrain, a tool that enables distributed ML training across Apple Silicon MacBooks using Wi-Fi by syncing gradients every 500 steps instead of every step. This makes personal device training feasible for models up to 70B parameters without cloud GPU costs.

95% relevant
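
The sync-every-N-steps idea AirTrain describes, simulated in one process as a sketch; the Wi-Fi transport and real gradients are stubbed out, and this is not AirTrain's actual code.

```python
# Local steps between infrequent averaging rounds (local-SGD style):
# each replica trains independently, then parameters are averaged
# every SYNC_EVERY steps instead of communicating every step.
import random

SYNC_EVERY = 500

def local_step(weights):
    # Stand-in for a real optimizer step on a local mini-batch.
    return [w - 0.01 * random.uniform(-1, 1) for w in weights]

def average(replicas):
    # Stand-in for the Wi-Fi all-reduce: elementwise parameter mean.
    n = len(replicas)
    return [sum(ws) / n for ws in zip(*replicas)]

replicas = [[0.0] * 4 for _ in range(3)]  # three MacBooks, tiny model
for step in range(1, 2001):
    replicas = [local_step(w) for w in replicas]
    if step % SYNC_EVERY == 0:        # communicate rarely, not per step
        avg = average(replicas)
        replicas = [list(avg) for _ in replicas]
print(replicas[0])
```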

Qwen2.5-7B-Instruct 4-bit DWQ Model Released for Apple MLX

A developer has ported a 4-bit quantized Qwen2.5-7B-Instruct model to Apple's MLX framework. This makes the capable 7B model more efficient to run on Apple Silicon Macs.

77% relevant
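
For context, the sketch below shows generic group-wise 4-bit affine quantization, the basic mechanism behind 4-bit MLX models; DWQ additionally tunes the quantization parameters, which this sketch does not attempt.

```python
# Group-wise 4-bit affine quantization: map each group of weights onto
# 16 levels with a per-group scale and offset, then reconstruct.
def quantize_group(ws, levels=16):
    lo, hi = min(ws), max(ws)
    scale = (hi - lo) / (levels - 1) or 1.0
    q = [round((w - lo) / scale) for w in ws]  # 4-bit codes: 0..15
    return q, scale, lo

def dequantize_group(q, scale, lo):
    return [lo + scale * c for c in q]

weights = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48, -0.07, 0.29]
q, scale, lo = quantize_group(weights)
approx = dequantize_group(q, scale, lo)
err = max(abs(a - w) for a, w in zip(approx, weights))
print(q, f"max abs error {err:.4f}")
```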

Meta Expands Broadcom Partnership for Next-Gen AI Infrastructure

Meta is expanding its partnership with semiconductor giant Broadcom to co-develop its next-generation AI infrastructure. This move signals a continued, long-term commitment to custom silicon for AI training and inference.

85% relevant

Mac Studio AI Hardware Shortage Signals Shift to Cloud Rentals

Developers report a global shortage of high-memory Apple Silicon Macs, with 128GB Mac Studios unavailable worldwide. This pushes practitioners toward renting cloud H100 GPUs at ~$3/hr, marking a shift from the recent local AI trend.

85% relevant