apple
30 articles about Apple in AI news
Apple Paper Argues LLMs Show 'Illusion of Thinking'
An Apple paper argues that LLMs show no genuine reasoning, only pattern matching. The critique targets vendor claims but lacks new empirical evidence.
MLX CUDA Backend Passes All Tests, Closing Apple GPU Gap
The MLX CUDA backend now passes all tests, enabling NVIDIA GPU support. The milestone bridges the Apple Silicon and CUDA ecosystems for ML workloads.
Google Beats Apple to AI Health Coach With Gemini-Powered Fitbit App
Google released an AI health coach using Gemini, beating Apple to market. The coach integrates fitness, sleep, nutrition, cycle tracking, weather, and U.S. medical records.
mlx-vlm v0.5.0 Adds Continuous Batching, Distributed Inference for Apple Silicon
mlx-vlm v0.5.0 adds continuous batching, speculative decoding, and distributed inference for Apple Silicon. The release, which had 21 contributors, supports Qwen3.5, Kimi K2.5, Gemma 4 video, and other new models.
Apple WWDC 2026: Gemini Deeply Integrated into iOS
A tweet from @kimmonismus claims Apple's 2026 WWDC will be the most exciting yet, with the first deep integration of a useful AI model (Gemini) into iOS and a new Apple CEO.
DeepSeek-V4 Ported to MLX for Apple Silicon Inference
A developer has ported DeepSeek-V4 to Apple's MLX framework, allowing the large language model to run on Apple Silicon Macs. Early results show functional inference with room for optimization.
Apple Releases DFNDR-12M Dataset, Claims 5x CLIP Training Efficiency
Apple has open-sourced DFNDR-12M, a multimodal dataset of 12.8 million image-text pairs with synthetic captions and pre-computed embeddings. The company claims it enables up to 5x training efficiency over standard CLIP datasets.
Developer Achieves 395x RTFx on M5 Max with Fastest Parakeet v3 for Apple ANE
Developer @mweinbach has optimized the Parakeet v3 speech recognition model for Apple's Neural Engine (ANE), reaching an inverse real-time factor (RTFx) of 395x on an M5 Max chip, meaning audio is transcribed roughly 395 times faster than real time. This represents a significant performance leap for on-device AI inference on Apple Silicon.
John Ternus Takes Over Apple AI Leadership as Era Ends
Apple's AI leadership transitions to John Ternus, marking a new era following Steve Jobs' vision and Tim Cook's operational success. This comes as Apple accelerates its generative AI push with Apple Intelligence.
Apple's 'Attention to Mamba' Paper Proposes Cross-Architecture Transfer
Apple researchers introduced a two-stage recipe for transferring capabilities from Transformer models to Mamba-based architectures. This could enable efficient models that retain the performance of larger, attention-based predecessors.
MLX-Benchmark Suite Launches as First Comprehensive LLM Eval for Apple Silicon
The MLX-Benchmark Suite has been released as the first comprehensive evaluation framework for large language models running on Apple's MLX framework. It provides standardized metrics for models optimized for Apple Silicon hardware.
Qwen2.5-7B-Instruct 4-bit DWQ Model Released for Apple MLX
A developer has ported a 4-bit quantized Qwen2.5-7B-Instruct model to Apple's MLX framework. This makes the capable 7B model more efficient to run on Apple Silicon Macs.
MLX-VLM Adds Continuous Batching, OpenAI API, and Vision Cache for Apple Silicon
The next release of MLX-VLM will introduce continuous batching, an OpenAI-compatible API, and vision feature caching for multimodal models running locally on Apple Silicon. These optimizations promise speedups of up to 228x on cache hits for models like Gemma 4.
Apple Sends 200 Siri Engineers to AI Coding Bootcamp Ahead of WWDC
Apple is sending ~200 Siri engineers to a multi-week bootcamp to learn AI coding tools like Claude Code and Codex. This retraining precedes the expected June WWDC unveiling of a Gemini-powered Siri overhaul.
Nvidia to Ship 1.19 Exabytes of HBM in 2026, Apple iPhone Memory 2x Larger
An analysis projects that Nvidia will ship ~1.19 exabytes of HBM in 2026 for AI infrastructure, while Apple will ship ~2.4 exabytes of LPDDR5 for iPhones, roughly twice as much. The comparison puts the scale of AI hardware in perspective against the consumer market.
DFlash Brings Speculative Decoding to Apple Silicon via MLX
DFlash, a new open-source project, implements speculative decoding for large language models on Apple Silicon using the MLX framework, reportedly delivering up to 2.5x speedup on an M5 Max.
Apple Reportedly Developing 'Balta' AI ASIC for Cloud Compute
A Morgan Stanley report indicates Apple is accelerating development of a custom ASIC, codenamed 'Balta,' for AI cloud and hybrid compute. This marks Apple's first known move to design silicon for its data centers, not just consumer devices.
MLX-LM v0.9.0 Adds Better Batching, Supports Gemma 4 on Apple Silicon
Version 0.9.0 of Apple's MLX-LM adds enhanced server batching and support for Google's Gemma 4 model, improving local LLM inference efficiency on Apple Silicon. The update addresses a key performance bottleneck for developers running models locally on Mac hardware.
Apple's Studio Display XDR Medical Imaging Calibration Receives FDA Clearance
Apple's Medical Imaging Calibration feature for the Studio Display XDR has received FDA clearance. This allows the consumer-grade display to be used for primary diagnosis of medical images in the US.
Qualcomm X2 Elite Matches Apple M5 in Efficiency Test
In a mixed-use laptop test simulating office work, Qualcomm's Snapdragon X2 Elite system-on-chip matched the power efficiency of Apple's latest M5 chip. This marks a significant milestone for Windows on Arm in its competition with Apple Silicon.
Apple's AI Mac Mini Sells Out, Signaling Unprecedented Demand
Apple's latest Mac mini, featuring its new Apple Intelligence silicon, has sold out across retailers, a first for the typically high-availability product line. This signals overwhelming initial demand for Apple's push into on-device AI computing.
Developer Ranks NPU Model Compilation Ease: Apple 1st, AMD Last
Developer @mweinbach ranked the ease of using AI coding agents to compile ML models for NPUs. Apple's ecosystem was rated easiest, while AMD's tooling was ranked most difficult.
Gemma 4 Ported to MLX-Swift, Runs Locally on Apple Silicon
Google's Gemma 4 language model has been ported to the MLX-Swift framework by a community developer, making it available for local inference on Apple Silicon Macs and iOS devices through the LocallyAI app.
Apple M5 Max NPU Benchmarks 2x Faster Than Intel Panther Lake NPU in Parakeet v3 AI Inference Test
A leaked benchmark using the Parakeet v3 AI speech recognition model shows Apple's next-generation M5 Max Neural Processing Unit (NPU) delivering double the inference speed of Intel's competing Panther Lake NPU. This real-world test provides early performance data in the intensifying on-device AI hardware race.
Apple's Eddy Cue to Appear on TBPN Podcast for Company's 50th Anniversary
Apple's senior vice president of services, Eddy Cue, will appear live on the TBPN podcast today at 12:10 PM PT. The interview is part of Apple's 50th-anniversary commemorations.
Roboflow's RF-DETR Model Ported to Apple MLX, Enabling Real-Time On-Device Instance Segmentation
Roboflow's RF-DETR object detection model is now available on Apple's MLX framework, enabling real-time instance segmentation on Apple Silicon devices. The port opens up new on-device visual analysis applications in robotics and in augmenting vision-language models.
Apple Removes AI Coding Apps Replit & Vibecode from App Store, Coinciding with Xcode AI Integration
Apple has removed AI-powered coding apps Replit and Vibecode from the App Store, reportedly for enabling app creation outside Apple's approval system. This coincides with Apple's recent integration of its own AI coding assistant into Xcode.
Ollama Now Supports Apple MLX Backend for Local LLM Inference on macOS
Ollama, the popular framework for running large language models locally, has added support for Apple's MLX framework as a backend. This enables more efficient execution of models like Llama 3.2 and Mistral on Apple Silicon Macs.
Apple Silicon Achieves Near-Lossless LLM Compression at 3.5 Bits-Per-Weight, Claims Independent Tester
Independent AI researcher Matthew Weinbach reports achieving near-lossless compression of large language models on Apple Silicon, storing models at 3.5 bits per weight while keeping quality within 1-2% of bf16 precision.
Meta's SAM 3 Vision Model Ported to Apple's MLX Framework, Enabling Real-Time Tracking on M3 Max
Meta's Segment Anything Model 3 (SAM 3) has been ported to Apple's MLX framework, enabling real-time object tracking on an M3 Max MacBook Pro. This demonstrates efficient on-device execution of a foundational vision model without cloud dependency.