qualcomm
29 articles about qualcomm in AI news
Qualcomm Ships Hyperscaler Custom Silicon by December 2026
Qualcomm is developing custom silicon for an unnamed hyperscaler, with shipments expected December 2026, marking its most concrete data-center comeback move.
Qualcomm Builds Dedicated CPU for Agentic AI, Enters Hyperscale Silicon Market
Qualcomm CEO revealed dedicated CPU for agentic AI, custom silicon deal with hyperscaler shipping Dec 2026, and agentic smartphones. Pivot challenges GPU-centric AI infrastructure consensus.
GPT-5.4 Spends 3 Hours Optimizing Embedding Model for Qualcomm NPU
An X user observed GPT-5.4 working for three hours to optimize an embedding model specifically for the Qualcomm NPU. This suggests a practical application of advanced AI for hardware-specific model tuning.
ASUS Zenbook A16 Launches with Qualcomm X2 Elite Extreme AI Chip
ASUS announced the Zenbook A16 laptop featuring the Qualcomm Snapdragon X2 Elite Extreme processor. This marks a significant push for premium Windows on Arm laptops optimized for local AI tasks.
Snap & Qualcomm Partner on Snapdragon XR for Future Spectacles
Snap has entered a strategic agreement with Qualcomm to power future generations of its Spectacles AR glasses with Snapdragon XR platforms. This hardware partnership is critical for Snap's long-term bet on AI-driven augmented reality.
Qualcomm X2 Elite Matches Apple M5 in Efficiency Test
In a mixed-use laptop test simulating office work, Qualcomm's Snapdragon X2 Elite system-on-chip matched the power efficiency of Apple's latest M5 chip. This marks a significant milestone for Windows on Arm in its competition with Apple Silicon.
Qualcomm NPU Shows 6-8x OCR Speed-Up Over CPU in Mobile Workload
A benchmark shows Qualcomm's dedicated NPU processing OCR workloads 6-8 times faster than the device's CPU. This highlights the growing efficiency gap for AI tasks on mobile silicon.
Qualcomm's Arduino Ventuno Q: A Powerhouse Single-Board Computer for the Next Wave of Physical AI
Qualcomm and Arduino have launched the Ventuno Q, a high-performance single-board computer designed specifically for robotics and physical AI applications. Powered by the Dragonwing IQ8 processor with a dedicated NPU and paired with a low-latency microcontroller, it enables complex, offline AI tasks like object tracking and gesture recognition for systems that interact with the real world.
Snapdragon X2 Elite Beats Intel Arrow Lake for AI Coding Agents
Snapdragon X2 Elite beat Intel Arrow Lake for Windows AI coding agents. CPU bottleneck, not inference speed, limited performance per @mweinbach.
Horizon Launches Full-Stack AI Platform for Autonomous Driving
Horizon Robotics launched a trio of products—a new chip, an open-source OS, and a smart driving system—aiming to push cars closer to becoming autonomous AI agents. The platform integrates hardware and software for enhanced perception and decision-making.
John Ternus Takes Over Apple AI Leadership as Era Ends
Apple's AI leadership transitions to John Ternus, marking a new era following Steve Jobs' vision and Tim Cook's operational success. This comes as Apple accelerates its generative AI push with Apple Intelligence.
Prefill-as-a-Service Paper Claims to Decouple LLM Inference Bottleneck
A research paper proposes a 'Prefill-as-a-Service' architecture to separate the heavy prefill computation from the lighter decoding phase in LLM inference. This could enable new deployment models where resource-constrained devices handle only the decoding step.
MLX-VLM Adds Continuous Batching, OpenAI API, and Vision Cache for Apple Silicon
The next release of MLX-VLM will introduce continuous batching, an OpenAI-compatible API, and vision feature caching for multimodal models running locally on Apple Silicon. These optimizations promise up to 228x speedups on cache hits for models like Gemma4.
Meta Expands Broadcom Partnership for Next-Gen AI Infrastructure
Meta is expanding its partnership with semiconductor giant Broadcom to co-develop its next-generation AI infrastructure. This move signals a continued, long-term commitment to custom silicon for AI training and inference.
Microsoft Raises Surface PC Prices Amid AI Copilot+ PC Push
Microsoft has implemented substantial price increases for its entire Surface PC portfolio. This move likely reflects the higher component and development costs associated with integrating next-generation AI capabilities into the Copilot+ PC platform.
MLX Enables Local Grounded Reasoning for Satellite, Security, Robotics AI
Apple's MLX framework is enabling 'local grounded reasoning' for AI applications in satellite imagery, security systems, and robotics, moving complex tasks from the cloud to on-device processing.
Google's Gemma 4B Model Runs on Nintendo Switch at 1.5 Tokens/Second
A developer successfully ran Google's 4-billion parameter Gemma language model on a Nintendo Switch, achieving 1.5 tokens/second inference. This demonstrates the increasing feasibility of running small LLMs on consumer-grade edge hardware.
Dell XPS 14 with Core Ultra X7 Outlasts Snapdragon X Elite in Battery Test
A new battery test shows the Dell XPS 14 with Intel Core Ultra X7 lasts 11.7 hours, beating the 11.3 hours of Microsoft's ARM-based Surface Laptop 15 with Snapdragon X Elite. This is a significant win for Intel's latest chip in the critical mobile performance metric.
ModelBest Hits $1B+ Valuation for On-Device Foundation Models
ModelBest, a Chinese developer of on-device AI foundation models, raised several hundred million RMB, reaching a valuation exceeding $1 billion. The funding will accelerate its push to deploy efficient models directly on smartphones and IoT devices.
Developer Ranks NPU Model Compilation Ease: Apple 1st, AMD Last
Developer @mweinbach ranked the ease of using AI coding agents to compile ML models for NPUs. Apple's ecosystem was rated easiest, while AMD's tooling was ranked most difficult.
X Post Reveals Audible Quality Differences in GPU vs. NPU AI Inference
A developer demonstrated audible quality differences in AI text-to-speech output when run on GPU, CPU, and NPU hardware, highlighting a key efficiency vs. fidelity trade-off for on-device AI.
Google's AICore Beta Enables On-Device Gemini Nano 4 Downloads for Android Phones
A new beta of Google's AICore system service enables users to download Gemini Nano 4 Full and Gemini Nano 4 Fast models directly onto compatible Android phones, including those with Snapdragon 8 Elite Gen 5 chips. This moves beyond pre-installed AI to user-initiated model management.
Apple M5 Max NPU Benchmarks 2x Faster Than Intel Panther Lake NPU in Parakeet v3 AI Inference Test
A leaked benchmark using the Parakeet v3 AI speech recognition model shows Apple's next-generation M5 Max Neural Processing Unit (NPU) delivering double the inference speed of Intel's competing Panther Lake NPU. This real-world test provides early performance data in the intensifying on-device AI hardware race.
Text-to-Speech Cost Plummets from $0.15/Word to Free Local Models Using 3GB RAM
High-quality text-to-speech has shifted from a $0.15 per word cloud service to free, local models requiring only 3GB of RAM in 12 months, signaling a broader price collapse in AI inference.
Apple's On-Device Reranking Model for Private Visual Search: A Technical Breakdown
Analysis of Apple's Enhanced Visual Search system that uses multimodal features, geo-signals, and index debiasing to identify landmarks entirely on-device. This represents a significant advancement in privacy-preserving AI for visual recognition.
Perplexity AI Launches On-Device Search Engine: Privacy-First AI Comes Home
A new privacy-first AI search engine called Perplexity AI now runs entirely on users' own hardware, eliminating cloud data transmission. This breakthrough represents a significant shift toward decentralized, secure AI processing that protects user queries from corporate surveillance.
Apple's Neural Engine Jailbroken: Researchers Unlock Full Training Capabilities on M-Series Chips
Security researchers have reverse-engineered Apple's Neural Engine, bypassing private APIs to enable full neural network training directly on ANE hardware. This breakthrough unlocks 15.8 TFLOPS of compute previously restricted to inference-only operations across all M-series devices.
Apple's M5 Pro and Max: Fusion Architecture Redefines AI Computing on Silicon
Apple unveils M5 Pro and M5 Max chips with groundbreaking Fusion Architecture, merging two 3nm dies into a single SoC. The chips deliver up to 30% faster CPU performance and over 4x peak GPU compute for AI workloads compared to previous generations.
Apple's Neural Engine Jailbroken: Researchers Unlock On-Device AI Training Capabilities
A researcher has reverse-engineered Apple's private Neural Engine APIs to enable direct transformer training on M-series chips, bypassing CoreML restrictions. This breakthrough could enable battery-efficient local model training and fine-tuning without cloud dependency.