on device
30 articles about on device in AI news
Roboflow's RF-DETR Model Ported to Apple MLX, Enabling Real-Time On-Device Instance Segmentation
Roboflow's RF-DETR object detection model is now available on Apple's MLX framework, enabling real-time instance segmentation on Apple Silicon devices. This port unlocks new on-device visual analysis applications for robotics and augmented vision-language models.
OpenMedKit Adds GLiNER for On-Device PII Detection on iPhone
OpenMedKit is adding the GLiNER zero-shot named entity recognition framework to its toolkit, expanding its on-device, privacy-preserving PII detection capabilities for healthcare data on iPhones.
HUOZIIME: A Research Framework for On-Device LLM-Powered Input Methods
A new research paper introduces HUOZIIME, a personalized on-device input method powered by a lightweight LLM. It uses a hierarchical memory mechanism to capture user-specific input history, enabling privacy-preserving, real-time text generation tailored to individual writing styles.
ModelBest Hits $1B+ Valuation for On-Device Foundation Models
ModelBest, a Chinese developer of on-device AI foundation models, raised several hundred million RMB, reaching a valuation exceeding $1 billion. The funding will accelerate its push to deploy efficient models directly on smartphones and IoT devices.
Ethan Mollick: Gemma 4 Impressive On-Device, But Agentic Workflows Doubted
Wharton professor Ethan Mollick finds Google's Gemma 4 powerful for on-device use but is skeptical about its ability to execute true agentic workflows, citing limitations in judgment and self-correction.
Apple's On-Device Reranking Model for Private Visual Search: A Technical Breakdown
This analysis examines Apple's Enhanced Visual Search system, which uses multimodal features, geo-signals, and index debiasing to identify landmarks entirely on-device. The system represents a significant advancement in privacy-preserving AI for visual recognition.
Apple Reportedly Gains Full Internal Access to Google's Gemini for On-Device Model Distillation
A report claims Apple's AI deal with Google includes full internal model access, enabling distillation of Gemini's reasoning into smaller, on-device models. This would allow Apple to build specialized, efficient AI without relying solely on cloud APIs.
KAIST Develops 'SoulMate' AI Chip for Real-Time, On-Device Personalization
KAIST researchers have developed a new AI semiconductor, 'SoulMate,' that enables real-time, on-device learning of user habits and preferences. The chip combines RAG and LoRA for instant personalization while consuming minimal power, aiming for commercialization by 2027.
Stanford's OpenJarvis: The Open-Source Framework Bringing Personal AI Agents to Your Device
Stanford researchers have released OpenJarvis, an open-source framework for building personal AI agents that operate entirely on-device. This local-first approach prioritizes privacy and autonomy while providing tools, memory, and learning capabilities.
Open-Source Project Unlocks Apple's On-Device AI for Any Device on Your Network
Perspective Intelligence Web, an open-source project, enables any device with a browser to access Apple's powerful on-device AI models running locally on a Mac. This MIT-licensed solution addresses privacy concerns by keeping all processing on your private network while extending Apple Intelligence capabilities to Windows, Linux, Android, and Chromebook devices.
Edge AI Breakthrough: Qwen3.5 2B Runs Locally on iPhone 17 Pro, Redefining On-Device Intelligence
Alibaba's Qwen3.5 2B model now runs locally on iPhone 17 Pro devices, marking a significant breakthrough in edge AI. This development enables sophisticated language processing without cloud dependency, potentially transforming mobile AI applications and user privacy paradigms.
Google's AI Edge Gallery Arrives on iPhone: A Privacy-First Revolution in On-Device Intelligence
Google AI Edge Gallery has launched on iOS, bringing true on-device function calling to iPhones for the first time. Powered by the compact 270M parameter FunctionGemma model, it enables natural voice commands to trigger phone actions like calendar events and flashlight toggles—completely offline.
Google's AICore Beta Enables On-Device Gemini Nano 4 Downloads for Android Phones
A new beta of Google's AICore system service enables users to download Gemini Nano 4 Full and Gemini Nano 4 Fast models directly onto compatible Android phones, including those with Snapdragon 8 Elite Gen 5 chips. This moves beyond pre-installed AI to user-initiated model management.
Apple's Private Cloud Compute: Leak Suggests 4x M2 Ultra Cluster for On-Device AI Offload
A leak suggests Apple's Private Cloud Compute for AI may be built on clusters of four M2 Ultra chips, potentially offering high-performance, private server-side processing for iPhone AI tasks. This would mark Apple's strategic move into dedicated, privacy-focused AI infrastructure.
Perplexity AI Launches On-Device Search Engine: Privacy-First AI Comes Home
A new privacy-first AI search engine called Perplexity AI now runs entirely on users' own hardware, eliminating cloud data transmission. This breakthrough represents a significant shift toward decentralized, secure AI processing that protects user queries from corporate surveillance.
The Laptop Agent Revolution: How 24B-Parameter Models Are Redefining On-Device AI
Liquid's LFM2-24B-A2B model runs locally on laptops, selecting tools in under 400ms. Its hybrid architecture enables sparse activation, making powerful AI agents practical for regulated industries and developers without cloud dependencies.
Apple's Neural Engine Jailbroken: Researchers Unlock On-Device AI Training Capabilities
A researcher has reverse-engineered Apple's private Neural Engine APIs to enable direct transformer training on M-series chips, bypassing CoreML restrictions. This breakthrough could enable battery-efficient local model training and fine-tuning without cloud dependency.
MIT Hackathon Team Builds Wearable AI for Physical Movement Guidance
An MIT hackathon team has built a wearable AI system that gives real-time physical movement guidance using body-worn sensors and on-device inference. The project was demoed by @kimmonismus.
Developer Achieves 395x RTFx on M5 Max with Fastest Parakeet v3 for Apple ANE
Developer @mweinbach has optimized the Parakeet v3 speech recognition model for Apple's Neural Engine, reaching 395x RTFx on an M5 Max chip, meaning audio is transcribed roughly 395 times faster than real time. This represents a significant performance leap for on-device AI inference on Apple Silicon.
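To put the 395x figure in concrete terms, the short sketch below converts an RTFx value into wall-clock processing time. RTFx is conventionally defined as audio duration divided by processing time; the function names and the one-hour example are illustrative assumptions, not figures from the developer's benchmark.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    return audio_seconds / processing_seconds


def processing_time(audio_seconds: float, rtfx_value: float) -> float:
    """Wall-clock seconds needed to transcribe a clip at a given RTFx."""
    return audio_seconds / rtfx_value


# Hypothetical example: a one-hour recording at the reported 395x RTFx.
one_hour = 3600.0
print(f"{processing_time(one_hour, 395.0):.1f} s")  # about 9.1 s
```

At that rate, an hour of audio would take a little over nine seconds to transcribe, assuming the reported throughput holds for long recordings.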
AirTrain Enables Distributed ML Training on MacBooks Over Wi-Fi
Developer @AlexanderCodes_ open-sourced AirTrain, a tool that enables distributed ML training across Apple Silicon MacBooks using Wi-Fi by syncing gradients every 500 steps instead of every step. This makes personal device training feasible for models up to 70B parameters without cloud GPU costs.
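The idea of syncing every 500 steps rather than every step can be illustrated with a toy simulation. This is a minimal sketch of generic periodic averaging (often called local SGD), not AirTrain's actual code: the one-parameter model, the `local_sgd` function, and all of its arguments are hypothetical, and it averages each worker's weights at sync time, which for plain SGD is equivalent to averaging the updates accumulated since the last sync.

```python
import random


def local_sgd(shards, sync_every=500, total_steps=2000, lr=0.01, seed=0):
    """Fit y = w * x with one replica per worker, syncing every `sync_every` steps.

    Returns the final shared weight and the number of sync rounds performed.
    """
    rng = random.Random(seed)
    weights = [0.0 for _ in shards]  # one model replica per simulated worker
    sync_rounds = 0
    for step in range(1, total_steps + 1):
        # Each worker takes an independent SGD step on its own data shard.
        for i, shard in enumerate(shards):
            x, y = rng.choice(shard)
            grad = 2.0 * (weights[i] * x - y) * x  # d/dw of (w*x - y)^2
            weights[i] -= lr * grad
        # Communication happens only at sync points, not on every step.
        if step % sync_every == 0:
            mean_w = sum(weights) / len(weights)
            weights = [mean_w] * len(weights)
            sync_rounds += 1
    return weights[0], sync_rounds


# Two simulated "MacBooks", each holding samples of y = 3x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(0.5, 1.5), (1.5, 4.5)]]
w, rounds = local_sgd(shards)
print(round(w, 3), rounds)
```

In this toy run the workers exchange data 4 times instead of 2,000, which is the kind of saving that makes a slow link such as Wi-Fi workable.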
7 Free GitHub Repos for Running LLMs Locally on Laptop Hardware
A developer shared a list of seven key GitHub repositories, including AnythingLLM and llama.cpp, that allow users to run LLMs locally without cloud costs. This reflects the growing trend of efficient, private on-device AI inference.
MLX Enables Local Grounded Reasoning for Satellite, Security, Robotics AI
Apple's MLX framework is enabling 'local grounded reasoning' for AI applications in satellite imagery, security systems, and robotics, moving complex tasks from the cloud to on-device processing.
Technical Implementation: Building a Local Fine-Tuning Engine with MLX
A developer shares a backend implementation guide for automating the fine-tuning process of AI models using Apple's MLX framework. This enables private, on-device model customization without cloud dependencies, which is crucial for handling sensitive data.
AI Model Decodes Silent Speech from Phone Sensors, No Microphone Needed
A new AI model can reconstruct speech by analyzing imperceptible facial movements captured by smartphone sensors, effectively enabling silent speech recognition without a microphone. This represents a significant leap in sensor fusion and on-device AI.
Efficient Universal Perception Encoder (EUPE) Family Challenges DINOv2
Researchers introduced the Efficient Universal Perception Encoder (EUPE), a family of compact vision models that achieve performance rivaling the larger DINOv2. This could enable high-quality visual understanding on resource-constrained devices.
Gemma 4 Ported to MLX-Swift, Runs Locally on Apple Silicon
Google's Gemma 4 language model has been ported to the MLX-Swift framework by a community developer, making it available for local inference on Apple Silicon Macs and iOS devices through the LocallyAI app.
Google DeepMind Unveils Next-Generation AI Tools and Android XR Platform at I/O 2024
Google's I/O 2024 keynote featured significant AI announcements from Google DeepMind, including new Gemini-powered tools and the official unveiling of Android XR. The extended reality operating system, developed in partnership with Samsung, represents a major expansion of Google's AI ecosystem into wearable devices.
Google Releases Fully Open-Source Gemma 4 AI Model for Local Device Deployment
Google has launched Gemma 4, a fully open-source AI model family available under the Apache 2.0 license. The release marks Google's re-entry into the competitive open-source AI landscape with models optimized for local deployment, including on mobile devices.
Storing Less, Finding More: Novelty Filtering Architecture for Cross-Modal Retrieval on Edge Cameras
A new streaming retrieval architecture uses an on-device 'epsilon-net' filter to retain only semantically novel video frames, dramatically improving cross-modal search accuracy while reducing power consumption to 2.7 mW. This addresses the fundamental problem of redundant frames crowding out correct results in continuous video streams.
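A novelty filter of this kind can be sketched as a greedy epsilon-cover over frame embeddings: a frame is stored only if it is farther than epsilon from every frame already kept. The summary does not give the paper's exact filter, so the cosine distance metric, the threshold value, and the function names below are assumptions chosen for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def novelty_filter(embeddings, epsilon=0.1):
    """Return indices of frames farther than `epsilon` from every retained frame."""
    kept = []
    for idx, emb in enumerate(embeddings):
        # Retain the frame only if nothing already stored is within epsilon of it.
        if all(cosine_distance(emb, embeddings[k]) > epsilon for k in kept):
            kept.append(idx)
    return kept


# Hypothetical 2-D embeddings: frames 1 and 3 nearly duplicate frames 0 and 2.
frames = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.01, 0.99], [1.0, 1.0]]
print(novelty_filter(frames))  # [0, 2, 4]
```

Dropping near-duplicates before indexing means a query competes against one representative per scene rather than many redundant frames, which is the effect the summary describes.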
Meta's SAM 3 Vision Model Ported to Apple's MLX Framework, Enabling Real-Time Tracking on M3 Max
Meta's Segment Anything Model 3 (SAM 3) has been ported to Apple's MLX framework, enabling real-time object tracking on an M3 Max MacBook Pro. This demonstrates efficient on-device execution of a foundational vision model without cloud dependency.