OmniGlass: When AI Stops Talking and Starts Doing
For months, developers have been experiencing a peculiar form of AI frustration: you show Claude Desktop or Cursor a Python traceback, the AI correctly identifies the missing pandas installation, and then... you're left typing pip install pandas yourself. If the AI knows the solution, why are we still doing the manual work?
This exact friction point led to the creation of OmniGlass, an open-source tool that represents a fundamental shift in how AI interacts with our desktop environments. Instead of generating chat responses, OmniGlass reads your screen, understands context, and presents executable actions—all while implementing security measures that address growing concerns about AI plugin vulnerabilities.
From Observation to Execution
The core innovation of OmniGlass isn't in its AI capabilities—it's in what happens after the AI processes information. While tools like Claude Desktop excel at understanding and describing what's on your screen, they stop at generating text responses. OmniGlass takes the next logical step: execution.
The workflow is elegantly simple:
- Screen Selection: Draw a box around any screen content
- Local OCR Processing: Apple Vision OCR extracts text without sending data to the cloud
- AI Classification: An LLM (Claude Haiku, Gemini Flash, or local Qwen-2.5) identifies what you're looking at
- Action Menu: Instead of a chat response, you get context-specific executable options
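The classify-then-act pipeline above can be sketched in a few lines of Rust. This is an illustrative stand-in, not OmniGlass's actual code: the real classifier is an LLM (Claude Haiku, Gemini Flash, or local Qwen-2.5), and the type and function names here (ScreenContext, classify, actions_for) are hypothetical.

```rust
// Hypothetical sketch of the OmniGlass pipeline: classified screen content
// maps to executable menu options rather than a chat reply.

#[derive(Debug, PartialEq)]
enum ScreenContext {
    PythonTraceback,
    DataTable,
    SlackBugReport,
    Unknown,
}

// Stand-in for the LLM classification step; keyword heuristics only.
fn classify(ocr_text: &str) -> ScreenContext {
    if ocr_text.contains("Traceback (most recent call last)") {
        ScreenContext::PythonTraceback
    } else if ocr_text.lines().filter(|l| l.contains('|')).count() > 2 {
        ScreenContext::DataTable
    } else if ocr_text.contains("Slack") {
        ScreenContext::SlackBugReport
    } else {
        ScreenContext::Unknown
    }
}

// Each context yields actions, not prose.
fn actions_for(ctx: &ScreenContext) -> Vec<&'static str> {
    match ctx {
        ScreenContext::PythonTraceback => vec!["Run fix command (confirm first)"],
        ScreenContext::DataTable => vec!["Export as CSV"],
        ScreenContext::SlackBugReport => vec!["Draft GitHub issue"],
        ScreenContext::Unknown => vec![],
    }
}

fn main() {
    let ocr = "Traceback (most recent call last):\nModuleNotFoundError: No module named 'pandas'";
    let ctx = classify(ocr);
    println!("{:?} -> {:?}", ctx, actions_for(&ctx));
}
```

The key design point is the return type: the pipeline ends in a menu of executable options, with the user's click as the gate between understanding and execution.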
Current capabilities demonstrate the practical applications:
- Python tracebacks: Generates and runs the fix command after user confirmation
- Data tables: Opens a native save dialog and exports clean CSV files
- Slack bug reports: Drafts GitHub issues with all context automatically filled
- Menu bar input: Type plain English to trigger appropriate system commands
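The traceback case is the clearest example of "generate, then confirm." A minimal sketch of deriving a fix command from a ModuleNotFoundError follows; the helper name and parsing logic are assumptions, since the source doesn't show the plugin's internals:

```rust
// Hypothetical helper: extract the missing module from a Python traceback
// and propose (not run) the corresponding pip command.

fn fix_command_for(traceback: &str) -> Option<String> {
    // Matches lines like: ModuleNotFoundError: No module named 'pandas'
    let marker = "No module named '";
    let start = traceback.find(marker)? + marker.len();
    let rest = &traceback[start..];
    let end = rest.find('\'')?;
    let module = &rest[..end];
    // The command is only proposed; OmniGlass requires explicit user
    // confirmation before any shell command actually runs.
    Some(format!("pip install {}", module))
}

fn main() {
    let tb = "Traceback (most recent call last):\n  File \"app.py\", line 1, in <module>\nModuleNotFoundError: No module named 'pandas'";
    if let Some(cmd) = fix_command_for(tb) {
        println!("Proposed: {}", cmd);
    }
}
```

Note the caveat a real implementation must handle: PyPI package names don't always match import names (e.g. the module cv2 installs as opencv-python), which is presumably where the LLM earns its keep over plain string matching.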
The Security Revolution: Kernel-Level Sandboxing
What makes OmniGlass particularly noteworthy is its approach to security—a topic its creator, goshtasb, identifies as "the elephant in the room" that "nobody is really talking about yet."
Most AI tools with plugin systems, including Claude Desktop, run plugins with full user permissions. This creates significant vulnerabilities: a rogue plugin (or clever prompt injection) could access SSH keys, scrape .env files, and exfiltrate sensitive data.
OmniGlass addresses this through macOS kernel-level sandboxing using sandbox-exec. Every plugin operates within strict boundaries:
- Filesystem isolation: The /Users/ directory is walled off entirely
- Filtered environment variables: Only safe variables are exposed
- Manual command confirmation: Shell commands require explicit user approval
- Local processing: Optional fully-local operation via llama.cpp (6-second end-to-end processing)
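The first two boundaries above can be sketched concretely. The SBPL rules and the environment allowlist below are illustrative assumptions—the actual OmniGlass profile isn't shown in the source—but they capture the deny-by-default posture it describes:

```rust
use std::collections::HashMap;

// Illustrative deny-by-default sandbox-exec (SBPL) profile that walls off
// /Users/. Rules here are a sketch, not OmniGlass's real profile.
fn plugin_profile() -> String {
    [
        "(version 1)",
        "(deny default)",                            // deny everything by default
        "(deny file-read* (subpath \"/Users\"))",    // wall off home directories
        "(allow file-read* (subpath \"/usr/lib\"))", // dynamic linker needs
    ]
    .join("\n")
}

// Expose only an allowlist of environment variables to the plugin process,
// so secrets like AWS_SECRET_ACCESS_KEY never reach it.
fn filtered_env(parent: &HashMap<String, String>) -> HashMap<String, String> {
    const SAFE: &[&str] = &["PATH", "LANG", "TMPDIR"];
    parent
        .iter()
        .filter(|(k, _)| SAFE.contains(&k.as_str()))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

fn main() {
    // A host would then launch the plugin roughly as:
    //   sandbox-exec -p <profile> <plugin-binary>
    // using Command::env_clear() plus the filtered allowlist.
    println!("{}", plugin_profile());
}
```

On the Rust side, std::process::Command's env_clear() followed by envs(filtered_env(...)) gives the clean-slate environment; the profile string goes to sandbox-exec via its -p flag.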
This security architecture enables what goshtasb describes as the ability to "run community plugins without sweating about what they can access"—a crucial advancement as AI tools increasingly integrate with sensitive systems.
Technical Architecture: Rust, Tauri, and MCP
OmniGlass leverages modern technologies to balance performance, security, and usability:
Frontend/Backend: Built with Tauri (Rust + TypeScript), combining Rust's memory safety with web technologies for the interface
Vision Processing: Uses Apple Vision OCR locally, avoiding cloud dependencies and privacy concerns
Plugin System: Implements MCP (Model Context Protocol) over stdio, enabling extensibility while maintaining security boundaries
Model Flexibility: Supports cloud models (Claude Haiku, Gemini Flash) or fully local operation via llama.cpp with Qwen-2.5
The choice of Rust is particularly significant given the security requirements, as its memory safety guarantees complement the sandboxing approach.
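MCP's stdio transport is just JSON-RPC 2.0 messages written to the plugin's stdin and read back from its stdout, which is part of what makes it easy to sandbox: the plugin needs no network or filesystem access to talk to the host. A minimal sketch of the host side, with the method name and params as assumptions:

```rust
use std::io::Write;

// Build a JSON-RPC 2.0 request as used by MCP's stdio transport.
// The "tools/call" method exists in MCP; the specific tool name and
// params here are hypothetical.
fn jsonrpc_request(id: u64, method: &str, params: &str) -> String {
    format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"{}\",\"params\":{}}}",
        id, method, params
    )
}

fn main() {
    // The host writes a request to the sandboxed plugin's stdin...
    let req = jsonrpc_request(1, "tools/call", "{\"name\":\"export_csv\"}");
    let mut child_stdin: Vec<u8> = Vec::new(); // stands in for the child's stdin
    writeln!(child_stdin, "{}", req).unwrap();
    // ...and reads newline-delimited responses from the plugin's stdout.
    println!("{}", String::from_utf8(child_stdin).unwrap());
}
```

Because the entire plugin interface is a byte stream over stdio, the sandbox can deny everything else—network, /Users/, the real environment—without breaking the protocol.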
Implications for AI-Assisted Development
OmniGlass represents more than just another productivity tool—it signals a maturation in how we conceptualize AI assistance. While Anthropic's Claude models have demonstrated remarkable understanding capabilities since their March 2023 debut, the execution layer has remained largely manual.
This development suggests several trends:
From Advisory to Operational AI: Tools are evolving from suggesting actions to performing them within controlled parameters
Security-First AI Design: As AI systems gain more system access, security considerations must move from afterthought to foundational design principle
Contextual Intelligence: The combination of screen understanding with executable actions creates genuinely intelligent assistance rather than just reactive responses
Local Processing Renaissance: The option for fully local operation (6-second processing with Qwen-2.5) addresses both privacy concerns and latency issues
The Future of AI Desktop Integration
What makes OmniGlass particularly compelling is its plugin architecture. By implementing MCP over stdio with strict sandboxing, it creates a platform where developers can build specialized capabilities without compromising security. This could lead to an ecosystem of secure, executable AI plugins for everything from database management to system administration.
The tool also highlights an important tension in AI development: as capabilities increase, so do risks. OmniGlass's approach—giving AI tools just enough access to be useful but not enough to be dangerous—may become a model for future AI system design.
Currently in early development ("I just shipped our second working plugin"), OmniGlass is open source and available on GitHub, inviting community participation in shaping what secure, executable AI assistance should look like.
As AI continues to integrate deeper into our workflows, tools like OmniGlass don't just make us more efficient—they redefine the relationship between human intention and machine execution, all while addressing the security concerns that will determine whether these integrations succeed or fail.