OpenAI has released a major update for its developer tool Codex, transforming it from a code-completion engine into a persistent, agentic coding assistant capable of directly controlling a macOS desktop, scheduling long-running tasks, and integrating with over 90 development tools. The update, which includes a new "background computer use" feature, image generation, and autonomous operation, positions Codex as a direct competitor to Anthropic's recently launched Claude Code.
What's New: From Code Helper to Desktop Agent
The core of the update is a shift in capability from a reactive tool to an active agent. Codex is no longer confined to an IDE or terminal.
- Background Computer Use: Codex can now see the screen, move a cursor, click, and type to operate any application on a macOS computer. This allows it to interact with software that lacks an API, such as legacy desktop apps or certain design tools. Multiple Codex agents can run in parallel without interfering with the user's primary work.
- Long-Run Autonomy & Scheduling: Codex can schedule tasks for itself and autonomously continue working on projects over "days or weeks." It can wake up to process new pull requests, monitor Slack/Gmail/Notion channels, or track tasks, functioning as a persistent background team member.
- Integrated Browser & Image Generation: A built-in browser allows users to comment directly on web pages to instruct the agent, initially targeting front-end and game development workflows. Codex now integrates gpt-image-1.5 for generating product mockups, game graphics, and UI designs within the same context as code and screenshots.
- Expanded Plugin Ecosystem: OpenAI has shipped over 90 new plugins, bundling skills, app integrations, and MCP (Model Context Protocol) servers. New integrations include Atlassian Rovo, JIRA, GitLab, the Microsoft Suite, and Slack, covering the full software development lifecycle.
Technical Details & Workflow Integration
The "background computer use" feature is currently exclusive to macOS. It operates by granting Codex system-level accessibility permissions to observe and generate UI events, similar to how automation scripts work but driven by an LLM's understanding of the screen state.
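The pattern described here is a standard observe → decide → act loop: capture the screen state, ask the model for the next UI action, dispatch it, repeat. The sketch below illustrates that loop with a stubbed screen and a stubbed policy; the class and function names are illustrative only, and a real agent would replace `decide` with an LLM call on a screenshot and `dispatch` with macOS accessibility events.

```python
from dataclasses import dataclass, field

@dataclass
class FakeScreen:
    """Stand-in for macOS accessibility observations and UI events."""
    state: str = "login_form"
    log: list = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would return a screenshot or accessibility tree here.
        return self.state

    def dispatch(self, action: str) -> None:
        # A real agent would synthesize a click/keystroke via system APIs.
        self.log.append(action)
        if action == "click:submit":
            self.state = "dashboard"

def decide(state: str) -> str:
    """Stub policy; in a real agent this is an LLM call on the screen state."""
    policy = {"login_form": "click:submit", "dashboard": "done"}
    return policy[state]

def run_agent(screen: FakeScreen, max_steps: int = 10) -> list:
    """Loop until the policy says the task is done or the step budget runs out."""
    for _ in range(max_steps):
        action = decide(screen.observe())
        if action == "done":
            break
        screen.dispatch(action)
    return screen.log
```

The `max_steps` budget matters in practice: an agent that misreads the screen can otherwise loop forever clicking the same element.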

For development workflows, Codex now supports:
- Editing GitHub review comments directly.
- Running multiple terminal sessions concurrently.
- Alpha-stage SSH connectivity to remote development boxes.
- Reusing conversation threads to maintain context across sessions.
The combination of screen control, browser integration, and image generation creates a closed loop for front-end development: an engineer can ask Codex to "implement the login page from this Figma mockup," and the agent can generate a base image, write the corresponding React code, and then test the rendered UI in a browser—all without the developer manually switching contexts.
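That closed loop can be sketched as a three-stage pipeline with a retry guard. All three stage functions below are hypothetical stubs standing in for image generation, code generation, and a headless-browser render check; none of these names come from OpenAI's product.

```python
def generate_mockup(prompt: str) -> str:
    """Stub for an image-generation call (e.g. a product mockup)."""
    return f"mockup({prompt})"

def write_component(mockup: str) -> str:
    """Stub for code generation conditioned on the mockup."""
    return f"<LoginPage src='{mockup}' />"

def render_check(component: str) -> bool:
    """Stub for rendering the UI in a browser and verifying it."""
    return "LoginPage" in component

def closed_loop(prompt: str, retries: int = 2) -> str:
    """Generate, implement, and verify; retry if the render check fails."""
    for _ in range(retries + 1):
        component = write_component(generate_mockup(prompt))
        if render_check(component):
            return component
    raise RuntimeError("render check failed after retries")
```

The verification stage is what closes the loop: without it, the agent is just chaining generators with no feedback on whether the output actually works.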
How It Compares: The Agentic Coding Landscape
This update is a direct competitive response to the landscape of AI coding assistants moving beyond autocomplete. The primary competitor is Anthropic's Claude Code, which also emphasizes autonomous task execution and deep IDE integration.
| Capability | OpenAI Codex | Claude Code | GitHub Copilot |
| --- | --- | --- | --- |
| Desktop control | Yes (macOS) | Limited (via IDE) | No |
| Autonomous operation | Days/weeks, self-scheduling | Session-based, task-oriented | No (inline only) |
| Multi-modal | Yes (gpt-image-1.5) | No | No |
| Plugin/integration count | 90+ new plugins | Growing ecosystem | Extensive via GitHub Marketplace |
| Core model | Presumed GPT-4o/4.5 series | Claude 3.5 Sonnet | OpenAI & Microsoft models |

Codex's distinct advantage is its persistent agency and direct screen manipulation, which let it operate in environments without pre-built APIs. Claude Code currently focuses on deep, reasoning-based code changes within a controlled IDE or CLI environment.
What to Watch: Limitations and Strategic Implications
The initial macOS-only limitation for screen control is significant, excluding the large Windows developer base. Security and privacy concerns around an "always-on" agent with screen access will require careful enterprise rollout. Performance and cost of running persistent agents for weeks on complex tasks are unstated.
Strategically, this moves Codex from being a component (the model behind GitHub Copilot) to a standalone, multi-modal agentic platform. It leverages OpenAI's strengths in vision (GPT-4V) and tool use to create a unified assistant that can navigate the messy, multi-app reality of software development. The expansion into image generation also subtly competes with services like Midjourney and DALL-E APIs by embedding it into a developer-centric workflow.
Agentic.news Analysis
This update is a pivotal evolution in OpenAI's product strategy, reflecting two major industry trends we've been tracking. First, it confirms the shift from tools to agents—a trend we detailed in our analysis of Devin's launch and the rise of AI software engineers. Codex is no longer just suggesting the next line; it's taking over entire sub-processes. Second, it represents a vertical integration play within OpenAI's own stack. Instead of just providing the foundational model (GPT) for others to build coding assistants, OpenAI is now building the flagship agent product itself, potentially creating tension with partners like GitHub/Microsoft who build Copilot on top of OpenAI models.
The focus on macOS first is telling. It targets the high-value segment of professional developers, particularly in mobile (iOS) and startup ecosystems, where macOS dominates. This follows a pattern of premium, vertical-first launches before broader rollout. The direct competition with Anthropic's Claude Code is now explicit and head-to-head. Both companies are betting that the future of coding assistance lies in persistent, reasoning agents that manage context across days, not milliseconds. The integration of over 90 plugins in one move shows OpenAI is aggressively pursuing ecosystem lock-in, attempting to make Codex the central orchestration layer for the entire dev toolchain.
Frequently Asked Questions
Can OpenAI Codex see everything on my screen?
Yes, the "background computer use" feature requires granting Codex accessibility permissions on macOS, allowing it to observe the screen content to navigate and control applications. This is necessary for it to click buttons, type in fields, and understand UI states in apps without APIs.
How does Codex's autonomy work? Can it code for days without me?
According to OpenAI, Codex can schedule itself for future tasks and wake up autonomously to continue work on long-term projects, potentially operating across days or weeks. For example, you could ask it to "monitor this GitHub repo, review every new pull request, and suggest changes," and it would persist as a background process performing that duty.
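The "schedule itself and wake up" behavior amounts to a priority queue of wake-up times in which the agent re-enqueues its own next run. This minimal sketch uses abstract ticks and a stubbed `poll_repo`; a real system would use wall-clock timers, durable storage, and actual repository polling.

```python
import heapq

def poll_repo(tick: int) -> list:
    """Stub: pretend a single new pull request appears at tick 5."""
    return ["PR#1"] if tick == 5 else []

def run_scheduler(until: int = 10, interval: int = 5) -> list:
    """Process wake-ups in time order; each poll reschedules itself."""
    queue = [(0, "poll")]  # (wake_time, task)
    reviewed = []
    while queue:
        tick, task = heapq.heappop(queue)
        if tick > until:
            break
        if task == "poll":
            reviewed.extend(poll_repo(tick))
            # The agent schedules its own next wake-up.
            heapq.heappush(queue, (tick + interval, "poll"))
    return reviewed
```

The key design point is that nothing outside the loop drives it: persistence comes from the agent re-enqueueing itself, which is also why an `until` bound (or an external kill switch) is needed to stop it.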
Is this the same Codex model from 2021?
No. While the product retains the "Codex" name, its capabilities have been radically expanded. The original Codex (powering GitHub Copilot) was a fine-tuned GPT-3 model for code completion. The current system is almost certainly built on a much more advanced model like GPT-4o or GPT-4.5, with integrated vision, tool-use, and long-context reasoning capabilities enabling its new agentic behavior.
What are the main differences between Codex and Claude Code?
The primary differences are in the approach to agency. Codex employs direct screen control (macOS) and can run for weeks autonomously, acting like a persistent background worker. Claude Code, while highly capable at complex code reasoning and edits, operates more within a defined IDE/CLI session. Codex also integrates image generation directly, while Claude Code remains text/code-focused.