OpenAI has released a major update for its developer tool Codex, transforming it from a code-completion engine into a persistent, agentic coding assistant capable of directly controlling a macOS desktop, scheduling long-running tasks, and integrating with over 90 development tools. The update, which includes a new "background computer use" feature, image generation, and autonomous operation, positions Codex as a direct competitor to Anthropic's recently launched Claude Code.
What's New: From Code Helper to Desktop Agent
The core of the update is a shift in capability from a reactive tool to an active agent. Codex is no longer confined to an IDE or terminal.
- Background Computer Use: Codex can now see the screen, move a cursor, click, and type to operate any application on a macOS computer. This allows it to interact with software that lacks an API, such as legacy desktop apps or certain design tools. Multiple Codex agents can run in parallel without interfering with the user's primary work.
- Long-Run Autonomy & Scheduling: Codex can schedule tasks for itself and autonomously continue working on projects over "days or weeks." It can wake up to process new pull requests, monitor Slack/Gmail/Notion channels, or track tasks, functioning as a persistent background team member.
- Integrated Browser & Image Generation: A built-in browser allows users to comment directly on web pages to instruct the agent, initially targeting front-end and game development workflows. Codex now integrates gpt-image-1.5 for generating product mockups, game graphics, and UI designs within the same context as code and screenshots.
- Expanded Plugin Ecosystem: OpenAI has shipped over 90 new plugins, bundling skills, app integrations, and MCP (Model Context Protocol) servers. New integrations include Atlassian Rovo, JIRA, GitLab, the Microsoft Suite, and Slack, covering the full software development lifecycle.
Technical Details & Workflow Integration
The "background computer use" feature is currently exclusive to macOS. It operates by granting Codex system-level accessibility permissions to observe and generate UI events, similar to how automation scripts work but driven by an LLM's understanding of the screen state.
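The pattern described here is a standard observe → decide → act loop: capture the screen state, ask the model for the next UI action, dispatch it, repeat. The sketch below illustrates that loop with a stubbed screen and a stubbed policy; the class and function names are illustrative only, and a real agent would replace `decide` with an LLM call on a screenshot and `dispatch` with macOS accessibility events.

```python
from dataclasses import dataclass, field

@dataclass
class FakeScreen:
    """Stand-in for macOS accessibility observations and UI events."""
    state: str = "login_form"
    log: list = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would return a screenshot or accessibility tree here.
        return self.state

    def dispatch(self, action: str) -> None:
        # A real agent would synthesize a click/keystroke via system APIs.
        self.log.append(action)
        if action == "click:submit":
            self.state = "dashboard"

def decide(state: str) -> str:
    """Stub policy; in a real agent this is an LLM call on the screen state."""
    policy = {"login_form": "click:submit", "dashboard": "done"}
    return policy[state]

def run_agent(screen: FakeScreen, max_steps: int = 10) -> list:
    """Loop until the policy says the task is done or the step budget runs out."""
    for _ in range(max_steps):
        action = decide(screen.observe())
        if action == "done":
            break
        screen.dispatch(action)
    return screen.log
```

The `max_steps` budget matters in practice: an agent that misreads the screen can otherwise loop forever clicking the same element.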

For development workflows, Codex now supports:
- Editing GitHub review comments directly.
- Running multiple terminal sessions concurrently.
- Alpha-stage SSH connectivity to remote development boxes.
- Reusing conversation threads to maintain context across sessions.
The combination of screen control, browser integration, and image generation creates a closed loop for front-end development: an engineer can ask Codex to "implement the login page from this Figma mockup," and the agent can generate a base image, write the corresponding React code, and then test the rendered UI in a browser—all without the developer manually switching contexts.
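That closed loop can be sketched as a three-stage pipeline with a retry guard. All three stage functions below are hypothetical stubs standing in for image generation, code generation, and a headless-browser render check; none of these names come from OpenAI's product.

```python
def generate_mockup(prompt: str) -> str:
    """Stub for an image-generation call (e.g. a product mockup)."""
    return f"mockup({prompt})"

def write_component(mockup: str) -> str:
    """Stub for code generation conditioned on the mockup."""
    return f"<LoginPage src='{mockup}' />"

def render_check(component: str) -> bool:
    """Stub for rendering the UI in a browser and verifying it."""
    return "LoginPage" in component

def closed_loop(prompt: str, retries: int = 2) -> str:
    """Generate, implement, and verify; retry if the render check fails."""
    for _ in range(retries + 1):
        component = write_component(generate_mockup(prompt))
        if render_check(component):
            return component
    raise RuntimeError("render check failed after retries")
```

The verification stage is what closes the loop: without it, the agent is just chaining generators with no feedback on whether the output actually works.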
How It Compares: The Agentic Coding Landscape
This update is a direct competitive response to the landscape of AI coding assistants moving beyond autocomplete. The primary competitor is Anthropic's Claude Code, which also emphasizes autonomous task execution and deep IDE integration.
| Capability | OpenAI Codex | Claude Code | GitHub Copilot |
| --- | --- | --- | --- |
| Desktop control | Yes (macOS) | Limited (via IDE) | No |
| Autonomous operation | Days/weeks, self-scheduling | Session-based, task-oriented | No (inline only) |
| Multi-modal | Yes (gpt-image-1.5) | No | No |
| Plugin/integration count | 90+ new plugins | Growing ecosystem | Extensive via GitHub Marketplace |
| Core model | Presumed GPT-4o/4.5 series | Claude 3.5 Sonnet | OpenAI & Microsoft models |

Codex's distinct advantage is its persistent agency and direct screen manipulation, which let it operate in environments without pre-built APIs. Claude Code currently focuses on deep, reasoning-based code changes within a controlled IDE or CLI environment.
What to Watch: Limitations and Strategic Implications
The initial macOS-only limitation for screen control is significant, excluding the large Windows developer base. Security and privacy concerns around an "always-on" agent with screen access will require careful enterprise rollout. Performance and cost of running persistent agents for weeks on complex tasks are unstated.
Strategically, this moves Codex from being a component (the model behind GitHub Copilot) to a standalone, multi-modal agentic platform. It leverages OpenAI's strengths in vision (GPT-4V) and tool use to create a unified assistant that can navigate the messy, multi-app reality of software development. The expansion into image generation also subtly competes with services like Midjourney and DALL-E APIs by embedding it into a developer-centric workflow.
Agentic.news Analysis
This update is a pivotal evolution in OpenAI's product strategy, reflecting two major industry trends we've been tracking. First, it confirms the shift from tools to agents—a trend we detailed in our analysis of Devin's launch and the rise of AI software engineers. Codex is no longer just suggesting the next line; it's taking over entire sub-processes. Second, it represents a vertical integration play within OpenAI's own stack. Instead of just providing the foundational model (GPT) for others to build coding assistants, OpenAI is now building the flagship agent product itself, potentially creating tension with partners like GitHub/Microsoft who build Copilot on top of OpenAI models.
The focus on macOS first is telling. It targets the high-value segment of professional developers, particularly in mobile (iOS) and startup ecosystems, where macOS dominates. This follows a pattern of premium, vertical-first launches before broader rollout. The direct competition with Anthropic's Claude Code is now explicit and head-to-head. Both companies are betting that the future of coding assistance lies in persistent, reasoning agents that manage context across days, not milliseconds. The integration of over 90 plugins in one move shows OpenAI is aggressively pursuing ecosystem lock-in, attempting to make Codex the central orchestration layer for the entire dev toolchain.
Frequently Asked Questions
Can OpenAI Codex see everything on my screen?
Yes, the "background computer use" feature requires granting Codex accessibility permissions on macOS, allowing it to observe the screen content to navigate and control applications. This is necessary for it to click buttons, type in fields, and understand UI states in apps without APIs.
How does Codex's autonomy work? Can it code for days without me?
According to OpenAI, Codex can schedule itself for future tasks and wake up autonomously to continue work on long-term projects, potentially operating across days or weeks. For example, you could ask it to "monitor this GitHub repo, review every new pull request, and suggest changes," and it would persist as a background process performing that duty.
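The "schedule itself and wake up" behavior amounts to a priority queue of wake-up times in which the agent re-enqueues its own next run. This minimal sketch uses abstract ticks and a stubbed `poll_repo`; a real system would use wall-clock timers, durable storage, and actual repository polling.

```python
import heapq

def poll_repo(tick: int) -> list:
    """Stub: pretend a single new pull request appears at tick 5."""
    return ["PR#1"] if tick == 5 else []

def run_scheduler(until: int = 10, interval: int = 5) -> list:
    """Process wake-ups in time order; each poll reschedules itself."""
    queue = [(0, "poll")]  # (wake_time, task)
    reviewed = []
    while queue:
        tick, task = heapq.heappop(queue)
        if tick > until:
            break
        if task == "poll":
            reviewed.extend(poll_repo(tick))
            # The agent schedules its own next wake-up.
            heapq.heappush(queue, (tick + interval, "poll"))
    return reviewed
```

The key design point is that nothing outside the loop drives it: persistence comes from the agent re-enqueueing itself, which is also why an `until` bound (or an external kill switch) is needed to stop it.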
Is this the same Codex model from 2021?
No. While the product retains the "Codex" name, its capabilities have been radically expanded. The original Codex (powering GitHub Copilot) was a fine-tuned GPT-3 model for code completion. The current system is almost certainly built on a much more advanced model like GPT-4o or GPT-4.5, with integrated vision, tool-use, and long-context reasoning capabilities enabling its new agentic behavior.
What are the main differences between Codex and Claude Code?
The primary differences are in the approach to agency. Codex employs direct screen control (macOS) and can run for weeks autonomously, acting like a persistent background worker. Claude Code, while highly capable at complex code reasoning and edits, operates more within a defined IDE/CLI session. Codex also integrates image generation directly, while Claude Code remains text/code-focused.