update

30 articles about update in AI news

239-Paper Survey Maps How AI Agents Self-Improve via Scaffold Updates

A survey of 239 papers shows 68% of AI agent self-improvement methods focus on scaffold updates rather than model retraining, raising evaluation quality concerns.

Jul 19, 2026 · 85% relevant

Codex Update Cuts GUI Workflow Latency 42%

Codex app update cuts GUI workflow latency 42%, enabling near-human-speed interface operation for autonomous app building and debugging.

May 1, 2026 · 84% relevant

OpenAI Codex Update Adds macOS Agent, Browser, Memory; 3M Weekly Users

OpenAI released a major Codex update featuring background macOS automation, an in-app browser, persistent memory, and 90+ plugins. With 3M weekly users and nearly half of usage now non-coding, Codex is being repositioned as a general work agent.

Apr 16, 2026 · 100% relevant

Anthropic Launches @ClaudeDevs X Account for API Developer Updates

Anthropic has launched @ClaudeDevs on X, a new channel for developers to receive direct updates on API releases, changelogs, and community news. This formalizes a direct line of communication for its growing developer ecosystem.

Apr 16, 2026 · 75% relevant

Kimi 2.6 Code Model Teased in Leaked Image, Suggesting Moonshot AI Update

A screenshot circulating online appears to show a 'Kimi 2.6' code model interface, suggesting Moonshot AI is preparing an update to its Kimi Chat platform focused on coding tasks.

Apr 13, 2026 · 85% relevant

Rumor: Anthropic's Next Claude Update May Include AI App Builder

A rumor on X claims the next Claude update will include an app builder, allowing users to create applications through conversational AI. This could significantly lower the barrier to app development.

Apr 13, 2026 · 87% relevant

How to Decode Anthropic's Press Releases for Better Claude Code Updates

Claude Code users should learn to filter Anthropic's technical announcements for actionable updates on model capabilities, context windows, and API pricing that affect daily development.

Apr 8, 2026 · 97% relevant

Google's Cookie Policy Update and the Challenge of AI-Powered Personalization

Google has updated its user-facing cookie and data consent interface, emphasizing its use of data for personalization and ad measurement. This reflects the ongoing tension between data-driven AI services and user privacy, a critical issue for luxury retail's digital transformation.

Apr 2, 2026 · 82% relevant

TensorFlow Playground Interactive Demo Updated for 2026, Enabling Real-Time Neural Network Visualization

The TensorFlow Playground, an educational web tool for visualizing neural networks, has been updated. Users can now adjust hyperparameters and watch the model train, with decision boundaries visualized in real time.

Mar 31, 2026 · 85% relevant

New Research Diagnoses LLMs' Struggle with Multiple Knowledge Updates in Context

A new arXiv paper reveals a persistent bias in LLMs when facts are updated multiple times within a long context. Models increasingly favor the earliest version, failing to track the latest state—a critical flaw for dynamic knowledge tasks.

Mar 16, 2026 · 78% relevant

Claude 3.5 Sonnet's Latest Update Redefines AI Agent Capabilities for Real-World Tasks

Anthropic's Claude 3.5 Sonnet 4.6 update demonstrates remarkable improvements in agentic workflows and computer interaction, positioning it as a leading model for practical AI applications. Early adopters report unprecedented efficiency in real-world task automation.

Feb 17, 2026 · 85% relevant

Almanac: Open-Source Wiki Auto-Updates From Claude Code Chats

Almanac auto-generates a Markdown wiki from Claude Code chats and repo history, addressing the agent context gap. The tool is free, open source, and macOS-only.

May 14, 2026 · 90% relevant

Newline's 'Skills' Update Shows Where MCP Servers Are Headed

The Newline MCP server now supports modular 'Skills,' allowing developers to customize their Claude Code environment with specific, installable capabilities for more targeted workflows.

Apr 13, 2026 · 100% relevant

Alibaba's Qwen Team Teases Qwen 3.6 Model, Signaling Major Open-Source LLM Update

Alibaba's Qwen team has teased the imminent release of Qwen 3.6, the next major version of its open-source large language model series. This follows the release of Qwen 2.5 in late 2024 and signals continued aggressive competition in the open-weight model space.

Mar 30, 2026 · 85% relevant

AI Shopping Update: OpenAI Focuses on Discovery, Meta Launches Checkout & Shopify Offers Catalog Integration

A trio of major AI shopping announcements: OpenAI shifts focus to product discovery, Meta launches in-app checkout for AI shopping ads, and Shopify opens its catalog integration to any brand. This signals a rapid move from conversational AI to transactional agentic systems.

Mar 25, 2026 · 95% relevant

Memento-Skills Agent System Achieves 116.2% Relative Improvement on Humanity's Last Exam Without LLM Updates

Memento-Skills is a generalist agent system that autonomously constructs and adapts task-specific agents through experience. It enables continual learning without updating LLM parameters, achieving 26.2% and 116.2% relative improvements on GAIA and Humanity's Last Exam benchmarks.

Mar 22, 2026 · 85% relevant

Smartsheet Deploys Remote MCP Server on AWS for Agent Tool Access

Smartsheet deployed a remote MCP server on AWS ECS Fargate, enabling Claude agents to query and update sheets via standardized tool calls without a local client.

Jul 17, 2026 · 100% relevant

Aletheia: An Open-Source Uncertainty Agent That Earns Its Confidence in

Aletheia is an open-source uncertainty loop agent for Claude Code that uses belief-update over guess-and-summarize, delivering verdicts with explicit confidence and residual unknowns.

Jul 5, 2026 · 80% relevant

Agent Publish Primitives: Why Default-Private MCP Tools Beat Raw CDN URLs

Thryvate argues AI agents need five design properties for safe web publishing: default-private, revocable, expiring, per-viewer analytics, and idempotent updates. MCP tools enforce policy while the model handles intent.

Jun 27, 2026 · 75% relevant

GitHub Actions Now Runs Steps in Parallel — Here's How to Use It with

GitHub Actions' new `background`, `wait`, `cancel`, and `parallel` keywords let you run steps concurrently. Update your CI/CD workflows to cut job times.

Jun 25, 2026 · 70% relevant

OpenAI GPT-5.5-Cyber Beats Anthropic Mythos on Security Benchmarks

OpenAI's GPT-5.5-Cyber beats Anthropic's Mythos on security benchmarks. Updated Codex plugin auto-patches after scanning 30M commits.

Jun 23, 2026 · 100% relevant

AMD's Lemonade v10.8 Adds MCP Support, Letting Claude Desktop and Cursor Route Tasks to Local AMD GPUs

AMD-backed Lemonade v10.8, released June 17, now exposes a Model Context Protocol server, letting Claude Desktop, Cursor, and GitHub Copilot route inference tasks to local AMD Ryzen AI NPUs, Radeon GPUs, or plain CPUs — no cloud API required. The update also adds Moonshine speech-to-text, expanded R

Jun 17, 2026 · 70% relevant

ReCast: A New RL Technique That Fixes Sparse-Hit Learning in Generative

Researchers propose ReCast, a 'repair-then-contrast' framework that fixes a fundamental flaw in group-based RL for generative recommendation: many sampled groups never become learnable. ReCast restores learnability for zero-reward groups and replaces normalization with contrastive updates, achieving up to 36.6% improvement in Pass@1 and 16.6x faster actor updates.

Apr 27, 2026 · 84% relevant

Fanuc robot arms combine AI and computer vision to adopt flexible workflows

Fanuc has updated its robot arms with AI and computer vision, enabling them to handle flexible workflows rather than fixed, repetitive tasks. This shift allows for greater adaptability in manufacturing environments.

Apr 20, 2026 · 74% relevant

Claude Code Security Alert: Patch Now, Stop Using Authentication Helpers

A critical security leak reveals three command injection vulnerabilities in Claude Code. Users must update and stop using authentication helpers to prevent credential theft and supply chain attacks.

Apr 20, 2026 · 100% relevant

OpenAI Launches GPT-Rosalind for Drug Discovery, GPT-5.4-Cyber for Security

OpenAI launched GPT-Rosalind, a life sciences model performing above the 95th percentile of human experts on novel biological data, and GPT-5.4-Cyber, a cybersecurity variant. These releases, alongside a major Agents SDK update, signal a pivot from general AI to specialized, high-stakes enterprise domains.

Apr 20, 2026 · 90% relevant

Opus 4.7's Tokenizer Change: How to Measure Your Real Claude Code Costs

Claude Opus 4.7's updated tokenizer means the same input can cost 40%+ more than 4.6. Use the Claude Token Counter to measure real costs before upgrading.

Apr 20, 2026 · 100% relevant

Claude AI Adds Meal Planning Feature, Aims at Nutritionist Market

Anthropic's Claude AI assistant has been updated to create detailed weekly meal plans tailored to user-defined nutrition targets. This feature expansion moves Claude into the health and wellness productivity space, competing with specialized apps.

Apr 19, 2026 · 85% relevant

GPT-5.5 Limited Rollout Begins, Frontend Improvements Noted

OpenAI has started a limited rollout of GPT-5.5 to select users, with early reports highlighting significant frontend quality improvements. This suggests an incremental update focused on user experience rather than core model capabilities.

Apr 19, 2026 · 85% relevant

Autogenesis Protocol Enables Self-Evolving AI Agents Without Retraining

A new paper introduces Autogenesis, a self-evolving agent protocol. Agents can assess their own shortcomings, propose and test improvements, and update their operational framework in a continuous loop.

Apr 17, 2026 · 89% relevant
