product preview
30 articles about product preview in AI news
Claude Mythos Preview Doubles METR Time Horizon at 80% Success
A Claude Mythos Preview snapshot achieves twice the METR time horizon of the next-best model at an 80% success rate, according to Anthropic. Absolute numbers were not disclosed.
Codex 'Chronicle' Research Preview Adds Memory for Daily Developer Context
A research preview of 'Chronicle' for Codex has been released. It enables the AI coding assistant to accumulate memories from a developer's daily workflow to improve context.
Shopify Engineering Teases 'Autoresearch' Beyond Model Training in 2026 Preview
Shopify Engineering has previewed a 2026 perspective suggesting that 'autoresearch' (automated research processes) will have applications beyond training AI models. This signals a broader operational automation strategy for the e-commerce giant.
How to Manage Multiple Claude Code Sessions with Harness and Preview
Two actionable tools to solve the core productivity bottlenecks when running multiple Claude Code agents: session management and review speed.
Anthropic Delays Mythos Preview, Offers Early Access to Defenders
Anthropic is delaying the general availability of its 'Mythos Preview' model. Instead, it is granting early, controlled access to security-focused 'defenders' to finalize safety measures.
Agentic AI Systems Failing in Production: New Research Reveals Benchmark Gaps
New research reveals that agentic AI systems are failing in production environments in ways not captured by current benchmarks, including alignment drift and context loss during handoffs between agents.
Qwen 3.6 Plus Preview Launches on OpenRouter with Free 1M Token Context, Disrupting API Pricing
Alibaba's Qwen team has released a preview of Qwen 3.6 Plus on OpenRouter with a 1 million token context window, charging $0 for both input and output tokens. This directly undercuts paid long-context offerings from Anthropic and OpenAI.
Claude Code's Hidden Token Cap: How to Work Around It and Stay Productive
Anthropic is silently reducing the effective context window via token inflation. Here's how Claude Code users can adapt their workflows to maintain productivity.
Anthropic Launches Claude Code Auto Mode Preview, a Safety Classifier to Prevent Mass File Deletions
Anthropic is previewing 'auto mode' for Claude Code, a classifier that autonomously executes safe actions while blocking risky ones like mass deletions. The feature, rolling out to Team, Enterprise, and API users, follows high-profile incidents like a recent AWS outage linked to an AI tool.
OpenAI Renames Product Org to 'AGI Deployment', Sam Altman Teases 'Very Strong' Upcoming Model 'Spud'
OpenAI has renamed its product organization to 'AGI Deployment' and CEO Sam Altman has teased a 'very strong' upcoming model called 'Spud' that could 'accelerate the economy.' The moves signal a confident, aggressive push toward artificial general intelligence.
How to Prevent Claude Code from Deleting Production Data: The Critical --dry-run Flag
A critical bug report shows Claude Code can delete production databases. Use `--dry-run` and explicit path exclusions in CLAUDE.md immediately.
Beyond the Racket: How AI-Powered Exclusive Previews Are Redefining Luxury Event Marketing
BMW's use of a closed-room, AI-enhanced model preview at the BNP Paribas Open demonstrates a new paradigm for luxury marketing. This approach creates scarcity, personalizes the high-touch experience, and generates ultra-qualified leads by blending physical exclusivity with data-driven engagement.
NVIDIA GTC 2025 Preview: Leaked Highlights Signal Major AI Hardware and Software Breakthroughs
Early leaks from NVIDIA's upcoming GTC 2025 conference reveal significant advancements in AI hardware, software frameworks, and robotics. The preview suggests major performance leaps and new capabilities that could reshape AI development across industries.
Google's Gemini 3.0 Pro Goes GA, 3.1 Pro Preview Teased in Major AI Push
Google is reportedly launching Gemini 3.0 Pro into general availability today while offering a preview of the next-generation Gemini 3.1 Pro. This dual announcement signals Google's aggressive roadmap to compete in the advanced AI assistant space.
Claude Code's New 'Auto Mode' Preview: What's Allowed, What's Blocked, and How to Get Access
Anthropic's new safety classifier for Claude Code autonomously executes safe actions while blocking risky ones. Here's how it works and how to use it.
3 MCP Patterns That Make Your Claude Code Agent Production-Ready
Move beyond basic MCP servers with capability manifests, guardrails, and checkpointing to build reliable Claude Code agents that can run autonomously.
Generate Production-Ready Business Websites in Minutes with /letsgo
Use the new /letsgo command in Claude Code to instantly scaffold a complete, customizable static site for restaurants, salons, gyms, or professional services.
Claude Mythos Helped Firefox Fix More Bugs in April Than 15 Prior Months Combined
Firefox fixed more security bugs in April 2026 than in the prior 15 months combined, using Anthropic's Claude Mythos Preview model for triage and patching.
Tencent's HY3 AI Model Has 295B Params, Led by Ex-OpenAI Researcher
Tencent unveiled its HY3 preview model, its most powerful yet, with 295 billion parameters. It's already deployed in the consumer app Yuanbao and the coding assistant CodeBuddy.
Anthropic Opus 4.7: 87.6% SWE-Bench, Constrained Cyber Capabilities
Anthropic released Claude Opus 4.7 on April 16, 2026, achieving 87.6% on SWE-Bench Verified and 64.3% on SWE-Bench Pro, ahead of GPT-5.4 and Gemini 3.1 Pro. The company also confirmed it deliberately constrained cybersecurity capabilities in Opus 4.7, with the more powerful Mythos Preview model (83.1% on CyberGym) restricted to select partners.
NSA Uses Anthropic's Claude Mythos Despite 'Supply Chain Risk' Label
The National Security Agency is using Anthropic's Claude Mythos Preview for its capabilities, despite having labeled Anthropic itself as a potential supply chain risk. This highlights the tension between security concerns and the operational need for cutting-edge AI.
Anthropic's Claude Mythos Scores 83.1% on CyberGym, Restricted to 12 Partners
Anthropic announced Project Glasswing, which deploys Claude Mythos Preview to autonomously discover critical software vulnerabilities. The model scores 83.1% on CyberGym and is restricted to 12 launch partners due to dual-use risks, with a 90-day disclosure window.
Anthropic Launches Project Glasswing for Critical Software Security
Anthropic announced Project Glasswing, an urgent initiative to secure critical software, powered by its new frontier model Claude Mythos Preview, which it claims can find vulnerabilities better than all but the most skilled humans.
GPT-Image-2 Appears in ChatGPT App Images Tab, Signaling OpenAI Visual AI Push
A user spotted 'GPT-Image-2' listed in the images tab of the ChatGPT mobile app. This indicates OpenAI is testing a potential successor to its DALL-E image generation models directly within its flagship product.
Claude Code's 'Safety Layer' Leak Reveals Why Your CLAUDE.md Isn't Enough
Claude Code's leaked safety system is just a prompt. For production agents, you need runtime enforcement, not just polite requests.
OpenAI Expands Codex Plugin Ecosystem to Slack, Figma, Notion, and Gmail
OpenAI has rolled out new plugins connecting its Codex model to productivity tools like Slack, Figma, Notion, and Gmail, moving code generation beyond the IDE into broader workflows.
Figure AI CEO Brett Adcock Teases 'Hark': A 'Bespoke Natural Language' Interface for AI
Figure AI CEO Brett Adcock previewed 'Hark,' described as a new natural language interface for AI. The brief teaser suggests a move toward more intuitive, conversational control systems, potentially for robotics.
Claude Code's New Auto-Mode: How to Configure It for Maximum Autonomy
Anthropic has expanded Claude Code's auto-mode preview, letting it execute safe actions without manual approval. Here's how to configure it for your workflow.
How to Delegate UI Verification and PR Creation to Claude Code
Stop manually checking UI changes and writing PRs. Use Claude Code's preview feature and custom skills to automate verification and delegation.
Switchboard's Grid View Gives You Bird's-Eye Control of Claude Code Sessions
Switchboard v0.0.16 adds a grid view that shows all your Claude Code sessions at once with live terminal previews, status indicators, and quick navigation.