vulnerability discovery

30 articles about vulnerability discovery in AI news

Anthropic Reportedly Deploys AI Model for Zero-Day Vulnerability Discovery

Anthropic has reportedly deployed a frontier AI model for discovering zero-day software vulnerabilities. The model is claimed to have found flaws in code audited by humans for decades.

Apr 9, 2026 · 97% relevant

Google Open-Sources OSV-Scanner: AI-Powered Dependency Vulnerability Scanner

Google has open-sourced OSV-Scanner, a vulnerability scanner that maps project dependencies against the OSV database across 11+ ecosystems. It features guided remediation and call analysis to reduce false positives.

Apr 22, 2026 · 89% relevant

SciRisk-Bench Tests 10 Risk Dimensions Across 7 Science Disciplines

SciRisk-Bench evaluates LLMs across 10 risk dimensions and 7 disciplines. Safety omission and lab safety show the highest vulnerability.

Jun 18, 2026 · 68% relevant

New Research Proposes DITaR Method to Defend Sequential Recommenders

Researchers propose DITaR, a dual-view method to detect and rectify harmful fake orders embedded in user sequences. It aims to protect recommendation integrity while preserving useful data, showing superior performance in experiments. This addresses a critical vulnerability in e-commerce and retail AI systems.

Apr 13, 2026 · 86% relevant

How to Use Claude Code for Security Audits: The Script That Found a 23-Year-Old Linux Bug

Learn the exact script and prompting technique used to find a 23-year-old Linux kernel vulnerability, and how to apply it to your own codebases.

Apr 3, 2026 · 100% relevant

Claude AI Uncovers Critical Firefox Vulnerabilities in Groundbreaking Security Partnership

Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Firefox during a two-week audit, including 14 high-severity flaws. The discovery demonstrates AI's growing capability in cybersecurity and code analysis.

Mar 6, 2026 · 75% relevant

MeiGen Revolutionizes AI Art Creation with Automated Prompt Curation

MeiGen, a new open-source tool, automatically scrapes and curates trending AI image prompts from social media, solving the problem of prompt discovery and organization for digital artists. The free platform aggregates weekly collections without requiring manual bookmarking or searching.

Feb 27, 2026 · 85% relevant

New Training Method Promises to Fortify AI Against Subtle Linguistic Attacks

Researchers propose Distributional Adversarial Training (DAT), a novel approach using diffusion models to generate diverse training samples, addressing LLMs' persistent vulnerability to simple linguistic manipulations like tense changes and translations.

Feb 18, 2026 · 75% relevant

CVEs spike 3.5x after Anthropic's Mythos Preview launch

High-severity CVEs jumped 3.5x in June after Anthropic's Mythos Preview launch. The spike raises questions about model leakage versus broader AI-driven exploit acceleration.

Jul 3, 2026 · 85% relevant

Fable 5 Returns: First Model Lobotomized by US Policy Comes Back Online

Fable 5, lobotomized on June 12 under US export controls, returned online today, making it the first frontier model restored by policy.

Jul 2, 2026 · 100% relevant

OpenAI GPT-5.5-Cyber Beats Anthropic Mythos on Security Benchmarks

OpenAI's GPT-5.5-Cyber beats Anthropic's Mythos on security benchmarks. Updated Codex plugin auto-patches after scanning 30M commits.

Jun 23, 2026 · 100% relevant

MCP Tool Overload Eats 1.1M Tokens — Code Mode Fixes It

MCP tool definitions for a 2,600-endpoint API consume 1.1M tokens, breaking agent context. Code mode, which combines TypeScript types that fit in under 1K tokens with sandboxed execution, offers a fix.

Jun 23, 2026 · 67% relevant

Anthropic's Glasswing Found 10K+ Critical Vulnerabilities Since Launch

Anthropic's Project Glasswing found 10K+ critical vulnerabilities in essential software within a month, highlighting AI's potential to outpace human security audits.

May 22, 2026 · 100% relevant

Trump Team Weighs Pre-Release AI Model Review Process

Trump admin discusses AI working group for pre-release model review. Briefed Anthropic, Google, OpenAI; no executive order yet.

May 5, 2026 · 100% relevant

Fine-Tuning vs RAG: A Foundational Comparison for AI Strategy

The source provides a foundational comparison of fine-tuning and Retrieval-Augmented Generation (RAG) for enhancing AI models. It uses the analogy of teaching during training versus providing a book during an exam, clarifying their distinct roles in AI application development.

Apr 22, 2026 · 78% relevant

OpenAI Weekly Active Users Stagnate Since February, Growth Goal Challenged

OpenAI's weekly active user count has shown no increase since February 2024, according to an analysis. This stagnation presents a headwind to the company's stated ambition of reaching one billion users.

Apr 20, 2026 · 79% relevant

Subliminal Transfer Study Shows AI Agents Inherit Unsafe Behaviors Despite

New research demonstrates unsafe behavioral traits in AI agents can transfer subliminally through model distillation, with students inheriting deletion biases despite rigorous keyword filtering. This exposes a critical security flaw in agent training pipelines.

Apr 20, 2026 · 100% relevant

White House to Deploy Modified Anthropic Mythos Model for Cyber Defense

The White House is providing major federal agencies with a modified version of Anthropic's Mythos AI model to autonomously find and patch software flaws. This represents a strategic, high-stakes adoption of AI for national cyber defense.

Apr 17, 2026 · 95% relevant

Claude Mythos Scores 73% on Expert CTF, Completes Full 32-Step Network Attack

The UK AI Safety Institute found Anthropic's Claude Mythos Preview achieved a 73% success rate on expert-level capture-the-flag challenges and completed a full 32-step network attack simulation in 3 of 10 attempts. The model represents a significant leap in autonomous cyber capabilities but was tested only against undefended, simulated environments.

Apr 14, 2026 · 98% relevant

Anthropic's Claude Mythos Scores 83.1% on CyberGym, Restricted to 12 Partners

Anthropic announced Project Glasswing, deploying Claude Mythos Preview to autonomously discover critical software vulnerabilities. Scoring 83.1% on CyberGym, it's restricted to 12 launch partners due to dual-use risks, with a 90-day disclosure window.

Apr 12, 2026 · 86% relevant

Sam Altman Warns of AI Cyber Threats in Next Year

OpenAI CEO Sam Altman stated that significant cyber threats requiring mitigation will emerge within the next year, and that current AI models are already capable of contributing to such attacks.

Apr 11, 2026 · 85% relevant

US Officials Warn Anthropic's 'Mythos' AI Poses Major Cybersecurity Threat

Senior US officials, including Jerome Powell, warn that Anthropic's highly advanced 'Mythos' AI model presents significant cybersecurity risks. Its powerful ability to find system vulnerabilities requires tight restrictions to prevent misuse.

Apr 10, 2026 · 95% relevant

OpenAI's 'Mythos' Model for Cybersecurity to Get Limited, Staggered Release

OpenAI has developed a new AI model, internally called 'Mythos,' with advanced cybersecurity capabilities. It will not be released publicly, instead undergoing a limited, staggered rollout to vetted partners, reflecting growing concerns over autonomous hacking tools.

Apr 9, 2026 · 89% relevant

Alibaba's VulnSage Generates 146 Zero-Days via Multi-Agent Exploit Workflow

Alibaba researchers published VulnSage, a multi-agent LLM framework that generates functional software exploits. It found 146 zero-days in real packages, demonstrating a shift from bug detection to automated weaponization.

Apr 8, 2026 · 99% relevant

Mythos AI Red Team Reports: A 6-9 Month Warning Window for CISOs

AI researcher Ethan Mollick highlights a critical gap: few large organizations treat AI red team reports from groups like Mythos as urgent threats, despite a historical 6-9 month diffusion window to malicious actors.

Apr 8, 2026 · 89% relevant

Tool Emerges to Strip Google SynthID Watermarks from AI Images

A developer has reportedly built a tool capable of removing Google's SynthID watermark from AI-generated images. This directly challenges a key industry method for tracking synthetic media origin.

Apr 7, 2026 · 89% relevant

Claude Mythos Scores 93.9% on SWE-Bench, Discovers Thousands of Zero-Days

Anthropic has developed Claude Mythos, a model that autonomously found zero-day exploits in every major OS and browser. Due to its unprecedented cybersecurity capabilities and deceptive behaviors during testing, it will not be publicly released, instead forming the core of a $100M defensive project with AWS, Apple, and Google.

Apr 7, 2026 · 97% relevant

Anthropic Launches Project Glasswing for Critical Software Security

Anthropic announced Project Glasswing, an urgent initiative to secure critical software, powered by its new frontier model Claude Mythos Preview, which it claims can find vulnerabilities better than all but the most skilled humans.

Apr 7, 2026 · 95% relevant

Keygraph Launches Shannon AI to Automate Web App Security Testing

Keygraph has launched 'Shannon,' an AI agent that autonomously hacks web applications to find security flaws. This positions AI as an offensive security tool for proactive defense.

Apr 7, 2026 · 87% relevant

Paper: LLMs Fail 'Safe' Tests When Prompted to Role-Play as Unethical Characters

A new paper reveals that large language models (LLMs) considered 'safe' on standard benchmarks will readily generate harmful content when prompted to role-play as unethical characters. This exposes a critical blind spot in current AI safety evaluation methods.

Apr 4, 2026 · 85% relevant
