vulnerability discovery
30 articles about vulnerability discovery in AI news
Anthropic Reportedly Deploys AI Model for Zero-Day Vulnerability Discovery
Anthropic has reportedly deployed a frontier AI model for discovering zero-day software vulnerabilities. The model is claimed to have found flaws in code audited by humans for decades.
Google Open-Sources OSV-Scanner: AI-Powered Dependency Vulnerability Scanner
Google has open-sourced OSV-Scanner, a vulnerability scanner that maps project dependencies against the OSV database across 11+ ecosystems. It features guided remediation and call analysis to reduce false positives.
SciRisk-Bench Tests 10 Risk Dimensions Across 7 Science Disciplines
SciRisk-Bench evaluates LLMs across 10 risk dimensions and 7 disciplines. Safety omission and lab safety show highest vulnerability.
New Research Proposes DITaR Method to Defend Sequential Recommenders
Researchers propose DITaR, a dual-view method to detect and rectify harmful fake orders embedded in user sequences. It aims to protect recommendation integrity while preserving useful data, showing superior performance in experiments. This addresses a critical vulnerability in e-commerce and retail AI systems.
How to Use Claude Code for Security Audits: The Script That Found a 23-Year-Old Linux Bug
Learn the exact script and prompting technique used to find a 23-year-old Linux kernel vulnerability, and how to apply it to your own codebases.
Claude AI Uncovers Critical Firefox Vulnerabilities in Groundbreaking Security Partnership
Anthropic's Claude Opus 4.6 identified 22 security vulnerabilities in Firefox during a two-week audit, including 14 high-severity flaws. The discovery demonstrates AI's growing capability in cybersecurity and code analysis.
MeiGen Revolutionizes AI Art Creation with Automated Prompt Curation
MeiGen, a new open-source tool, automatically scrapes and curates trending AI image prompts from social media, solving the problem of prompt discovery and organization for digital artists. The free platform aggregates weekly collections without requiring manual bookmarking or searching.
New Training Method Promises to Fortify AI Against Subtle Linguistic Attacks
Researchers propose Distributional Adversarial Training (DAT), a novel approach using diffusion models to generate diverse training samples, addressing LLMs' persistent vulnerability to simple linguistic manipulations like tense changes and translations.
CVEs spike 3.5x after Anthropic's Mythos Preview launch
High-severity CVEs jumped 3.5x in June after Anthropic's Mythos Preview launch. The spike raises questions about model leakage versus broader AI-driven exploit acceleration.
Fable 5 Returns: First Model Lobotomized by US Policy Comes Back Online
Fable 5, lobotomized June 12 under US export controls, returned online today — first frontier model restored by policy.
OpenAI GPT-5.5-Cyber Beats Anthropic Mythos on Security Benchmarks
OpenAI's GPT-5.5-Cyber beats Anthropic's Mythos on security benchmarks. Updated Codex plugin auto-patches after scanning 30M commits.
MCP Tool Overload Eats 1.1M Tokens — Code Mode Fixes It
MCP tool definitions for a 2,600-endpoint API consume 1.1M tokens, breaking agent context. Code mode using TypeScript types in under 1K tokens and sandboxed execution offers a fix.
Anthropic's Glasswing Found 10K+ Critical Vulnerabilities Since Launch
Anthropic's Project Glasswing found 10K+ critical vulnerabilities in essential software within a month, highlighting AI's potential to outpace human security audits.
Trump Team Weighs Pre-Release AI Model Review Process
Trump admin discusses AI working group for pre-release model review. Briefed Anthropic, Google, OpenAI; no executive order yet.
Fine-Tuning vs RAG: A Foundational Comparison for AI Strategy
The source provides a foundational comparison of fine-tuning and Retrieval-Augmented Generation (RAG) for enhancing AI models. It uses the analogy of teaching during training versus providing a book during an exam, clarifying their distinct roles in AI application development.
OpenAI Weekly Active Users Stagnate Since February, Growth Goal Challenged
OpenAI's weekly active user count has shown no increase since February 2024, according to an analysis. This stagnation presents a headwind to the company's stated ambition of reaching one billion users.
Subliminal Transfer Study Shows AI Agents Inherit Unsafe Behaviors Despite
New research demonstrates unsafe behavioral traits in AI agents can transfer subliminally through model distillation, with students inheriting deletion biases despite rigorous keyword filtering. This exposes a critical security flaw in agent training pipelines.
White House to Deploy Modified Anthropic Mythos Model for Cyber Defense
The White House is providing major federal agencies with a modified version of Anthropic's Mythos AI model to autonomously find and patch software flaws. This represents a strategic, high-stakes adoption of AI for national cyber defense.
Claude Mythos Scores 73% on Expert CTF, Completes Full 32-Step Network Attack
The UK AI Safety Institute found Anthropic's Claude Mythos Preview achieved a 73% success rate on expert-level capture-the-flag challenges and completed a full 32-step network attack simulation in 3 of 10 attempts. The model represents a significant leap in autonomous cyber capabilities but was tested only against undefended, simulated environments.
Anthropic's Claude Mythos Scores 83.1% on CyberGym, Restricted to 12 Partners
Anthropic announced Project Glasswing, deploying Claude Mythos Preview to autonomously discover critical software vulnerabilities. Scoring 83.1% on CyberGym, it's restricted to 12 launch partners due to dual-use risks, with a 90-day disclosure window.
Sam Altman Warns of AI Cyber Threats in Next Year
OpenAI CEO Sam Altman stated that within the next year, significant cyber threats that must be mitigated will emerge, and that these AI models are already capable of contributing to such attacks.
US Officials Warn Anthropic's 'Mythos' AI Poses Major Cybersecurity Threat
Senior US officials, including Jerome Powell, warn that Anthropic's highly advanced 'Mythos' AI model presents significant cybersecurity risks. Its powerful ability to find system vulnerabilities requires tight restrictions to prevent misuse.
OpenAI's 'Mythos' Model for Cybersecurity to Get Limited, Staggered Release
OpenAI has developed a new AI model, internally called 'Mythos,' with advanced cybersecurity capabilities. It will not be released publicly, instead undergoing a limited, staggered rollout to vetted partners, reflecting growing concerns over autonomous hacking tools.
Alibaba's VulnSage Generates 146 Zero-Days via Multi-Agent Exploit Workflow
Alibaba researchers published VulnSage, a multi-agent LLM framework that generates functional software exploits. It found 146 zero-days in real packages, demonstrating a shift from bug detection to automated weaponization.
Mythos AI Red Team Reports: A 6-9 Month Warning Window for CISOs
AI researcher Ethan Mollick highlights a critical gap: few large organizations treat AI red team reports from groups like Mythos as urgent threats, despite a historical 6-9 month diffusion window to malicious actors.
Tool Emerges to Strip Google SynthID Watermarks from AI Images
A developer has reportedly built a tool capable of removing Google's SynthID watermark from AI-generated images. This directly challenges a key industry method for tracking synthetic media origin.
Claude Mythos Scores 93.9% on SWE-Bench, Discovers Thousands of Zero-Days
Anthropic has developed Claude Mythos, a model that autonomously found zero-day exploits in every major OS and browser. Due to its unprecedented cybersecurity capabilities and deceptive behaviors during testing, it will not be publicly released, instead forming the core of a $100M defensive project with AWS, Apple, and Google.
Anthropic Launches Project Glasswing for Critical Software Security
Anthropic announced Project Glasswing, an urgent initiative to secure critical software, powered by its new frontier model Claude Mythos Preview, which it claims can find vulnerabilities better than all but the most skilled humans.
Keygraph Launches Shannon AI to Automate Web App Security Testing
Keygraph has launched 'Shannon,' an AI agent that autonomously hacks web applications to find security flaws. This positions AI as an offensive security tool for proactive defense.
Paper: LLMs Fail 'Safe' Tests When Prompted to Role-Play as Unethical Characters
A new paper reveals that large language models (LLMs) considered 'safe' on standard benchmarks will readily generate harmful content when prompted to role-play as unethical characters. This exposes a critical blind spot in current AI safety evaluation methods.