frontier risks
30 articles about frontier risks in AI news
Anthropic Seeks Chemical Weapons Expert for AI Safety Team, Signaling Focus on CBRN Risks
Anthropic is hiring a Chemical, Biological, Radiological, and Nuclear (CBRN) weapons expert for its AI safety team. The role focuses on assessing and mitigating catastrophic risks from frontier AI models.
Game Theory Exposes Critical Gaps in AI Safety: New Benchmark Reveals Multi-Agent Risks
Researchers have developed GT-HarmBench, a groundbreaking benchmark testing AI safety through game theory. The study reveals frontier models choose socially beneficial actions only 62% of the time in multi-agent scenarios, highlighting significant coordination risks.
Researchers Study AI Mental Health Risks Using Simulated Teen 'Bridget'
A research team created a ChatGPT account for a simulated 13-year-old girl named 'Bridget' to study AI interaction risks with depressed, lonely teens. The experiment underscores urgent safety and ethical questions for generative AI developers.
Frontier AI Advised Patient on Benzodiazepine Taper, Sparking Safety Debate
A social media post detailed how a frontier AI model generated a personalized tapering schedule for alprazolam (Xanax) when a user said their psychiatrist retired. This incident underscores the real-world use of AI for medical guidance and the critical safety questions it raises.
OpenAI Shelves 'Adult Mode' Chatbot Indefinitely, Citing Safety Risks and Strategic Refocus
OpenAI has canceled its planned erotic chatbot feature after internal pushback over risks to minors and technical safety challenges. The move is part of a broader shift away from experimental 'side quests' toward core productivity tools.
Grocery Dive Asks: Is Agentic AI the Next Frontier for Grocers?
The article examines agentic AI's potential for grocers in inventory, personalization, and store operations, weighing benefits against implementation challenges like data integration and safety.
Agentic AI Checkout Emerges as Next Frontier in Retail Transformation
Multiple industry reports from Deloitte, Bain, and retail publications highlight the shift toward 'agentic AI' in commerce—systems that autonomously execute complex shopping tasks. This evolution promises to redefine the online basket and checkout experience, with Asia Pacific flagged as a key growth region.
Mercor Data Breach Exposes Expert Human Annotation Pipeline Used by Frontier AI Labs
Hackers have reportedly accessed Mercor's expert human data collection systems, which are used by leading AI labs to build foundation models. This breach could expose proprietary training methodologies and sensitive model development data.
Violoop's Hardware Bet: A New Frontier in AI Interaction Beyond the Screen
Hardware startup Violoop has secured multi-million dollar funding to develop the world's first 'physical-level AI Operator,' aiming to move AI interaction from purely digital interfaces to tangible, desktop-integrated hardware devices.
Anthropic CEO's Internal Memo Reveals Strategic Shift Toward 'AI Agents' as Next Frontier
Anthropic CEO Dario Amodei has reportedly directed his company to pivot toward developing AI agents capable of performing complex, multi-step tasks autonomously. This strategic memo signals a major shift in the AI landscape beyond today's chatbots toward more capable, action-oriented systems.
Securing the Conversational Commerce Frontier: AI Agent Fraud Protection for Luxury Retail
Riskified expands its AI platform to secure native shopping chatbots and AI agents. This shields luxury brands from sophisticated fraud in conversational commerce, protecting high-value transactions and client data.
Treasury Secretary Calls Claude Mythos a 'Step Function Change' in AI
US Treasury Secretary Janet Yellen described Anthropic's Claude Mythos as a 'step function change in abilities' at a WSJ event. This follows emergency meetings with Wall Street CEOs and high-level briefings on AI cyber risks, revealing a government split on whether Anthropic is a security risk or asset.
Anthropic's Claude Mythos Scores 83.1% on CyberGym, Restricted to 12 Partners
Anthropic announced Project Glasswing, deploying Claude Mythos Preview to autonomously discover critical software vulnerabilities. Scoring 83.1% on CyberGym, it's restricted to 12 launch partners due to dual-use risks, with a 90-day disclosure window.
US Officials Warn Anthropic's 'Mythos' AI Poses Major Cybersecurity Threat
Senior US officials, including Jerome Powell, warn that Anthropic's highly advanced 'Mythos' AI model presents significant cybersecurity risks. Its powerful ability to find system vulnerabilities requires tight restrictions to prevent misuse.
Anthropic Withholds 'Mythos' AI Model Citing Unspecified Risk Concerns
Anthropic has reportedly chosen to withhold a new AI model, internally called 'Mythos', from public release. The decision is based on an internal assessment of potential risks, though specific capabilities or benchmarks were not disclosed.
Anthropic's 'Project Glasswing' Opus-Beater Restricted to Security Researchers
Anthropic's new model, which outperforms Claude 3 Opus, is being released under 'Project Glasswing' exclusively to vetted security researchers. This controlled rollout follows recent warnings from security experts about advanced AI risks.
Anthropic Warns Upcoming LLMs Could Cause 'Serious Damage'
Anthropic has issued a stark warning that its upcoming large language models could cause 'serious damage.' The company states there is 'no end in sight' to capability scaling and proliferation risks.
Anthropic's 'Spud' Model Expected in April, 'Mythos' in Q3 2026 as AI Release Cadence Accelerates
Anthropic's next major frontier model 'Spud' is reportedly scheduled for release in April 2026, with 'Mythos' potentially following in Q3. This aligns with an accelerating ~3-month release cadence across major labs, intensifying competition amid growing compute and energy bottlenecks.
Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns
A Fortune report, cited on social media, claims Anthropic's upcoming Opus 5 model is a 'massive leap' from Claude 3.5 Sonnet, posing significant security risks. OpenAI is also rumored to have a similarly advanced model, 'Spud,' in development.
Agentic AI Shopping Bots Are Coming: Payment Giants and Retailers Are Building Them, Banks Are Scrambling
Major payment networks (Visa, Mastercard, PayPal) and retailers (Google, Walmart, Amazon) are developing autonomous AI shopping agents. This creates urgent operational and liability risks for banks, including unprecedented chargeback disputes and fraud exposure.
AgentDrift: How Corrupted Tool Data Causes Unsafe Recommendations in LLM Agents
New research reveals LLM agents making product recommendations can maintain ranking quality while suggesting unsafe items when their tools provide corrupted data. Standard metrics like NDCG fail to detect this safety drift, creating hidden risks for high-stakes applications.
Anthropic Launches Institute to Warn Public About AI's Rapid Self-Improvement and Job Disruption
Anthropic has established The Anthropic Institute to publicly share internal research on AI capabilities, warning of imminent job disruptions and legal challenges. Led by Jack Clark, the initiative aims to bridge frontier AI development with public awareness as models approach recursive self-improvement.
Meissa: The 4B-Parameter Medical AI That Outperforms Giants While Running Offline
Researchers have developed Meissa, a lightweight 4B-parameter medical AI that matches or exceeds proprietary frontier models in clinical tasks while operating fully offline with 22x lower latency. This breakthrough addresses critical cost, privacy, and deployment barriers in healthcare AI.
Anthropic CEO Warns of Dual Threat: Corporate AI Power vs. Government Overreach
Anthropic CEO Dario Amodei warns of the dual risks in AI governance: corporations becoming more powerful than governments, and governments becoming too powerful to be checked. This highlights the delicate balance needed in AI regulation.
The Hidden Bias in AI Image Generators: Why 'Perfect' Training Can Leak Private Data
New research reveals diffusion models continue to memorize training data even after achieving optimal test performance, creating privacy risks. This 'biased generalization' phase occurs when models learn fine details that overfit to specific samples rather than general patterns.
Nvidia's Strategic Bet: Fueling India's AI Revolution Through Venture Capital Partnerships
Nvidia is partnering with major venture capital firms to identify and fund India's next generation of AI startups, leveraging its global startup program that already includes over 4,000 Indian companies. This strategic move coincides with massive infrastructure investments like Yotta's $2 billion Nvidia chip purchase, positioning India as a critical frontier in the global AI race.
Beyond Keywords: How Google's AI Mode Revolutionizes Visual Discovery for Luxury Retail
Google's AI Mode uses advanced multimodal AI to understand the intent behind visual searches. For luxury brands, this means customers can find products using complex, subjective descriptions, unlocking a new frontier in visual commerce and inspiration-based discovery.
Wireless Brain Implant Restores Sight in Third Human Patient
Wireless brain implant with 544 electrodes achieves third human implantation, bypassing eyes to create artificial sight via direct visual cortex stimulation.
Anthropic Unveils TAI Research Agenda Targeting AI Economics, Threats, R&D
Anthropic's TAI will study four areas: economic diffusion, threats, wild AI, and AI-driven R&D. No budget disclosed.
Europe's AI Ambition Gap: No Energy, No Data Centers, No Strategy
Europe lacks a strategy for AI, with no energy or data center plan, per @kimmonismus. Only minor EU AI Act concessions offered.