frontier risks
30 articles about frontier risks in AI news
Anthropic Seeks Chemical Weapons Expert for AI Safety Team, Signaling Focus on CBRN Risks
Anthropic is hiring a Chemical, Biological, Radiological, and Nuclear (CBRN) weapons expert for its AI safety team. The role focuses on assessing and mitigating catastrophic risks from frontier AI models.
Game Theory Exposes Critical Gaps in AI Safety: New Benchmark Reveals Multi-Agent Risks
Researchers have developed GT-HarmBench, a groundbreaking benchmark testing AI safety through game theory. The study reveals frontier models choose socially beneficial actions only 62% of the time in multi-agent scenarios, highlighting significant coordination risks.
Researchers Study AI Mental Health Risks Using Simulated Teen 'Bridget'
A research team created a ChatGPT account for a simulated 13-year-old girl named 'Bridget' to study AI interaction risks with depressed, lonely teens. The experiment underscores urgent safety and ethical questions for generative AI developers.
Frontier AI Advised Patient on Benzodiazepine Taper, Sparking Safety Debate
A social media post detailed how a frontier AI model generated a personalized tapering schedule for alprazolam (Xanax) when a user said their psychiatrist retired. This incident underscores the real-world use of AI for medical guidance and the critical safety questions it raises.
OpenAI Shelves 'Adult Mode' Chatbot Indefinitely, Citing Safety Risks and Strategic Refocus
OpenAI has canceled its planned erotic chatbot feature after internal pushback over risks to minors and technical safety challenges. The move is part of a broader shift away from experimental 'side quests' toward core productivity tools.
Grocery Dive Asks: Is Agentic AI the Next Frontier for Grocers?
The article examines agentic AI's potential for grocers in inventory, personalization, and store operations, weighing benefits against implementation challenges like data integration and safety.
Agentic AI Checkout Emerges as Next Frontier in Retail Transformation
Multiple industry reports from Deloitte, Bain, and retail publications highlight the shift toward 'agentic AI' in commerce—systems that autonomously execute complex shopping tasks. This evolution promises to redefine the online basket and checkout experience, with Asia Pacific flagged as a key growth region.
Mercor Data Breach Exposes Expert Human Annotation Pipeline Used by Frontier AI Labs
Hackers have reportedly accessed Mercor's expert human data collection systems, which are used by leading AI labs to build foundation models. This breach could expose proprietary training methodologies and sensitive model development data.
Violoop's Hardware Bet: A New Frontier in AI Interaction Beyond the Screen
Hardware startup Violoop has secured multi-million dollar funding to develop the world's first 'physical-level AI Operator,' aiming to move AI interaction from purely digital interfaces to tangible, desktop-integrated hardware devices.
Anthropic CEO's Internal Memo Reveals Strategic Shift Toward 'AI Agents' as Next Frontier
Anthropic CEO Dario Amodei has reportedly directed his company to pivot toward developing AI agents capable of performing complex, multi-step tasks autonomously. This strategic memo signals a major shift in the AI landscape beyond today's chatbots toward more capable, action-oriented systems.
Securing the Conversational Commerce Frontier: AI Agent Fraud Protection for Luxury Retail
Riskified expands its AI platform to secure native shopping chatbots and AI agents. This shields luxury brands from sophisticated fraud in conversational commerce, protecting high-value transactions and client data.
Treasury Secretary Calls Claude Mythos a 'Step Function Change' in AI
US Treasury Secretary Janet Yellen described Anthropic's Claude Mythos as a 'step function change in abilities' at a WSJ event. This follows emergency meetings with Wall Street CEOs and high-level briefings on AI cyber risks, revealing a government split on whether Anthropic is a security risk or asset.
Anthropic's Claude Mythos Scores 83.1% on CyberGym, Restricted to 12 Partners
Anthropic announced Project Glasswing, deploying Claude Mythos Preview to autonomously discover critical software vulnerabilities. Scoring 83.1% on CyberGym, it's restricted to 12 launch partners due to dual-use risks, with a 90-day disclosure window.
US Officials Warn Anthropic's 'Mythos' AI Poses Major Cybersecurity Threat
Senior US officials, including Jerome Powell, warn that Anthropic's highly advanced 'Mythos' AI model presents significant cybersecurity risks. Its powerful ability to find system vulnerabilities requires tight restrictions to prevent misuse.
Anthropic Withholds 'Mythos' AI Model Citing Unspecified Risk Concerns
Anthropic has reportedly chosen to withhold a new AI model, internally called 'Mythos', from public release. The decision is based on an internal assessment of potential risks, though specific capabilities or benchmarks were not disclosed.
Anthropic's 'Project Glasswing' Opus-Beater Restricted to Security Researchers
Anthropic's new model, which outperforms Claude 3 Opus, is being released under 'Project Glasswing' exclusively to vetted security researchers. This controlled rollout follows recent warnings from security experts about advanced AI risks.
Anthropic Warns Upcoming LLMs Could Cause 'Serious Damage'
Anthropic has issued a stark warning that its upcoming large language models could cause 'serious damage.' The company states there is 'no end in sight' to capability scaling and proliferation risks.
Anthropic's 'Spud' Model Expected in April, 'Mythos' in Q3 2026 as AI Release Cadence Accelerates
Anthropic's next major frontier model 'Spud' is reportedly scheduled for release in April 2026, with 'Mythos' potentially following in Q3. This aligns with an accelerating ~3-month release cadence across major labs, intensifying competition amid growing compute and energy bottlenecks.
Anthropic's Opus 5 and OpenAI's 'Spud' Rumored as Major AI Leaps, Prompting Security Concerns
A Fortune report, cited on social media, claims Anthropic's upcoming Opus 5 model is a 'massive leap' from Claude 3.5 Sonnet, posing significant security risks. OpenAI is also rumored to have a similarly advanced model, 'Spud,' in development.
Agentic AI Shopping Bots Are Coming: Payment Giants and Retailers Are Building Them, Banks Are Scrambling
Major payment networks (Visa, Mastercard, PayPal) and retailers (Google, Walmart, Amazon) are developing autonomous AI shopping agents. This creates urgent operational and liability risks for banks, including unprecedented chargeback disputes and fraud exposure.
AgentDrift: How Corrupted Tool Data Causes Unsafe Recommendations in LLM Agents
New research reveals LLM agents making product recommendations can maintain ranking quality while suggesting unsafe items when their tools provide corrupted data. Standard metrics like NDCG fail to detect this safety drift, creating hidden risks for high-stakes applications.
Anthropic Launches Institute to Warn Public About AI's Rapid Self-Improvement and Job Disruption
Anthropic has established The Anthropic Institute to publicly share internal research on AI capabilities, warning of imminent job disruptions and legal challenges. Led by Jack Clark, the initiative aims to bridge frontier AI development with public awareness as models approach recursive self-improvement.
Meissa: The 4B-Parameter Medical AI That Outperforms Giants While Running Offline
Researchers have developed Meissa, a lightweight 4B-parameter medical AI that matches or exceeds proprietary frontier models in clinical tasks while operating fully offline with 22x lower latency. This breakthrough addresses critical cost, privacy, and deployment barriers in healthcare AI.
Anthropic CEO Warns of Dual Threat: Corporate AI Power vs. Government Overreach
Anthropic CEO Dario Amodei warns of the dual risks in AI governance: corporations becoming more powerful than governments, and governments becoming too powerful to be checked. This highlights the delicate balance needed in AI regulation.
The Hidden Bias in AI Image Generators: Why 'Perfect' Training Can Leak Private Data
New research reveals diffusion models continue to memorize training data even after achieving optimal test performance, creating privacy risks. This 'biased generalization' phase occurs when models learn fine details that overfit to specific samples rather than general patterns.
Nvidia's Strategic Bet: Fueling India's AI Revolution Through Venture Capital Partnerships
Nvidia is partnering with major venture capital firms to identify and fund India's next generation of AI startups, leveraging its global startup program that already includes over 4,000 Indian companies. This strategic move coincides with massive infrastructure investments like Yotta's $2 billion Nvidia chip purchase, positioning India as a critical frontier in the global AI race.
Beyond Keywords: How Google's AI Mode Revolutionizes Visual Discovery for Luxury Retail
Google's AI Mode uses advanced multimodal AI to understand the intent behind visual searches. For luxury brands, this means customers can find products using complex, subjective descriptions, unlocking a new frontier in visual commerce and inspiration-based discovery.
Wireless Brain Implant Restores Sight in Third Human Patient
Wireless brain implant with 544 electrodes achieves third human implantation, bypassing eyes to create artificial sight via direct visual cortex stimulation.
Anthropic Unveils TAI Research Agenda Targeting AI Economics, Threats, R&D
Anthropic's TAI will study four areas: economic diffusion, threats, wild AI, and AI-driven R&D. No budget disclosed.
Europe's AI Ambition Gap: No Energy, No Data Centers, No Strategy
Europe lacks a strategy for AI, with no energy or data center plan, per @kimmonismus. Only minor EU AI Act concessions offered.