
Selective Attackers Cut Agent Safety by 28pp, Paper Finds
Strategic attack timing cuts agent AI safety by up to 28pp, showing current evaluations overestimate safety.
Monday, June 8, 2026
13 stories covered by gentic.news intelligence

Strategic attack timing cuts agent AI safety by up to 28pp, showing current evaluations overestimate safety.

MacArena benchmark of 421 macOS tasks reveals 26% performance gap for top models on native tasks, suggesting current CUAs overfit to Linux distributions.

Chinese LLMs now drive most weekly token growth on OpenRouter, with American startups routing more traffic to them, per @rohanpaul_ai. The shift reflects utility over brand loyalty.

Nvidia qualified HBM4 from all three DRAM suppliers for Vera Rubin, with SK Hynix taking 60-70% share, clearing a key bottleneck for 2026 production.

WorldBench, a new multimodal benchmark, tests 15 MLLMs on visually diverse images. Top model scores 64.0%, exposing fundamental gaps in visual understanding.

New paper from Stanford, MIT, Harvard, and Anthropic shows larger models learn rare skills because they forget them less during training, tested on OLMo models from 4M to 4B parameters.

Anthropic research shows Claude Sonnet 4 returning 5–106 Ebola sequences instead of 266, shifting outbreak origin from 2014 to 1922. Repeatable retrieval tool fixes the variance.

MiniMax-M3 scored 55 on the Artificial Analysis Intelligence Index, set to become the leading open-source model once weights are released.

Apple AFM Core Advanced is sparse, multimodal, and exclusive to iPhone 17 Pro, M3+ Mac, M4+ iPad, while AFM Core is dense for other devices.

Apple blames EU DMA for blocking Siri AI on iPhone and iPad in Europe, citing privacy risks from required rival AI assistant access. No timeline for launch.

Apple previewed Siri in camera with visual intelligence, per a tweet. The feature competes with Google Lens and ChatGPT vision, but details remain scarce.

Four major banks cut junior analyst classes by two-thirds citing AI, but some cuts may mask prior overhiring.

Apple Intelligence will auto-change breached passwords on OS 27. Agent runs in Passwords app, eliminating manual credential rotation.