investigative
13 articles tagged "investigative" in AI news
German Media's AI 'Stupidity' Cover Sparks Debate on National Tech Pessimism
A DER SPIEGEL magazine cover asking 'How much is AI making us all stupid?' has drawn criticism for exemplifying Germany's pessimistic, 'Angst'-driven narrative around technology. Critics contrast it with calls for a more opportunity-focused discourse.
RiskWebWorld: A New Benchmark Exposes the Limits of AI for E-commerce Risk
Researchers introduced RiskWebWorld, a realistic benchmark for testing GUI agents on 1,513 authentic e-commerce risk management tasks. It reveals a major capability gap, showing even the best models fail over 50% of the time, highlighting the immaturity of AI for high-stakes operational automation.
Google's 'TestPilot' AI Agent Debugs Integration Tests from Logs
Google introduced TestPilot, an AI agent that diagnoses integration test failures by sifting through logs and suggesting code fixes. It autonomously resolved 15% of real-world Python test failures in an experiment.
Claude MCP GPU Debugging: AI Agent Identifies PyTorch Bottleneck in Kernel
A developer used an AI agent powered by Claude Code and the Model Context Protocol (MCP) to diagnose a severe GPU performance bottleneck. The agent analyzed system kernel traces, pinpointing excessive CPU context switches as the culprit, demonstrating a practical application of agentic AI for complex technical debugging.
New Yorker: Altman's OpenAI Rise Fueled by Persuasion, Dealmaking, Allegations
A New Yorker investigation alleges that Sam Altman's leadership at OpenAI is built on persuasion, aggressive dealmaking, and, according to insiders, deception. It links the 2023 board drama to a fundamental shift away from safety-first ideals toward commercial scale.
New Yorker Exposes OpenAI's 'Merge & Assist' Clause, Internal Safety Conflicts
A New Yorker investigation details previously undisclosed 'Ilya Memos,' a secret 'merge and assist' clause for AGI rivals, and internal conflicts over safety compute allocation and governance.
How to Use Claude Code as a Diagnostic Agent for Complex, Multi-System Problems
A developer used Claude's reasoning to solve a 25-year medical mystery. Here's how to apply the same agentic, cross-domain analysis to your codebase.
UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management
UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.
How a Developer Used Claude Code to Reverse-Engineer a Bricked Smart Clock from Bare Metal
A developer used Claude Code as a co-pilot to reverse-engineer a dead LaMetric Time clock, creating a full USB-boot recovery system with no documentation.
RecThinker: An Agentic Framework for Tool-Augmented Reasoning in Recommendation
Researchers propose RecThinker, an LLM-based agentic framework that dynamically plans reasoning paths and proactively uses tools to fill information gaps for better recommendations. It shifts from passive processing to autonomous investigation, showing performance gains on benchmarks.
Open-Source AI Agent Revolutionizes Error Monitoring, Cuts Downtime by 95%
A new open-source AI agent autonomously scans production logs, identifies root causes of errors, and delivers contextual alerts via Slack before engineers notice issues. The tool reportedly reduces production downtime by 95%, transforming traditional debugging workflows.
The End of Online Anonymity: How LLMs Can Now Re-Identify Users from Just a Few Posts
Researchers from ETH Zürich and Anthropic have developed an automated pipeline that uses large language models to re-identify individuals from minimal online posts, fundamentally challenging the concept of digital anonymity.
NotebookLM's PowerPoint Integration: AI Research Assistant Evolves into Presentation Creator
Google's NotebookLM has expanded beyond research summarization to include slide generation and editing capabilities with direct PowerPoint export. This transforms the AI research assistant into a complete presentation workflow tool.