investigative

13 articles tagged "investigative" in AI news

German Media's AI 'Stupidity' Cover Sparks Debate on National Tech Pessimism

A DER SPIEGEL magazine cover asking 'How much is AI making us all stupid?' has drawn criticism for exemplifying Germany's pessimistic 'Angst'-driven narrative around technology, contrasting with calls for a more opportunity-focused discourse.

Apr 18, 2026 | 75% relevant

RiskWebWorld: A New Benchmark Exposes the Limits of AI for E-commerce Risk

Researchers introduced RiskWebWorld, a realistic benchmark for testing GUI agents on 1,513 authentic e-commerce risk management tasks. It reveals a major capability gap, showing even the best models fail over 50% of the time, highlighting the immaturity of AI for high-stakes operational automation.

Apr 17, 2026 | 92% relevant

Google's 'TestPilot' AI Agent Debugs Integration Tests from Logs

Google introduced TestPilot, an AI agent that diagnoses integration test failures by sifting through logs and suggesting code fixes. It autonomously resolved 15% of real-world Python test failures in an experiment.

Apr 17, 2026 | 85% relevant

Claude MCP GPU Debugging: AI Agent Identifies PyTorch Bottleneck in Kernel

A developer used an AI agent powered by Claude Code and the Model Context Protocol (MCP) to diagnose a severe GPU performance bottleneck. The agent analyzed system kernel traces, pinpointing excessive CPU context switches as the culprit, demonstrating a practical application of agentic AI for complex technical debugging.

Apr 16, 2026 | 72% relevant

New Yorker: Altman's OpenAI Rise Fueled by Persuasion, Dealmaking, Allegations

A New Yorker investigation alleges that Sam Altman's leadership at OpenAI rests on persuasion, aggressive dealmaking, and, according to insiders, deception. It links the 2023 board drama to a fundamental shift away from safety-first ideals toward commercial scale.

Apr 6, 2026 | 95% relevant

New Yorker Exposes OpenAI's 'Merge & Assist' Clause, Internal Safety Conflicts

A New Yorker investigation details previously undisclosed 'Ilya Memos,' a secret 'merge and assist' clause for AGI rivals, and internal conflicts over safety compute allocation and governance.

Apr 6, 2026 | 95% relevant

How to Use Claude Code as a Diagnostic Agent for Complex, Multi-System Problems

A developer used Claude's reasoning to solve a 25-year medical mystery. Here's how to apply the same agentic, cross-domain analysis to your codebase.

Mar 26, 2026 | 84% relevant

UiPath Launches AI Agents for Retail Pricing, Promotions, and Stock Management

UiPath has announced new AI agents designed to autonomously handle core retail operations: dynamic pricing, promotional planning, and inventory gap resolution. This represents a significant move by a major automation player into agentic AI for retail.

Mar 25, 2026 | 95% relevant

How a Developer Used Claude Code to Reverse-Engineer a Bricked Smart Clock from Bare Metal

A developer used Claude Code as a co-pilot to reverse-engineer a dead LaMetric Time clock, creating a full USB-boot recovery system with no documentation.

Mar 24, 2026 | 98% relevant

RecThinker: An Agentic Framework for Tool-Augmented Reasoning in Recommendation

Researchers propose RecThinker, an LLM-based agentic framework that dynamically plans reasoning paths and proactively uses tools to fill information gaps for better recommendations. It shifts from passive processing to autonomous investigation, showing performance gains on benchmarks.

Mar 11, 2026 | 95% relevant

Open-Source AI Agent Revolutionizes Error Monitoring, Cuts Downtime by 95%

A new open-source AI agent autonomously scans production logs, identifies root causes of errors, and delivers contextual alerts via Slack before engineers notice issues. The tool reportedly reduces production downtime by 95%, transforming traditional debugging workflows.

Mar 3, 2026 | 85% relevant

The End of Online Anonymity: How LLMs Can Now Re-Identify Users from Just a Few Posts

Researchers from ETH Zürich and Anthropic have developed an automated pipeline that uses large language models to re-identify individuals from minimal online posts, fundamentally challenging the concept of digital anonymity.

Feb 26, 2026 | 95% relevant

NotebookLM's PowerPoint Integration: AI Research Assistant Evolves into Presentation Creator

Google's NotebookLM has expanded beyond research summarization to include slide generation and editing capabilities with direct PowerPoint export. This transforms the AI research assistant into a complete presentation workflow tool.

Feb 26, 2026 | 85% relevant
