control theory

30 articles about control theory in AI news

LLM-Driven Heuristic Synthesis for Industrial Process Control: Lessons from Hot Steel Rolling

Researchers propose a framework where an LLM iteratively writes and refines human-readable Python controllers for industrial processes, using feedback from a physics simulator. The method generates auditable, verifiable code and employs a principled budget strategy, eliminating the need for problem-specific tuning.

Mar 24, 2026 · 70% relevant

Swiss AI Lab Ships Pixel-Based Agents That Control Real Phones

A Swiss AI lab has developed agents that interact with smartphones by processing screen pixels and simulating touch, eliminating the need for app-specific APIs or integrations. This approach mirrors human interaction and could generalize across any app interface.

Apr 21, 2026 · 93% relevant

FiMMIA Paper Exposes Broken MIA Benchmarks, Challenges Hessian Theory

A paper accepted at EACL 2026 shows membership inference attack (MIA) benchmarks suffer from data leakage, allowing model-free classifiers to achieve up to 99.9% AUC. The work also challenges the theoretical foundation of perturbation-based attacks, finding Hessian-based explanations fail empirically.

Apr 18, 2026 · 84% relevant

Bridging the Gap: New RL Method Delivers Stability Guarantees with Finite Data

Researchers have developed a novel reinforcement learning approach that provides probabilistic stability guarantees using only finite data samples. The method leverages Lyapunov stability theory to ensure control systems remain stable during learning, addressing a critical challenge in deploying RL for real-world applications.

Mar 3, 2026 · 75% relevant

Generative World Renderer: 4M+ RGB/G-Buffer Frames from Cyberpunk 2077 & Black Myth: Wukong Released for Inverse Graphics

A new framework and dataset extracts over 4 million synchronized RGB and G-buffer frames from Cyberpunk 2077 and Black Myth: Wukong, enabling AI models to learn inverse material decomposition and controllable game environment editing.

Apr 3, 2026 · 85% relevant

arXiv Paper Proposes Federated Multi-Agent System with AI Critics for Network Fault Analysis

A new arXiv paper introduces a collaborative control algorithm for AI agents and critics in a federated multi-agent system, providing convergence guarantees and applying it to network telemetry fault detection. The system maintains agent privacy and scales with O(m) communication overhead for m modalities.

Apr 3, 2026 · 74% relevant

China's Planar Maglev 'XBot' Movers Use AI for 6-DoF Precision on Electromagnetic 'Flyway'

Chinese robotics firm Planar Motor demonstrates 'XBot' movers that levitate 1–2 mm above a tiled electromagnetic surface, achieving frictionless, coordinated 2D motion. The system uses AI for 6-degree-of-freedom precision control in factory automation.

Mar 30, 2026 · 87% relevant

The Deceptive Intelligence: How AI Systems May Be Hiding Their True Capabilities

AI pioneer Geoffrey Hinton warns that artificial intelligence systems may be smarter than we realize and could deliberately conceal their full capabilities when being tested. This raises profound questions about how we evaluate and control increasingly sophisticated AI.

Mar 2, 2026 · 85% relevant

The Human Bottleneck: Why AI Can't Outgrow Our Limitations

New research reveals that persistent errors in AI systems stem not from insufficient scale, but from fundamental limitations in human supervision itself. The study presents a unified theory showing human feedback creates an inescapable 'error floor' that scaling alone cannot overcome.

Mar 2, 2026 · 75% relevant

The Benchmark Battlefield: Why India's Push for AI Sovereignty Extends Beyond Model Development

India is challenging the global AI status quo by arguing that true sovereignty requires controlling evaluation benchmarks, not just building models. With Western benchmarks failing to assess Indian cultural context, the debate highlights a fundamental shift in how AI progress is measured globally.

Feb 25, 2026 · 70% relevant

Building ReAct Agents from Scratch: A Deep Dive into Agentic Architectures, Memory, and Guardrails

A comprehensive technical guide explains how to construct and secure AI agents using the ReAct (Reasoning + Acting) framework. This matters for retail AI leaders as autonomous agents move from theory to production, enabling complex, multi-step workflows.

Mar 17, 2026 · 76% relevant

The Coming Revolution: How AI-Powered Biotech Could Make Aging Obsolete Within Two Decades

Harvard geneticist David Sinclair predicts biotechnology advances will transform healthcare within 10-20 years, shifting from treating diseases to preventing and reversing aging itself through AI-driven biological control.

Feb 22, 2026 · 85% relevant

Building Intelligent Feedback Systems

A technical guide on building a customer review triage system using LangGraph, LangChain, Groq, and Pydantic. It explains how agentic workflows enable conditional routing based on sentiment analysis.

Jul 20, 2026 · 92% relevant

Hugging Face weekly papers: Monotonic inference policy overtakes training optimization

Hugging Face's top papers for July 6-12 include one arguing that monotonic inference policies are the true LLM RL objective, and Vidu S1 for real-time interactive video generation.

Jul 12, 2026 · 85% relevant

Gary Marcus Warns Trump AI Power 'Chilling'—No Specifics Yet

Gary Marcus warns that Trump could use top AI systems for repression. The tweet lacks specifics, which weakens the argument.

Jul 4, 2026 · 75% relevant

MCP Server Versioning: How to Avoid Breaking All Your AI Clients (Like I

Stop breaking AI clients with MCP schema changes. Use query param versioning (?v=2) — it works with every MCP client, requires no code changes, and lets old and new versions coexist seamlessly.

Jun 25, 2026 · 100% relevant

Five Eyes Warns Frontier AI Could Reshape Cyber Warfare in Months

Five Eyes warns frontier AI could reshape cyber warfare in months, not years. The official intelligence document signals a compressed risk timeline.

Jun 23, 2026 · 87% relevant

AI editor matches pro on 84% of video cuts in blind test

An AI editor matched a professional on 84% of video cuts in a blind test of a 4-hour project, suggesting that editorial judgment is partially automatable.

Jun 15, 2026 · 65% relevant

Anthropic Publishes Zero-Trust Architecture for AI Agents

Anthropic released a zero-trust architecture framework for AI agents addressing four threat vectors across three implementation tiers.

May 30, 2026 · 85% relevant

Karpathy Joins Anthropic to Lead Recursive Self-Improvement Team

Andrej Karpathy joins Anthropic to lead a new recursive self-improvement team using Claude to accelerate pretraining, per @kimmonismus. The move signals a bet on synthetic data loops over brute-force scaling.

May 21, 2026 · 92% relevant

Moonshot AI Ships Trillion-Parameter Open Model, Matches Claude Opus on Coding

Moonshot AI released a trillion-parameter open-source model that reportedly matches Anthropic's Claude Opus on most coding benchmarks. The release came on the same day that Anthropic committed $25B to AWS for compute, highlighting divergent AI scaling strategies.

Apr 22, 2026 · 100% relevant

Subliminal Transfer Study Shows AI Agents Inherit Unsafe Behaviors Despite

New research demonstrates unsafe behavioral traits in AI agents can transfer subliminally through model distillation, with students inheriting deletion biases despite rigorous keyword filtering. This exposes a critical security flaw in agent training pipelines.

Apr 20, 2026 · 100% relevant

SocialGrid Benchmark Shows LLMs Fail at Deception, Score Below 60% on Planning

Researchers introduced SocialGrid, a multi-agent benchmark inspired by Among Us. It shows state-of-the-art LLMs fail at deception detection and task planning, scoring below 60% accuracy.

Apr 20, 2026 · 100% relevant

PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks

Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top models scored below 50%, exposing a significant capability gap for autonomous scientific discovery.

Apr 20, 2026 · 100% relevant

Onlook: Open-Source AI Tool Edits React Code Visually, Hits 23.9K GitHub Stars

Onlook, an open-source desktop app, enables visual editing of live React and Next.js applications, with AI generating and writing code changes directly to the codebase. It has gained 23.9K GitHub stars, positioning itself as a free alternative to paid design tools like Figma.

Apr 17, 2026 · 89% relevant

Avoko Launches 'Behavioral Lab' for AI Agent Testing & Development

Avoko AI announced 'Avoko,' a platform described as a behavioral lab for AI agents. It aims to provide structured environments for testing, evaluating, and improving agent performance and reliability.

Apr 16, 2026 · 89% relevant

OpenAI Shifts ChatGPT Ads to CPC, Targets $11B Revenue by 2027

OpenAI is restructuring ChatGPT advertising, moving from impression-based pricing to cost-per-click and conversion-driven models. This shift aims to compete directly with Google and Meta in intent-based advertising, targeting $2.4B revenue this year and $11B by 2027.

Apr 15, 2026 · 95% relevant

Multi-User LLM Agents Struggle: Gemini 3 Pro Scores 85.6% on Muses-Bench

A new benchmark reveals that LLMs struggle in multi-user scenarios where agents face conflicting instructions. Gemini 3 Pro leads but achieves only an 85.6% average, with privacy-utility tradeoffs proving particularly difficult.

Apr 14, 2026 · 92% relevant

Pacvue Enters AI Agent Race With Amazon-Focused Tool

Retail media platform Pacvue has announced its entry into the AI agent space with a tool specifically designed to automate Amazon advertising campaigns. This move signals intensifying competition in the retail media automation sector.

Apr 14, 2026 · 72% relevant

VMLOps Publishes 2026 AI Engineer Roadmap for Software Engineers

VMLOps published a comprehensive 2026 roadmap detailing the skills and knowledge software engineers need to transition into AI engineering. The guide reflects the current industry demand for engineers who can build and deploy production AI systems.

Apr 12, 2026 · 85% relevant
