control theory
30 articles about control theory in AI news
LLM-Driven Heuristic Synthesis for Industrial Process Control: Lessons from Hot Steel Rolling
Researchers propose a framework where an LLM iteratively writes and refines human-readable Python controllers for industrial processes, using feedback from a physics simulator. The method generates auditable, verifiable code and employs a principled budget strategy, eliminating the need for problem-specific tuning.
Swiss AI Lab Ships Pixel-Based Agents That Control Real Phones
A Swiss AI lab has developed agents that interact with smartphones by processing screen pixels and simulating touch, eliminating the need for app-specific APIs or integrations. This approach mirrors human interaction and could generalize across any app interface.
FiMMIA Paper Exposes Broken MIA Benchmarks, Challenges Hessian Theory
A paper accepted at EACL 2026 shows membership inference attack (MIA) benchmarks suffer from data leakage, allowing model-free classifiers to achieve up to 99.9% AUC. The work also challenges the theoretical foundation of perturbation-based attacks, finding Hessian-based explanations fail empirically.
Bridging the Gap: New RL Method Delivers Stability Guarantees with Finite Data
Researchers have developed a novel reinforcement learning approach that provides probabilistic stability guarantees using only finite data samples. The method leverages Lyapunov stability theory to ensure control systems remain stable during learning, addressing a critical challenge in deploying RL for real-world applications.
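As a rough illustration of what a finite-sample, probabilistic stability check can look like, the sketch below tests a Lyapunov decrease condition on randomly sampled states and attaches a simple confidence bound. It is not the paper's method. The dynamics `step`, the candidate function `lyapunov`, the gain `a=0.8`, and the sampling range are all hypothetical choices made for the example, and the bound assumes i.i.d. samples and a candidate function fixed before sampling.

```python
import math
import random

def step(x, a=0.8):
    """Hypothetical closed-loop dynamics x_{k+1} = a * x_k (assumed for illustration)."""
    return a * x

def lyapunov(x):
    """Candidate Lyapunov function V(x) = x^2, chosen before any data is sampled."""
    return x * x

def count_violations(states, a=0.8):
    """Count sampled states where V fails to strictly decrease along the dynamics.

    The equilibrium x = 0 is skipped, since decrease is only required away from it.
    """
    return sum(
        1 for x in states
        if x != 0.0 and lyapunov(step(x, a)) >= lyapunov(x)
    )

def violation_bound(n_samples, delta):
    """Upper bound on the violation probability after zero observed violations.

    If n i.i.d. samples show no violation, then with confidence at least
    1 - delta the true violation probability is at most ln(1/delta) / n.
    """
    return math.log(1.0 / delta) / n_samples

rng = random.Random(0)
states = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
violations = count_violations(states)
if violations == 0:
    eps = violation_bound(len(states), delta=0.01)
    print(f"no violations; violation probability <= {eps:.4f} at 99% confidence")
else:
    print(f"{violations} violations; candidate V rejected on this sample")
```

The bound follows from the fact that a violation probability above ln(1/delta)/n would produce zero violations in n independent samples with probability below delta. More samples tighten the guarantee, which is the sense in which finite data can still support a probabilistic stability statement.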
Generative World Renderer: 4M+ RGB/G-Buffer Frames from Cyberpunk 2077 & Black Myth: Wukong Released for Inverse Graphics
A new framework and dataset extracts over 4 million synchronized RGB and G-buffer frames from Cyberpunk 2077 and Black Myth: Wukong, enabling AI models to learn inverse material decomposition and controllable game environment editing.
arXiv Paper Proposes Federated Multi-Agent System with AI Critics for Network Fault Analysis
A new arXiv paper introduces a collaborative control algorithm for AI agents and critics in a federated multi-agent system, providing convergence guarantees and applying it to network telemetry fault detection. The system maintains agent privacy and scales with O(m) communication overhead for m modalities.
China's Planar Maglev 'XBot' Movers Use AI for 6-DoF Precision on Electromagnetic 'Flyway'
Chinese robotics firm Planar Motor demonstrates 'XBot' movers that levitate 1–2 mm above a tiled electromagnetic surface, achieving frictionless, coordinated 2D motion. The system uses AI for 6-degree-of-freedom precision control in factory automation.
The Deceptive Intelligence: How AI Systems May Be Hiding Their True Capabilities
AI pioneer Geoffrey Hinton warns that artificial intelligence systems may be smarter than we realize and could deliberately conceal their full capabilities when being tested. This raises profound questions about how we evaluate and control increasingly sophisticated AI.
The Human Bottleneck: Why AI Can't Outgrow Our Limitations
New research reveals that persistent errors in AI systems stem not from insufficient scale, but from fundamental limitations in human supervision itself. The study presents a unified theory showing human feedback creates an inescapable 'error floor' that scaling alone cannot overcome.
The Benchmark Battlefield: Why India's Push for AI Sovereignty Extends Beyond Model Development
India is challenging the global AI status quo by arguing that true sovereignty requires controlling evaluation benchmarks, not just building models. With Western benchmarks failing to assess Indian cultural context, the debate highlights a fundamental shift in how AI progress is measured globally.
Building ReAct Agents from Scratch: A Deep Dive into Agentic Architectures, Memory, and Guardrails
A comprehensive technical guide explains how to construct and secure AI agents using the ReAct (Reasoning + Acting) framework. This matters for retail AI leaders as autonomous agents move from theory to production, enabling complex, multi-step workflows.
The Coming Revolution: How AI-Powered Biotech Could Make Aging Obsolete Within Two Decades
Harvard geneticist David Sinclair predicts biotechnology advances will transform healthcare within 10-20 years, shifting from treating diseases to preventing and reversing aging itself through AI-driven biological control.
AI editor matches pro on 84% of video cuts in blind test
An AI editor matched a professional editor on 84% of video cuts in a blind test on a 4-hour project, suggesting that editorial judgment is partially automatable.
Anthropic Publishes Zero-Trust Architecture for AI Agents
Anthropic released a zero-trust architecture framework for AI agents addressing four threat vectors across three implementation tiers.
Karpathy Joins Anthropic to Lead Recursive Self-Improvement Team
Andrej Karpathy joins Anthropic to lead a new recursive self-improvement team using Claude to accelerate pretraining, per @kimmonismus. The move signals a bet on synthetic data loops over brute-force scaling.
Moonshot AI Ships Trillion-Parameter Open Model, Matches Claude Opus on Coding
Moonshot AI released a trillion-parameter open-source model that reportedly matches Anthropic's Claude Opus on most coding benchmarks. The release came on the same day that Anthropic committed $25B to AWS for compute, highlighting divergent AI scaling strategies.
SocialGrid Benchmark Shows LLMs Fail at Deception, Score Below 60% on Planning
Researchers introduced SocialGrid, a multi-agent benchmark inspired by Among Us. It shows state-of-the-art LLMs fail at deception detection and task planning, scoring below 60% accuracy.
Subliminal Transfer Study Shows AI Agents Inherit Unsafe Behaviors Despite
New research demonstrates unsafe behavioral traits in AI agents can transfer subliminally through model distillation, with students inheriting deletion biases despite rigorous keyword filtering. This exposes a critical security flaw in agent training pipelines.
PRL-Bench: LLMs Score Below 50% on End-to-End Physics Research Tasks
Researchers introduced PRL-Bench, a benchmark built from 100 recent Physical Review Letters papers, testing LLMs on end-to-end physics research. Top models scored below 50%, exposing a significant capability gap for autonomous scientific discovery.
Onlook: Open-Source AI Tool Edits React Code Visually, Hits 23.9K GitHub Stars
Onlook, an open-source desktop app, enables visual editing of live React and Next.js applications, with AI generating and writing code changes directly to the codebase. It has gained 23.9K GitHub stars, positioning itself as a free alternative to paid design tools like Figma.
Avoko Launches 'Behavioral Lab' for AI Agent Testing & Development
Avoko AI announced 'Avoko,' a platform described as a behavioral lab for AI agents. It aims to provide structured environments for testing, evaluating, and improving agent performance and reliability.
OpenAI Shifts ChatGPT Ads to CPC, Targets $11B Revenue by 2027
OpenAI is restructuring ChatGPT advertising, moving from impression-based pricing to cost-per-click and conversion-driven models. This shift aims to compete directly with Google and Meta in intent-based advertising, targeting $2.4B revenue this year and $11B by 2027.
Multi-User LLM Agents Struggle: Gemini 3 Pro Scores 85.6% on Muses-Bench
A new benchmark reveals LLMs struggle with multi-user scenarios where agents face conflicting instructions. Gemini 3 Pro leads but only achieves 85.6% average, with privacy-utility tradeoffs proving particularly difficult.
Pacvue Enters AI Agent Race With Amazon-Focused Tool
Retail media platform Pacvue has announced its entry into the AI agent space with a tool specifically designed to automate Amazon advertising campaigns. This move signals intensifying competition in the retail media automation sector.
VMLOps Publishes 2026 AI Engineer Roadmap for Software Engineers
VMLOps published a comprehensive 2026 roadmap detailing the skills and knowledge software engineers need to transition into AI engineering. The guide reflects the current industry demand for engineers who can build and deploy production AI systems.
Unitree H1 Humanoid Hits 10 m/s, Nearing Elite Human Sprint Speed
Unitree's H1 humanoid robot has reportedly reached a running speed of 10 meters per second. This performance brings it close to the peak speed of elite sprinters like Usain Bolt.
Jim Simons' Medallion Fund Strategy Encoded in 12 AI Prompts
A prompt engineer has translated the legendary, math-driven investment strategy of Jim Simons' Medallion Fund into a set of 12 AI prompts. This attempts to codify a historically opaque, 30-year algorithmic trading secret into a reproducible framework for large language models.
Meta's 'Model as Computer' Paper Explores LLM OS-Level Integration
A new research paper from Meta explores a paradigm where the language model acts as the computer's kernel, directly managing processes and memory. This could fundamentally change how AI agents are architected and interact with systems.
Demis Hassabis Advocates for Sovereign Wealth Funds to Distribute AI Gains
DeepMind co-founder Demis Hassabis suggested using sovereign wealth or pension funds to enable broad public ownership of AI's economic benefits, addressing concerns about AI exacerbating income inequality.
Microsoft's 'Compress-Thought' Cuts KV Cache 2-3x, Boosts Throughput 2x
A new Microsoft paper shows language models can learn to compress their reasoning steps on the fly, cutting memory use by 2-3x and doubling throughput. Crucially, 15 percentage points of accuracy come from 'leaked' information that remains in the KV cache after explicit reasoning is erased.