A May 1 arXiv paper from the Brave experiments team introduces AgentStop, a lightweight supervisor that cuts wasted energy from local AI agents by 15-20%. The method uses token-level log probabilities to predict and terminate trajectories unlikely to succeed, with less than 5% utility drop.
Key facts
- AgentStop reduces wasted energy by 15-20%.
- Utility drop is less than 5% on benchmarks.
- Paper posted to arXiv on May 1, 2026.
- Uses token-level log probabilities as signals.
- Code open-sourced on GitHub.
Local AI agents running on consumer devices burn significant energy on failed task trajectories. The AgentStop paper, posted to arXiv on May 1, 2026, reports that agentic workflows (iterative reasoning, tool use, and failure retries) increase GPU power draw, temperature, and battery drain compared to single-inference workloads. The authors propose a lightweight efficiency supervisor that monitors token-level log probabilities and other low-cost execution signals to preemptively terminate trajectories unlikely to complete successfully.
The Energy Problem of Local Agents
Deploying agents locally on user devices preserves privacy and eliminates API costs, but the resource overhead is substantial. According to the AgentStop paper's measurements, agentic execution consumes significantly more compute than standard LLM interactions and often expends resources on tasks that ultimately fail. This inefficiency is a barrier to sustainable, privacy-preserving AI agents on consumer hardware.
How AgentStop Works
AgentStop acts as a supervisor that intercepts running agent trajectories. By analyzing cheap signals like token log-probabilities, it predicts whether the current trajectory will succeed. If the prediction is negative, it terminates the trajectory early, saving the remaining compute and energy. On challenging web-based question answering and coding benchmarks, AgentStop reduces wasted energy by 15-20% with less than 5% utility drop.
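The supervision loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `LogProbSupervisor`, the helper `run_with_supervisor`, the threshold of -1.5, the three-step window, and the rule itself (stop when the recent mean token log-probability falls below a fixed cutoff) are all assumptions chosen for illustration. The paper's actual predictor and its additional signals may differ.

```python
from dataclasses import dataclass, field
from statistics import fmean
from typing import Iterable, List, Sequence


@dataclass
class LogProbSupervisor:
    """Hypothetical early-stopping supervisor (illustrative, not AgentStop's code)."""

    threshold: float = -1.5  # assumed cutoff on mean token log-probability
    window: int = 3          # assumed number of recent steps to average over
    min_steps: int = 2       # never stop before this many steps are observed
    step_scores: List[float] = field(default_factory=list)

    def observe(self, token_logprobs: Sequence[float]) -> None:
        """Record the mean token log-probability of one agent step."""
        if not token_logprobs:
            raise ValueError("a step must contain at least one token")
        self.step_scores.append(fmean(token_logprobs))

    def should_stop(self) -> bool:
        """Predict failure when recent steps look too uncertain on average."""
        if len(self.step_scores) < self.min_steps:
            return False
        recent = self.step_scores[-self.window:]
        return fmean(recent) < self.threshold


def run_with_supervisor(
    steps: Iterable[Sequence[float]], supervisor: LogProbSupervisor
) -> int:
    """Consume per-step token log-probabilities; return how many steps ran."""
    executed = 0
    for token_logprobs in steps:
        executed += 1
        supervisor.observe(token_logprobs)
        if supervisor.should_stop():
            break  # terminate early; remaining steps are never computed
    return executed


if __name__ == "__main__":
    # Made-up log-probabilities for a trajectory that becomes uncertain.
    shaky = [[-0.2, -0.1], [-2.5, -3.0], [-2.8, -3.2], [-0.1, -0.1]]
    ran = run_with_supervisor(shaky, LogProbSupervisor())
    print(f"executed {ran} of {len(shaky)} steps")
```

In a real agent, the per-step log probabilities would come from the local model's generation output. The trade-off sits in the cutoff: a stricter threshold saves more energy but risks terminating trajectories that would have succeeded, which is the utility drop the paper quantifies.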

Why This Matters
As the industry races to deploy AI agents on local devices, from laptops to phones, the energy cost of failed trajectories becomes a first-order constraint. AgentStop demonstrates that simple, cheap signals can recover up to a fifth of that wasted energy while keeping the utility drop below 5%. This contrasts with more complex approaches like SAE-based probes (a related arXiv paper from May 7) that predict tool failures but require additional model overhead.

The approach is open-source, with code and data available at https://github.com/brave-experiments/AgentStop.
What to watch
Watch for follow-up work extending AgentStop to cloud-based agents, where early termination could cut API costs in the same way, and for integration into consumer agent platforms like Brave's Leo or local LLM runners like Ollama.









