gentic.news — AI News Intelligence Platform

How to Automatically Verify Agent Allowlists Match Behavior with a

Use a body-vs-allowlist cross-checker in your agent validator, versioned with each tightening rule, so your 317+ agents declare what they actually do — not what you hope they do.

5d ago · 3 min read · AI-Generated
Source: dev.to via devto_claudecode (Corroborated)
How do I automatically verify my Claude Code agent's tool allowlist matches what its body actually does?

Use a schema-versioned validator that reads both frontmatter and body, strips code blocks, then cross-checks every tool mentioned in the body against the allowlist. Bump the schema version with each tightening rule so drift is auditable.

TL;DR

Your agent tool allowlist is just a comment until a runtime gate cross-checks it against the body's actual tool calls.

The Problem: Your Allowlist Is a Comment Until It's Checked

Every Claude Code agent has two surfaces: the frontmatter (tools, name, description) and the body (the prose that tells the agent what to do). The tools field is a runtime allowlist — tools not listed are blocked. But nothing proves the body's actual tool calls match that list.
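
To make the two surfaces concrete, here is a minimal sketch of separating them. It assumes the agent file opens with a frontmatter block delimited by `---` lines and that `tools` is a single comma-separated line; `split_agent_file` and `declared_tools` are hypothetical helper names, not functions from the source repo.

```python
def split_agent_file(text: str) -> tuple[dict[str, str], str]:
    """Split an agent file into (frontmatter fields, body prose)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block at all
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # opening delimiter was never closed
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return fields, body


def declared_tools(fields: dict[str, str]) -> set[str]:
    """Parse a comma-separated `tools` field into a set of tool names."""
    return {t.strip() for t in fields.get("tools", "").split(",") if t.strip()}


sample = "---\nname: analytics\ntools: Read, Grep\n---\nCollect analytics with umami tools.\n"
fields, body = split_agent_file(sample)
print(sorted(declared_tools(fields)))  # ['Grep', 'Read']
```

A real validator would likely use a YAML parser instead, since this line-based split does not handle multi-line values or list-style `tools` entries.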

Two silent failures emerge:

  1. Under-declaration: The body says "collect analytics with umami tools," but tools only lists Read. Every umami call is blocked at runtime. The agent looks valid, passes schema checks, and silently can't do its job.

  2. Over-declaration: The allowlist grants tools the body never uses. The agent's blast radius is wider than its job requires, which is exactly the wrong thing for an allowlist to do.

Neither shows up in a frontmatter-only validator. You can confirm that tools is a well-formed list of real tool names and still have an allowlist that has nothing to do with what the agent does.

The Fix: A Schema-Versioned Body-vs-Allowlist Cross-Check

The solution from the source repo (317 agents) is a validator that reads both frontmatter and body, then cross-checks them. Here's the approach:

Step 1: Strip Code Blocks First

Code blocks contain example tool calls that aren't actual behavior. Strip them before analysis:

body = strip_fenced_code(body)  # Remove ``` blocks so examples don't read as tool calls
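
The source does not show how strip_fenced_code is implemented. One possible sketch, assuming fences are lines of three or more backticks and that an unclosed fence should be left untouched, uses a single regular expression:

```python
import re

# An opening line of three or more backticks (plus optional language tag),
# then everything up to a closing line that starts with the same backtick run.
_FENCED_BLOCK = re.compile(
    r"^(`{3,})[^\n]*\n.*?^\1[ \t]*$\n?",
    re.MULTILINE | re.DOTALL,
)


def strip_fenced_code(body: str) -> str:
    """Remove fenced code blocks so example calls are not counted as behavior."""
    return _FENCED_BLOCK.sub("", body)


FENCE = "`" * 3  # built at runtime to keep this example free of nested fences
demo = (
    "Use Read to load files.\n"
    + FENCE + "python\n"
    + "call(mcp__umami__get_stats)\n"
    + FENCE + "\n"
    + "Then summarize.\n"
)
print(strip_fenced_code(demo))
```

Indented code blocks and inline code spans are not handled here; whether they should count as behavior is a policy decision for the validator.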

Step 2: Cross-Check Tools Mentioned in Body Against Allowlist

def check_agent_body_vs_allowlist(frontmatter, body):
    """Return (undeclared, overgranted) tool sets for one agent."""
    body = strip_fenced_code(body)        # Ignore example calls inside code fences
    declared = mcp_tools_in(frontmatter)  # Tools in the allowlist
    used = mcp_tools_in(body)             # Tools mentioned in body prose

    undeclared = used - declared          # Body uses tools not in allowlist
    overgranted = declared - used         # Allowlist grants tools body never uses

    return undeclared, overgranted
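
The mcp_tools_in helper is also not shown in the source. A rough sketch follows, assuming MCP tools are written as `mcp__<server>__<tool>` tokens in both the frontmatter text and the body; the umami and github tool names below are made up for illustration.

```python
import re

# Assumption: MCP tools appear as mcp__<server>__<tool> tokens.
_MCP_TOOL = re.compile(r"\bmcp__[A-Za-z0-9_-]+?__[A-Za-z0-9_-]+")


def mcp_tools_in(text: str) -> set[str]:
    """Collect every MCP-style tool token mentioned in a piece of text."""
    return set(_MCP_TOOL.findall(text))


# Hypothetical agent: the allowlist and the body disagree in both directions.
frontmatter = "tools: Read, mcp__umami__get_stats, mcp__umami__get_events"
body = (
    "Collect analytics with mcp__umami__get_stats and "
    "file issues via mcp__github__create_issue."
)

declared = mcp_tools_in(frontmatter)
used = mcp_tools_in(body)

undeclared = used - declared    # would be blocked at runtime
overgranted = declared - used   # granted but never mentioned

print(sorted(undeclared))   # ['mcp__github__create_issue']
print(sorted(overgranted))  # ['mcp__umami__get_events']
```

As its name suggests, a helper like this only sees MCP tools. Built-in tools such as Read would need a separate list of known names, and a prose mention like "umami tools" without an explicit token would not be detected by this pattern.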

Step 3: Version the Gate

Every change to what the validator accepts bumps SCHEMA_VERSION and lands a changelog entry. When an agent that passed last month fails today, the version delta tells you which rule moved and where it was signed off.

The progression from the source:

  • Schema 3.10.0: Flipped from lenient to kernel-strict: name, description, tools, model, color, version, author, and tags became required, and a missing field is now an error
  • Schema 3.11.0: Added the body-vs-allowlist consistency check
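
One way to make the version delta queryable is to keep the changelog as data next to the version constant. This is a sketch only: the two version numbers come from the progression above, while the rule ids, the `CHANGELOG` layout, and `rules_added_between` are hypothetical.

```python
SCHEMA_VERSION = "3.11.0"

# (version that introduced the rule, hypothetical rule id, changelog note)
CHANGELOG = [
    ("3.10.0", "required-fields", "frontmatter fields are required, missing ones are errors"),
    ("3.11.0", "body-vs-allowlist", "body tool mentions must match the tools allowlist"),
]


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '3.10.0' into (3, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def rules_added_between(old: str, new: str) -> list[str]:
    """Rule ids introduced after `old`, up to and including `new`."""
    lo, hi = parse_version(old), parse_version(new)
    return [rule for ver, rule, _ in CHANGELOG if lo < parse_version(ver) <= hi]


# An agent last validated under 3.10.0 that fails today:
print(rules_added_between("3.10.0", SCHEMA_VERSION))  # ['body-vs-allowlist']
```

Comparing parsed tuples rather than raw strings matters here, because as strings "3.9.0" sorts after "3.10.0".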

Step 4: Land the Gate with the Fleet

CI goes red until all agents are remediated in the same branch. The gate and the fleet land together. A gate you merge before the fleet conforms is just a broken main with a TODO.
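
A fleet-wide gate of this kind could be as small as the sketch below. It assumes the declared and used tool sets have already been extracted per agent; the `fleet` mapping, `fleet_failures`, and `gate_exit_code` are illustrative names, not the source repo's CI script.

```python
import sys


def fleet_failures(fleet: dict[str, tuple[set[str], set[str]]]) -> list[str]:
    """Return one message per violation across all agents, in name order."""
    failures = []
    for name in sorted(fleet):
        declared, used = fleet[name]
        undeclared = used - declared
        overgranted = declared - used
        if undeclared:
            failures.append(f"{name}: undeclared {sorted(undeclared)}")
        if overgranted:
            failures.append(f"{name}: overgranted {sorted(overgranted)}")
    return failures


def gate_exit_code(fleet: dict[str, tuple[set[str], set[str]]]) -> int:
    """Report every violation on stderr; return 1 if any agent fails, else 0."""
    failures = fleet_failures(fleet)
    for message in failures:
        print(message, file=sys.stderr)
    return 1 if failures else 0


# Hypothetical two-agent fleet: name -> (declared tools, tools used in body)
fleet = {
    "reporter": ({"Read"}, {"Read"}),
    "analytics": ({"Read"}, {"Read", "mcp__umami__get_stats"}),
}
exit_code = gate_exit_code(fleet)
```

A CI entry point would pass the returned code to sys.exit, so a single non-conforming agent keeps the branch red.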

Use a two-pass approach:

  1. Green pass: Dumb, deterministic remediation — every agent gets a full canonical tool set, scaffolded tags, version: 1.0.0. Nothing clever, just enough to satisfy the new required set.
  2. A-grade pass: Narrow to least privilege — rewrite description into capability + "Use when…" + trigger phrases, replace scaffolded tags with real lowercase topic tags, narrow tools from full-canonical to least-privilege (median: 5 tools, zero agents retaining the full 10-tool set).
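
The tool-related half of the two passes can be sketched as follows. The canonical set here is a placeholder (the source mentions a 10-tool set but does not list it), the scaffold tag is invented, and the description and tag rewrites of the A-grade pass are editorial work that this sketch leaves out.

```python
# Placeholder canonical set; the real 10-tool list is not given in the source.
CANONICAL_TOOLS = {"Read", "Write", "Edit", "Bash", "Grep", "Glob"}


def green_pass(fields: dict) -> dict:
    """Pass 1: deterministic scaffolding that satisfies the required set."""
    out = dict(fields)                       # leave the input untouched
    out.setdefault("version", "1.0.0")
    out.setdefault("tags", ["scaffold"])     # hypothetical scaffold tag
    out["tools"] = sorted(CANONICAL_TOOLS)   # full canonical grant
    return out


def narrow_tools(fields: dict, used: set[str]) -> dict:
    """Pass 2 (tools only): keep just the granted tools the body uses."""
    out = dict(fields)
    out["tools"] = sorted(set(fields["tools"]) & used)
    return out


agent = {"name": "analytics", "description": "Collects analytics"}
green = green_pass(agent)
a_grade = narrow_tools(green, used={"Read", "Grep"})
print(a_grade["tools"])  # ['Grep', 'Read']
```

Keeping the first pass free of judgment calls means it can run over the whole fleet mechanically, while the narrowing step is where per-agent review happens.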

Why This Matters for Claude Code Users

If you maintain multiple agent files (in a team, repo, or organization)




AI-assisted reporting. Generated by gentic.news from multiple verified sources, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.
