A new arXiv preprint argues that agent skills are untrusted code until verified, and that the runtime should enforce verification by default rather than infer trust from origin or signature.
The paper, flagged by AI researcher @omarsar0, proposes a security model in which agent runtimes treat skills as first-class deployment artifacts with explicit verification gates. The core claim: without verification, human-in-the-loop (HITL) review must fire on every irreversible call, which at scale degrades into rubber-stamping, with operators approving requests they cannot realistically audit.
One unique take: This mirrors decades of software supply-chain security lessons — npm, PyPI, and Docker Hub all learned that inferring trust from a signature or registry origin invites attacks. The paper's structural insight is that agent skill libraries are the next attack surface, and the same pattern of "trust but verify" must be encoded at the runtime level, not left to developers.
The authors propose a verification gate separate from execution: even skills that are signed and cleared must pass a runtime policy check before they receive execution permissions. This decouples authentication (who published the skill) from authorization (what the skill may do), a distinction many current agent frameworks blur.
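A minimal sketch of that decoupling, assuming a hypothetical runtime; the `Policy` and `Skill` shapes, the permission strings, and the `admit` function are illustrative, not the paper's API:

```python
from dataclasses import dataclass

@dataclass
class Policy:
    # Permissions the operator's policy allows (hypothetical vocabulary).
    allowed: frozenset = frozenset({"read:workspace"})

@dataclass
class Skill:
    name: str
    signature_valid: bool               # authentication: proves origin only
    requested: frozenset = frozenset()  # authorization: what it wants to do

def admit(skill: Skill, policy: Policy) -> bool:
    """Gate a skill before granting execution permissions."""
    # Step 1: authentication. A valid signature identifies the publisher,
    # nothing more; it says nothing about what the code does.
    if not skill.signature_valid:
        return False
    # Step 2: authorization. Even a signed-and-cleared skill must fit the
    # policy; this is the gate the paper argues runtimes skip today.
    return skill.requested <= policy.allowed

# A signed skill that over-requests is still denied execution.
skill = Skill("fetch_docs", signature_valid=True,
              requested=frozenset({"read:workspace", "net:egress"}))
print(admit(skill, Policy()))  # False: net:egress is not in the policy
```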
According to @omarsar0, "If you ship agent skills, your runtime is treating signed-and-cleared skills as trusted by default." The paper calls for a SKILL.md manifest standard, analogous to Dockerfile best practices, in which each skill declares its permissions, data access, and side effects.
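The summary does not reproduce the paper's manifest schema; a hypothetical SKILL.md covering the three declared categories, with illustrative field names and values, might look like this:

```markdown
---
# Hypothetical front matter; field names are illustrative, not the paper's schema.
name: fetch_docs
version: 0.3.1
permissions: [read:workspace, net:egress:docs.example.com]
data_access: ["./docs/**"]
side_effects: ["cache writes under ./tmp/fetch_docs"]
---
# fetch_docs
Fetches project documentation. The runtime denies any capability
not declared above, regardless of signature.
```

The Dockerfile analogy holds in one respect: the manifest is a static, reviewable declaration the runtime can check against policy before execution.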
Key facts:
- The paper defines skills as "untrusted code until verified"
- HITL scaling degrades to rubber-stamping without verification gates
- Trust inferred from a signature is the same pattern that enabled SolarWinds and npm malware
- The proposal includes a SKILL.md manifest for permission declaration
What to watch: Whether major agent frameworks (LangChain, AutoGPT, CrewAI) adopt verification gates in upcoming releases, possibly as early as Q1 2026. The first public incident of a signed-but-unverified skill exfiltrating data would likely trigger industry-wide policy updates and accelerate adoption.