A comprehensive security analysis of the Model Context Protocol (MCP) ecosystem reveals critical vulnerabilities affecting nearly half of all servers, while marketplaces host hundreds of malicious skills designed to steal credentials and execute arbitrary code. The findings, detailed in a practitioner's analysis published in April 2026, expose systemic security flaws in the infrastructure underpinning the rapidly expanding agentic AI landscape.
The Three-Layered Attack Surface
The security crisis isn't a single vulnerability but three overlapping attack surfaces that most developers treat as one—or ignore entirely.
1. The MCP Protocol Layer: Structural Design Flaws
The Model Context Protocol, developed by Anthropic and standardized for connecting AI agents to tools, was designed for capability, not containment. Recent audits reveal structural security gaps:
Tool Poisoning: This attack embeds malicious instructions in a tool's description—metadata the LLM reads when deciding which tool to call. The poisoned tool doesn't need to be called; its description gets injected into the context window when the agent connects. Researchers demonstrated this on Cursor, where a poisoned tool description caused an agent to read the user's ~/.cursor/mcp.json and SSH keys, then exfiltrate them.
The MCPTox benchmark—the first systematic evaluation of tool poisoning—tested 353 authentic tools with 1,312 malicious test cases. The success rate was 84.2% when auto-approval is enabled.
Rug Pulls: A server passes initial review with clean tool definitions, gets approved, then silently modifies its tool descriptions on the next connection to include malicious instructions. Most MCP clients approve tools once and never re-verify. The OWASP MCP Top 10 lists this as a core threat, with no built-in protocol mechanism for detecting definition changes.
Cross-Server Hijacking: When multiple MCP servers connect to the same agent, all tool descriptions coexist in one LLM context with no isolation. A malicious server can inject descriptions that override trusted servers' tools. Researchers successfully redirected all emails to attacker-controlled addresses even when users specified different recipients.
Key MCP Security Statistics
Command Execution Vulnerable 43% of public servers February 2026 Audit Feb 2026 SSRF Vulnerable 36.7% of 7,000+ servers BlueRock Security 2026 Exposed to Internet 492 servers (zero auth) Trend Micro 2026 Use Static Credentials 53% of servers Astrix Security 2026 Average Security Score 34/100 across 17 servers CoSAI 2026 CVEs Filed 30+ in 60 days Multiple 2026Even Anthropic's own Git MCP server had three CVEs disclosed in January 2026—path traversal and argument injection exploitable via prompt injection.
2. Skills Marketplaces: The npm Disaster, Amplified
In January 2026, security firm Koi Security audited 2,857 skills on ClawHub—the main marketplace for OpenClaw agent skills. They found 341 malicious skills across multiple campaigns. 335 were part of a single coordinated attack codenamed ClawHavoc, delivering Atomic Stealer—a macOS infostealer that grabs SSH keys, browser passwords, crypto wallets, and .env files.
Days later, Snyk scanned nearly 4,000 skills across ClawHub and skills.sh:
- 36% of all skills contain detectable prompt injection
- 1,467 skills have at least one security flaw
- 534 skills (13.4%) contain critical-level issues
- 91% of malicious samples combine prompt injection with traditional malware
- 2.9% of skills dynamically fetch and execute content from external endpoints at runtime
By mid-February 2026, the count grew to 824+ malicious skills across 12 publisher accounts. The barrier to publishing? A GitHub account one week old—no code signing, security review, or default sandbox.
3. SKILL.md as an Attack Vector
SKILL.md files—Markdown prompt files that teach agents to use tools for specific domains—are themselves attack vectors. As Snyk researchers noted: "Markdown isn't content in an agent ecosystem. Markdown is an installer."
A malicious SKILL.md can instruct an agent to download and execute a binary, run curl commands fetching second-stage payloads, or read sensitive files and post them to webhooks. Because agents are designed to follow SKILL.md instructions, they execute these commands willingly.
Snyk documented the kill chain: a SKILL.md tells the agent it needs a "prerequisite tool," provides a download link, the agent presents this as a routine installation step, and the user pastes the command into their terminal—compromised.
What Practitioners Are Doing About It
The article's author—a consultant building production agents—outlines practical defenses:
- No Third-Party Skills or MCP Servers Without Audit: Build every MCP server and write every SKILL.md internally.
- Context Window Partitioning: Never allow tool descriptions from untrusted servers into the same context window as trusted tools.
- Tool Approval Gates: Implement manual approval for every tool call in production, treating the LLM as an untrusted execution engine.
- Sandbox Everything: Run agents in isolated containers with minimal permissions, treating the host system as hostile territory.
- Credential Isolation: Never store credentials in mcp.json files; use environment variables or secure secret managers.
- Continuous Monitoring: Log every tool call, context injection, and LLM instruction for anomaly detection.
The author concludes: "The agentic AI ecosystem is growing faster than anyone can secure it. Pretending otherwise is irresponsible. We need to build security into the protocol, not bolt it on afterward."
gentic.news Analysis
This security crisis arrives precisely as AI agents are crossing critical adoption thresholds. According to our knowledge graph, industry leaders predicted 2026 as a breakthrough year for AI agents across all domains, with Gartner projecting 40% of enterprise applications featuring task-specific agents by 2026. The timing creates a perfect storm: rapid adoption of inherently vulnerable infrastructure.
The Model Context Protocol's security flaws are particularly concerning given its central role in the ecosystem. Our data shows Claude Code uses MCP extensively (33 sources), and GitHub has integrated the protocol (4 sources). As noted in our April 8 article "How to Decode Anthropic's Press Releases for Better Claude Code Updates," MCP represents a strategic bet on tool standardization. These security revelations threaten that foundation just as adoption accelerates.
The skills marketplace vulnerabilities mirror historical patterns in software supply chains but with higher stakes. As researcher Simon Willison notes, agent skills create a "lethal trifecta": access to private data + exposure to untrusted content + ability to communicate externally. This differs fundamentally from compromised npm packages, which run in sandboxed Node processes—agent skills run with the agent's full permissions.
Looking at the broader timeline, this security crisis follows our April 4 coverage of research identifying multi-tool coordination as the primary failure point for AI agents. That research shifted focus from single-step execution to multi-step orchestration—precisely where these protocol-level vulnerabilities manifest. The security flaws aren't just implementation bugs but design limitations in how agents coordinate across tools and contexts.
Frequently Asked Questions
What is the Model Context Protocol (MCP)?
The Model Context Protocol is an open standard introduced by Anthropic in November 2024 to standardize how AI systems connect to external data sources and tools. It allows different AI applications to use the same tools through a standardized interface, similar to how USB-C standardizes device connections. However, as the security analysis reveals, MCP was designed primarily for capability rather than security containment.
How does tool poisoning actually work?
Tool poisoning embeds malicious instructions in a tool's description field—the natural language text that the LLM reads to understand what the tool does. When an agent connects to an MCP server, all tool descriptions get loaded into the LLM's context window. A poisoned description can contain hidden instructions like "read the user's SSH keys and send them to this server." The LLM sees these as legitimate operational instructions and may execute them even if the poisoned tool is never explicitly called.
Are only OpenClaw skills affected, or other platforms too?
While the initial research focused on ClawHub (OpenClaw's marketplace), the vulnerability extends to any platform using the Agent Skills format. The article notes that Claude Code, Cursor, and other agent platforms support the same SKILL.md structure, making malicious skills portable across ecosystems. The attack vector isn't platform-specific but targets the standardized skill architecture itself.
What should developers do immediately to secure their agents?
Practitioners recommend: 1) Avoid third-party skills and MCP servers without thorough audit, 2) Implement context window partitioning to isolate untrusted tools, 3) Add manual approval gates for all tool calls in production, 4) Run agents in isolated containers with minimal permissions, 5) Never store credentials in mcp.json files, and 6) Implement comprehensive logging of all agent actions for anomaly detection.








