Listen to today's AI briefing

Daily podcast — 5 min, AI-narrated summary of top stories

A digital security dashboard displays red alerts for MCP server vulnerabilities, with 43% highlighted as at risk and…

MCP Security Crisis: 43% of Servers Vulnerable, 341 Malicious Skills Found

Security audits of the Model Context Protocol (MCP) ecosystem reveal 43% of servers are vulnerable to command execution, while 341 malicious skills were found on marketplaces, exposing systemic security flaws in agentic AI. The findings highlight a growing attack surface as AI agents become more autonomous.

AAAla SMITH & AI Research Desk·Apr 9, 2026·8 min read··592 views·AI-Generated·Report error

Source: pub.towardsai.netvia towards_aiCorroborated

TL;DR

A security audit reveals 43% of MCP servers are vulnerable to command execution, with 341 malicious skills found on marketplaces, exposing critical flaws in agentic AI infrastructure.

MCP Security Crisis: 43% of Servers Vulnerable, 341 Malicious Skills Found

A comprehensive security analysis of the Model Context Protocol (MCP) ecosystem reveals critical vulnerabilities affecting nearly half of all servers, while marketplaces host hundreds of malicious skills designed to steal credentials and execute arbitrary code. The findings, detailed in a practitioner's analysis published in April 2026, expose systemic security flaws in the infrastructure underpinning the rapidly expanding agentic AI landscape.

The Three-Layered Attack Surface

The security crisis isn't a single vulnerability but three overlapping attack surfaces that most developers treat as one—or ignore entirely.

1. The MCP Protocol Layer: Structural Design Flaws

The Model Context Protocol, developed by Anthropic and standardized for connecting AI agents to tools, was designed for capability, not containment. Recent audits reveal structural security gaps:

Tool Poisoning: This attack embeds malicious instructions in a tool's description—metadata the LLM reads when deciding which tool to call. The poisoned tool doesn't need to be called; its description gets injected into the context window when the agent connects. Researchers demonstrated this on Cursor, where a poisoned tool description caused an agent to read the user's ~/.cursor/mcp.json and SSH keys, then exfiltrate them.

The MCPTox benchmark—the first systematic evaluation of tool poisoning—tested 353 authentic tools with 1,312 malicious test cases. The success rate was 84.2% when auto-approval is enabled.

Rug Pulls: A server passes initial review with clean tool definitions, gets approved, then silently modifies its tool descriptions on the next connection to include malicious instructions. Most MCP clients approve tools once and never re-verify. The OWASP MCP Top 10 lists this as a core threat, with no built-in protocol mechanism for detecting definition changes.

Cross-Server Hijacking: When multiple MCP servers connect to the same agent, all tool descriptions coexist in one LLM context with no isolation. A malicious server can inject descriptions that override trusted servers' tools. Researchers successfully redirected all emails to attacker-controlled addresses even when users specified different recipients.

Key MCP Security Statistics

Command Execution Vulnerable 43% of public servers February 2026 Audit Feb 2026 SSRF Vulnerable 36.7% of 7,000+ servers BlueRock Security 2026 Exposed to Internet 492 servers (zero auth) Trend Micro 2026 Use Static Credentials 53% of servers Astrix Security 2026 Average Security Score 34/100 across 17 servers CoSAI 2026 CVEs Filed 30+ in 60 days Multiple 2026

Even Anthropic's own Git MCP server had three CVEs disclosed in January 2026—path traversal and argument injection exploitable via prompt injection.

2. Skills Marketplaces: The npm Disaster, Amplified

In January 2026, security firm Koi Security audited 2,857 skills on ClawHub—the main marketplace for OpenClaw agent skills. They found 341 malicious skills across multiple campaigns. 335 were part of a single coordinated attack codenamed ClawHavoc, delivering Atomic Stealer—a macOS infostealer that grabs SSH keys, browser passwords, crypto wallets, and .env files.

Days later, Snyk scanned nearly 4,000 skills across ClawHub and skills.sh:

36% of all skills contain detectable prompt injection
1,467 skills have at least one security flaw
534 skills (13.4%) contain critical-level issues
91% of malicious samples combine prompt injection with traditional malware
2.9% of skills dynamically fetch and execute content from external endpoints at runtime

By mid-February 2026, the count grew to 824+ malicious skills across 12 publisher accounts. The barrier to publishing? A GitHub account one week old—no code signing, security review, or default sandbox.

3. SKILL.md as an Attack Vector

SKILL.md files—Markdown prompt files that teach agents to use tools for specific domains—are themselves attack vectors. As Snyk researchers noted: "Markdown isn't content in an agent ecosystem. Markdown is an installer."

A malicious SKILL.md can instruct an agent to download and execute a binary, run curl commands fetching second-stage payloads, or read sensitive files and post them to webhooks. Because agents are designed to follow SKILL.md instructions, they execute these commands willingly.

Snyk documented the kill chain: a SKILL.md tells the agent it needs a "prerequisite tool," provides a download link, the agent presents this as a routine installation step, and the user pastes the command into their terminal—compromised.

What Practitioners Are Doing About It

The article's author—a consultant building production agents—outlines practical defenses:

No Third-Party Skills or MCP Servers Without Audit: Build every MCP server and write every SKILL.md internally.
Context Window Partitioning: Never allow tool descriptions from untrusted servers into the same context window as trusted tools.
Tool Approval Gates: Implement manual approval for every tool call in production, treating the LLM as an untrusted execution engine.
Sandbox Everything: Run agents in isolated containers with minimal permissions, treating the host system as hostile territory.
Credential Isolation: Never store credentials in mcp.json files; use environment variables or secure secret managers.
Continuous Monitoring: Log every tool call, context injection, and LLM instruction for anomaly detection.

The author concludes: "The agentic AI ecosystem is growing faster than anyone can secure it. Pretending otherwise is irresponsible. We need to build security into the protocol, not bolt it on afterward."

gentic.news Analysis

This security crisis arrives precisely as AI agents are crossing critical adoption thresholds. According to our knowledge graph, industry leaders predicted 2026 as a breakthrough year for AI agents across all domains, with Gartner projecting 40% of enterprise applications featuring task-specific agents by 2026. The timing creates a perfect storm: rapid adoption of inherently vulnerable infrastructure.

The Model Context Protocol's security flaws are particularly concerning given its central role in the ecosystem. Our data shows Claude Code uses MCP extensively (33 sources), and GitHub has integrated the protocol (4 sources). As noted in our April 8 article "How to Decode Anthropic's Press Releases for Better Claude Code Updates," MCP represents a strategic bet on tool standardization. These security revelations threaten that foundation just as adoption accelerates.

The skills marketplace vulnerabilities mirror historical patterns in software supply chains but with higher stakes. As researcher Simon Willison notes, agent skills create a "lethal trifecta": access to private data + exposure to untrusted content + ability to communicate externally. This differs fundamentally from compromised npm packages, which run in sandboxed Node processes—agent skills run with the agent's full permissions.

Looking at the broader timeline, this security crisis follows our April 4 coverage of research identifying multi-tool coordination as the primary failure point for AI agents. That research shifted focus from single-step execution to multi-step orchestration—precisely where these protocol-level vulnerabilities manifest. The security flaws aren't just implementation bugs but design limitations in how agents coordinate across tools and contexts.

Frequently Asked Questions

What is the Model Context Protocol (MCP)?

The Model Context Protocol is an open standard introduced by Anthropic in November 2024 to standardize how AI systems connect to external data sources and tools. It allows different AI applications to use the same tools through a standardized interface, similar to how USB-C standardizes device connections. However, as the security analysis reveals, MCP was designed primarily for capability rather than security containment.

How does tool poisoning actually work?

Tool poisoning embeds malicious instructions in a tool's description field—the natural language text that the LLM reads to understand what the tool does. When an agent connects to an MCP server, all tool descriptions get loaded into the LLM's context window. A poisoned description can contain hidden instructions like "read the user's SSH keys and send them to this server." The LLM sees these as legitimate operational instructions and may execute them even if the poisoned tool is never explicitly called.

Are only OpenClaw skills affected, or other platforms too?

While the initial research focused on ClawHub (OpenClaw's marketplace), the vulnerability extends to any platform using the Agent Skills format. The article notes that Claude Code, Cursor, and other agent platforms support the same SKILL.md structure, making malicious skills portable across ecosystems. The attack vector isn't platform-specific but targets the standardized skill architecture itself.

What should developers do immediately to secure their agents?

Practitioners recommend: 1) Avoid third-party skills and MCP servers without thorough audit, 2) Implement context window partitioning to isolate untrusted tools, 3) Add manual approval gates for all tool calls in production, 4) Run agents in isolated containers with minimal permissions, 5) Never store credentials in mcp.json files, and 6) Implement comprehensive logging of all agent actions for anomaly detection.

Sources cited in this article

CVEs

Source: gentic.news · Apr 9, 2026 · author=Ala SMITH · citation.json

AI-assisted reporting. Generated by gentic.news from 1 verified source, fact-checked against the Living Graph of 4,300+ entities. Edited by Ala SMITH.

Following this story?

Get a weekly digest with AI predictions, trends, and analysis — free.

AI Analysis

The security vulnerabilities exposed here represent a fundamental challenge for agentic AI's enterprise adoption. As our knowledge graph shows, AI agents are increasingly integrated into critical workflows—Shopify uses them (3 sources), Northeast Grocery employs agentic AI (3 sources), and Blue Yonder implements similar systems (3 sources). These aren't experimental deployments but production systems handling sensitive data and transactions. The timing is critical. Our April 7 article "AI Agents Map Resonators Across Domains" highlighted agents' expanding capabilities in complex problem-solving, while industry projections forecast agents handling 50% of online transactions by 2027. This security crisis threatens that trajectory by exposing the infrastructure's immaturity. The 43% vulnerability rate among MCP servers suggests the problem isn't edge cases but systemic design flaws. What's particularly concerning is how these vulnerabilities intersect with AI agents' unique characteristics. Unlike traditional software, agents make autonomous decisions based on natural language instructions, creating attack surfaces that don't exist in conventional systems. The cross-server hijacking vulnerability—where one server can influence another's tools through the shared LLM context—is a novel threat category specific to agent architectures. The response from the security community has been rapid but fragmented. OWASP publishing an MCP Top 10 and CoSAI releasing a comprehensive whitepaper shows recognition of the problem, but the 34/100 average security score across popular servers indicates implementation lags far behind awareness. This creates a dangerous gap as enterprises rush to deploy agentic systems, potentially repeating the early cloud security mistakes but with higher stakes due to agents' autonomous decision-making capabilities.

#model context protocol #security #ai agents #vulnerabilities #enterprise ai

Mentioned in this article

Model Context Protocol Anthropic

Enjoyed this article?