OpenClaw: The AI Agent Backdoor – A Supply-Chain Scanner’s Blind Spot

Published

May 5, 2026

One command turns any open-source repo into an AI agent backdoor. OpenClaw proved no supply-chain scanner has a detection category for it.

Just two months ago, researchers at the Data Intelligence Lab at the University of Hong Kong introduced CLI-Anything, a new state-of-the-art tool that analyzes any repo’s source code and generates a structured command line interface (CLI) that AI coding agents can operate with a single command.

Claude Code, Codex, OpenClaw, Cursor, and GitHub Copilot CLI are all supported, and since its launch in March, CLI‑Anything has climbed to more than 30,000 GitHub stars.

But the same mechanism that makes software agent-native opens the door to agent-level poisoning. The attack community is already discussing the implications on X and security forums, translating CLI-Anything’s architecture into offensive playbooks.

The security problem is not what CLI-Anything does. It is what CLI-Anything represents.

CLI-Anything generates SKILL.md files, the same instruction-layer artifacts that Snyk’s ToxicSkills research found laced with 76 confirmed malicious payloads across ClawHub and skills.sh in February 2026. A poisoned skill definition does not trigger a CVE and never appears in a software bill of materials (SBOM). No mainstream security scanner has a detection category for malicious instructions embedded in agent skill definitions, because the category simply did not exist eighteen months ago.

Cisco confirmed the gap in April. “Traditional application security tools were not designed for this,” Cisco’s engineering team wrote in a blog post announcing its AI Agent Security Scanner for IDEs. “SAST [static application security testing] scanners analyze source code syntax. SCA [software composition analysis] tools check dependency versions. Neither understands the semantic layer where MCP [Model Context Protocol] tool descriptions, agent prompts, and skill definitions operate.”

Merritt Baer, CSO of Enkrypt AI and former Deputy CISO at Amazon Web Services (AWS), told VentureBeat in an exclusive interview: “SAST and SCA were built for code and dependencies. They don’t inspect instructions.”

This is not a single-vendor vulnerability. It is a structural gap in how the entire security industry monitors software supply chains. This is the pre-exploitation window. CLI-Anything is live, the attack community is discussing it, and security directors who act now get ahead of the first incident report.

The integration layer no stack can see

Traditional supply-chain security operates on two layers. The code layer is where SAST works, scanning source files for insecure patterns, injection flaws, and hardcoded secrets. The dependency layer is where SCA works, checking package versions against known vulnerabilities, generating SBOMs, and flagging outdated libraries.

Agent bridge tools like CLI-Anything, MCP connectors, Cursor rules files, and Claude Code skills operate on a third layer between the other two. Call it the agent integration layer: configuration files, skill definitions, and natural-language instruction sets tell an AI agent what software can do and how to operate it. None of it looks like code. All of it executes like code.
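To make the agent integration layer concrete, here is a minimal Python sketch that walks a repository and lists files that typically live in that layer. The file-name patterns are assumptions chosen for illustration (only SKILL.md is named in the reporting above), and `find_agent_files` is a hypothetical helper, not part of any vendor tool.

```python
from pathlib import Path

# Assumed, illustrative file names for agent-layer artifacts.
# Only SKILL.md comes from the article; adjust the rest to the tools you run.
AGENT_FILE_PATTERNS = ("SKILL.md", ".cursorrules", "mcp.json", ".mcp.json")


def find_agent_files(root: str) -> list[Path]:
    """Return files under root whose name matches a known agent-layer pattern."""
    root_path = Path(root)
    hits = [
        path
        for path in root_path.rglob("*")
        if path.is_file() and path.name in AGENT_FILE_PATTERNS
    ]
    return sorted(hits)


if __name__ == "__main__":
    import sys

    # Scan the directory given on the command line, or the current directory.
    for path in find_agent_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

A listing like this only shows where instruction files sit. It says nothing about whether their contents are safe.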

Carter Rees, VP of AI at Reputation, told VentureBeat in an exclusive interview: “Modern LLMs [large language models] rely on third-party plugins, introducing supply chain vulnerabilities where compromised tools can inject malicious data into the conversation flow, bypassing internal safety training.”

Researchers at Griffith University, Nanyang Technological University, the University of New South Wales, and the University of Tokyo documented the attack chain in an April paper, “Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems.” The team introduced Document-Driven Implicit Payload Execution (DDIPE), a technique that embeds malicious logic inside code examples within skill documentation.

Across four agent frameworks and five large language models, DDIPE achieved bypass rates between 11.6% and 33.5%. Static analysis caught most samples, but 2.5% evaded all four detection layers. Responsible disclosure led to four confirmed vulnerabilities and two vendor fixes.

The kill chain security leaders need to audit

Here’s the anatomy of the kill chain: An attacker submits a SKILL.md file to an open-source project containing setup instructions, code examples, and configuration templates. It looks like standard documentation. A code reviewer would wave it through because none of it is executable. But the code examples contain embedded instructions that an agent will parse as operational directives.
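As a rough illustration of why "documentation" deserves the same scrutiny as code, the sketch below pulls fenced code examples out of a skill file's markdown and flags a few red-flag shell patterns. Everything here is an assumption made for the example: the pattern list is deliberately tiny, `flag_code_examples` is a hypothetical helper, and simple pattern matching is not how the scanners named in this article are described as working. It would miss instructions written in natural language entirely.

```python
import re

# Build the fence marker programmatically so this example can itself be fenced.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)

# Hypothetical, deliberately small set of red-flag patterns.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "base64-decode": re.compile(r"base64\s+(-d|--decode)\b"),
    "token-access": re.compile(r"\b(GITHUB_TOKEN|AWS_SECRET_ACCESS_KEY)\b"),
}


def flag_code_examples(markdown: str) -> list[tuple[int, str]]:
    """Return (code_block_index, pattern_label) for each suspicious match."""
    findings = []
    for index, match in enumerate(FENCE_RE.finditer(markdown)):
        body = match.group(1)
        for label, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(body):
                findings.append((index, label))
    return findings
```

A reviewer could run a check like this in CI on any pull request that touches a skill file, treating a hit as a reason for human review rather than as proof of malice.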

A developer uses an agent bridge tool to connect their coding agent to the repository. The agent ingests the skill definition and trusts it, because no verification layer exists to distinguish benign from malicious intent at the instruction level.

The agent executes the embedded instruction using its own legitimate credentials. Endpoint detection and response (EDR) sees an approved API call from an authorized process and passes it. Data exfiltration, configuration changes, and credential harvesting are all moving through channels that the monitoring stack considers normal traffic.

Rees identified the structural flaw that makes this chain lethal. “A significant vulnerability in enterprise AI is broken access control, where the flat authorization plane of an LLM fails to respect user permissions,” he told VentureBeat. A compromised skill definition riding that flat authorization plane does not need to escalate privileges. It already has them. Every link in that chain is invisible to the current security stack.

Pillar Security demonstrated a variant of this chain against Cursor in January 2026 (CVE-2026-22708). Implicitly trusted shell built-in commands could be poisoned through indirect prompt injection, converting benign developer commands into arbitrary code execution vectors. Users saw only the final command. The poisoning happened through other commands the IDE never surfaced for approval.

The evidence is already in production

In a documented attack chain from April 2026, a crafted GitHub issue title triggered an AI triage bot wired into Cline. The bot exfiltrated a GITHUB_TOKEN, which the attacker used to publish a compromised npm dependency that installed a second agent on roughly 4,000 developer machines for eight hours. There was just one issue title. Attackers had eight hours of access. No human approved the action.

Snyk’s ToxicSkills audit scanned 3,984 agent skills from ClawHub, the public marketplace for the OpenClaw agent framework, and skills.sh in February 2026. The results: 13.4% of all skills contained at least one critical security issue. Daily skill submissions jumped from fewer than 50 in mid-January to more than 500 by early February. The barrier to publishing was a SKILL.md markdown file and a GitHub account one week old. No code signing. No security review. No sandbox.

OpenClaw is not an outlier. It is the pattern. “The bar to entry is extremely low,” Baer said. Adding a new skill can be as simple as uploading a Word document or a lightweight configuration file, which carries a very different level of risk from shipping compiled code. Projects like ClawPatrol have begun to catalog and scan for potentially malicious skills, a sign that the ecosystem is evolving faster than the defenses enterprises typically deploy.

The ClawHavoc campaign, first uncovered by Koi Security in late January 2026, exposed 341 malicious skills on ClawHub. Further analysis by Antiy CERT put the total at 1,184 compromised packages across the platform. The campaign distributed Atomic Stealer (AMOS) through skill definitions accompanied by professional documentation. Skills such as solana-wallet-tracker and polymarket-trader were named to match what developers were actively searching for.

The MCP layer shows a similar level of exposure. OX Security reported in April that researchers compromised nine of 11 MCP marketplaces using proof-of-concept servers. Trend Micro found that the number of exposed MCP servers with no authentication had risen from 492 to 1,467 by April. As reported by The Register, the underlying issue lies in the transport mechanism of Anthropic’s MCP software development kit (SDK), a weakness inherited by any developer using the official SDK.

In response to these threats, VentureBeat built a Prescriptive Matrix that maps three attack layers against the detection capabilities of current SAST, SCA, and agent-layer tools. For each layer, it lists the specific threats, the existing detection mechanisms, the gaps in coverage, and the recommended actions.

The third layer, agent integration, has been the hardest to address. Before April 2026, no dedicated tools inspected the semantic meaning of agent instruction files, which left the gap uncovered. That prompted the development of tools such as Cisco’s open-source Skill Scanner and Snyk’s mcp-scan, which are built specifically for this layer.

Security leaders can take five actions now. First, inventory every agent bridge tool in use so the exposure can be assessed accurately. Second, audit agent skill sources the same way package registries are audited. Third, deploy agent-layer scanning tools such as Cisco’s Skill Scanner and Snyk’s mcp-scan, which provide behavioral analysis of agent instruction files. Fourth, restrict agent execution privileges and monitor runtime activity. Fifth, assign an owner for the gap between layers so that no layer goes uncovered.
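One way to picture restricting agent execution privileges is a gate that checks each command an agent proposes against an explicit allowlist before anything runs. The Python sketch below is a simplified illustration under stated assumptions: the allowlist contents are hypothetical, `is_allowed` is not an API of any agent framework, and a production control would also need sandboxing and argument-level policy, since even allowed programs can accept dangerous flags.

```python
import shlex

# Hypothetical allowlist: program name -> permitted first argument (subcommand).
# None means any arguments are accepted for that program.
ALLOWED_COMMANDS = {
    "git": {"status", "diff", "log"},
    "ls": None,
}

# Characters that chain, substitute, or redirect commands in a shell.
# The gate rejects them outright instead of trying to parse them safely.
SHELL_METACHARACTERS = set(";|&$`><\n")


def is_allowed(command: str) -> bool:
    """Return True only if the command matches the allowlist exactly."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar malformed input are refused.
        return False
    if not tokens:
        return False
    program, args = tokens[0], tokens[1:]
    if program not in ALLOWED_COMMANDS:
        return False
    allowed_subcommands = ALLOWED_COMMANDS[program]
    if allowed_subcommands is None:
        return True
    return bool(args) and args[0] in allowed_subcommands
```

The design choice is default deny: anything not explicitly listed is refused, so a poisoned instruction has to fit inside the allowlist to run at all. Logging each refused command would also feed the runtime monitoring the same recommendation calls for.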

The emergence of the agent integration layer as a new attack vector underscores the need for organizations to adapt and enhance their security measures rapidly. With the pace of technological advancements accelerating, security teams must stay ahead of the curve to protect against emerging threats effectively. By implementing proactive security measures and leveraging specialized tools, organizations can mitigate risks and safeguard their systems against potential cyber threats.