Tech News

Browser Agent Hijacking: Understanding the 31.5% Vulnerability Rate

Published

2 months ago

June 2, 2026

Anthropic’s browser agent got hijacked 31.5% of the time before safeguards engaged

Frontier labs have recently published their prompt injection figures, with Anthropic leading the pack. When a red-teamer was directed to their newest model, it was hijacked 31.5% of the time before safeguards kicked in. This figure stands out as a potential liability, but in reality, it is a crucial piece of information for comparison.

Comparing prompt injection disclosures from four different labs reveals significant discrepancies. Anthropic disclosed detailed information in a 244-page document, while OpenAI reported on only one surface – connectors. Google and Meta, on the other hand, did not provide specific numbers for comparison, making it challenging for security leaders to assess the risks.

Prompt injection involves hiding malicious instructions within something an agent reads, such as a web page or a document. This method can lead to unauthorized actions or data breaches. The lack of industry standards for measuring prompt injection poses a significant challenge, as each lab uses its own metrics, making comparisons difficult.

According to experts like Carter Rees from Reputation and Adam Meyers from CrowdStrike, prompt injection poses a serious threat that organizations must manage. As AI implementation increases the attack surface, protecting AI models against misuse becomes crucial to prevent data breaches and other malicious activities.

Anthropic’s Opus 4.8 card stands out for breaking down prompt injection figures by surface. The results vary significantly depending on the environment, highlighting the importance of assessing risks across different scenarios.

In contrast, OpenAI’s GPT-5.5 card focuses on one surface and known attacks, providing a robustness score rather than detailed attack success rates. Google and Meta take different approaches to prompt injection defense, with Google emphasizing resistance without specific numbers, and Meta using a separate stack for protection.

The Cross-Vendor Prompt Injection Disclosure Grid highlights the differences in prompt injection testing among the four labs. Each lab tested different surfaces and provided varying levels of detail, making it challenging for security teams to make informed decisions.

To address these challenges, security teams should evaluate each agent based on the surface it interacts with, demand detailed attack success rates from vendors, confirm integration specifics, and conduct their own injection tests before deployment. While there is no industry standard for prompt injection testing yet, organizations can use these strategies to mitigate risks effectively.

In conclusion, prompt injection testing is a critical aspect of AI security that requires careful evaluation and monitoring. By understanding the nuances of prompt injection disclosures and taking proactive measures to assess and mitigate risks, organizations can protect their AI models from potential threats.