Connect with us

Tech News

Unbreakable AI Defenses: 7 Questions to Challenge Vendors

Published

on

Researchers broke every AI defense they tested. Here are 7 questions to ask vendors.

Are AI Defenses Failing to Protect Against Real Attackers?

Recent research conducted by experts from OpenAI, Anthropic, and Google DeepMind has revealed alarming findings regarding the effectiveness of AI-based security defenses. Published in October 2025, their study, titled “The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections,” tested 12 AI defenses currently available on the market, with most claiming to have near-zero attack success rates. However, the results were shocking, as the research team managed to bypass these defenses with success rates exceeding 90%. This discovery has significant implications for enterprises, highlighting that most AI security products are not adequately equipped to handle real-world attack scenarios.

The Ineffectiveness of AI Defenses Under Adaptive Attack Conditions

The research team evaluated various types of AI defenses, including prompting-based, training-based, and filtering-based methods, under adaptive attack conditions. Unfortunately, all of these defenses proved to be ineffective. Prompting defenses exhibited attack success rates ranging from 95% to 99%, while training-based methods had bypass rates of 96% to 100%. The rigorous testing methodology devised by the researchers involved 14 authors and a $20,000 prize pool for successful attacks, emphasizing the severity of the issue.

Challenges Faced by Web Application Firewalls (WAFs)

One of the key reasons for the failure of traditional security controls against modern attack techniques lies in the stateless nature of Web Application Firewalls (WAFs). Unlike AI attacks, which are dynamic and adaptive, WAFs are incapable of effectively countering these sophisticated threats. The researchers conducted experiments using known jailbreak techniques such as Crescendo and Greedy Coordinate Gradient (GCG), which exploit conversational context and automate the generation of malicious requests, respectively. These attacks were successful due to the inability of stateless filters to detect them.

See also  The Final Chapter: Uncovering the Shocking Truth of Derry

The research team highlighted that AI attacks operate at a semantic layer, making them particularly challenging to detect using traditional signature-based methods. As Carter Rees, VP of AI at Reputation, pointed out, seemingly harmless phrases or encoded payloads can have devastating effects on AI applications, underscoring the critical need for more advanced defense mechanisms.

The Growing Disparity Between AI Deployment and Security Measures

Compounding the issue is the rapid pace at which AI technologies are being integrated into enterprise applications. According to a prediction by Gartner, 40% of enterprise applications will feature AI agents by the end of 2026, a significant increase from less than 5% in 2025. This exponential growth in AI deployment far outpaces the advancements in security measures, leaving organizations vulnerable to sophisticated attacks.

Adam Meyers, SVP of Counter Adversary Operations at CrowdStrike, highlighted the increasing speed at which threat actors operate, with some achieving breakout times as fast as 51 seconds. The Crowdstrike 2025 Global Threat Report revealed that a majority of detections were malware-free, indicating that adversaries are utilizing advanced techniques that evade traditional endpoint defenses.

In a notable incident in September 2025, Anthropic thwarted the first documented AI-orchestrated cyber operation, where attackers executed thousands of requests with minimal human involvement, compressing multi-month campaigns into mere hours. Organizations that fell victim to AI-related breaches lacked adequate access controls, as reported in the IBM 2025 Cost of a Data Breach Report.

Identifying Four Attacker Profiles Exploiting AI Defense Gaps

The research findings shed light on four distinct attacker profiles that are currently exploiting vulnerabilities in AI defenses. These attackers have adapted their strategies to circumvent existing security mechanisms, taking advantage of weaknesses in the inference layer.

See also  Get Fit in 2026 with Apple's Ring Workout Challenge

External adversaries leverage published attack research, such as Crescendo, GCG, and ArtPrompt, tailoring their approaches to exploit specific defense designs. Malicious B2B clients exploit legitimate API access to extract sensitive information, while compromised API consumers use trusted credentials to manipulate responses and exfiltrate data. Negligent insiders pose a significant threat, with shadow AI practices contributing to increased breach costs, as highlighted in the IBM report.

The Importance of Stateful Analysis in AI Security

The research underscores the critical need for architectural adjustments to enhance AI security measures. Normalization before semantic analysis, context tracking across conversation turns, and bi-directional filtering are identified as essential requirements to combat evolving attack techniques.

Jamie Norton, CISO at the Australian Securities and Investments Commission, emphasized the importance of striking a balance between innovation and security, highlighting the need for robust governance in AI deployments to prevent data leakage.

Key Questions to Ask AI Security Vendors

Given the shortcomings of current AI defenses, security leaders must approach procurement conversations with caution and thorough scrutiny. The research suggests asking vendors specific questions to assess the effectiveness of their solutions against adaptive attackers and multi-turn attacks, among other critical considerations.

  1. What is your bypass rate against adaptive attackers, and do you have an adaptive testing methodology in place?
  2. How does your solution detect multi-turn attacks, such as those executed by Crescendo?
  3. How do you handle encoded payloads to prevent malicious instructions from evading detection?
  4. Does your solution filter outputs as effectively as inputs to prevent data exfiltration?
  5. How do you track context across conversation turns to identify complex attack patterns?
  6. How do you test your defenses against attackers who understand your protection mechanisms?
  7. What is your mean time to update defenses against emerging attack patterns?

Conclusion

The research conducted by OpenAI, Anthropic, and Google DeepMind serves as a wake-up call for enterprises relying on AI defenses to safeguard their deployments. The vulnerabilities exposed in the study highlight the urgent need for organizations to reassess their security strategies and adopt more robust defense mechanisms. As the deployment of AI technologies continues to accelerate, the gap between innovation and security must be bridged to prevent catastrophic breaches.

See also  Closing the Gap: Google's Rapid Rise to Challenge Samsung

Trending