Uncovering SAST’s Blind Spot: Anthropic and OpenAI’s Revelations

Published

March 11, 2026

Anthropic and OpenAI just exposed SAST's structural blind spot with free tools

OpenAI launched Codex Security on March 6, entering an application security market that Anthropic’s Claude Code Security had disrupted just two weeks earlier. Both tools use LLM reasoning rather than traditional pattern matching, and both expose the limits of conventional static application security testing (SAST) tools. The enterprise security stack is left in a difficult position.

Anthropic and OpenAI have independently launched reasoning-based vulnerability scanners, uncovering bug classes that pattern-matching SAST tools were unable to detect. With the two labs holding a combined private-market valuation exceeding $1.1 trillion, competition between them is expected to improve detection quality quickly.

Neither Claude Code Security nor Codex Security is meant to replace existing security tools, but together they permanently alter the procurement landscape. Both tools are currently free for enterprise customers. The points below can inform the decision on which scanner to pilot.

How Anthropic and OpenAI arrived at similar conclusions through different approaches

Anthropic unveiled its zero-day research alongside the release of Claude Opus 4.6 on February 5, reporting more than 500 previously unknown high-severity vulnerabilities in production open-source codebases. One significant find was a heap buffer overflow in the CGIF library that coverage-guided fuzzing had missed even at 100% code coverage. Claude Code Security followed on February 20 as a limited research preview for Enterprise and Team customers. Anthropic’s stated aim with Claude Code Security was to democratize defensive capabilities.
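
Why full code coverage does not guarantee that a fuzzer finds a memory-safety bug can be shown with a toy example. The sketch below is not the CGIF defect; the function fill_row, the constant BUF_SIZE, and the inputs are hypothetical, and Python's IndexError stands in for what would be an out-of-bounds heap write in C. Three inputs execute every line and both outcomes of each if statement, yet the bug only appears when two values combine in a particular way.

```python
BUF_SIZE = 8  # hypothetical fixed buffer size


def fill_row(width: int, repeat: int) -> list[int]:
    """Toy stand-in for a decoder routine; not the actual CGIF code."""
    if width > BUF_SIZE:
        raise ValueError("width exceeds buffer")
    buf = [0] * BUF_SIZE
    count = width
    if repeat > 1:
        count = width * repeat  # bug: product is never re-checked against BUF_SIZE
    for i in range(count):
        buf[i] = 1  # IndexError in Python; an out-of-bounds write in C
    return buf


def triggers_overflow(width: int, repeat: int) -> bool:
    """Return True only if the call runs past the end of the buffer."""
    try:
        fill_row(width, repeat)
    except IndexError:
        return True
    except ValueError:
        return False
    return False


# These inputs reach every line and both outcomes of each `if`,
# so a coverage report would show 100%, but none exposes the bug.
covering_inputs = [(9, 1), (4, 1), (2, 2)]
assert not any(triggers_overflow(w, r) for w, r in covering_inputs)

# The defect needs a specific relationship between the two values.
assert triggers_overflow(5, 2)
```

Coverage measures which code ran, not which value combinations ran through it. A reviewer, human or model, who reasons about the relationship between width, repeat, and BUF_SIZE can spot the missing check without ever executing the failing input.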

OpenAI’s Codex Security, by contrast, evolved from Aardvark, an internal tool powered by GPT-5 that entered private beta in 2025. During the beta phase, Codex Security scanned over 1.2 million commits across external repositories, identifying numerous critical and high-severity findings. OpenAI reported vulnerabilities in several widely used software projects, resulting in the assignment of 14 CVEs. Notably, the false positive rate of Codex Security fell significantly during the beta period.

A study by Checkmarx Zero researchers found that Claude Code Security may overlook moderately complex vulnerabilities, potentially allowing a developer to evade detection. Neither Anthropic nor OpenAI has subjected its tool to an independent third-party audit, so reported detection rates should be treated as indicative rather than definitive.

Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, emphasized the importance of prioritizing patches based on exploitability in the runtime context rather than relying solely on CVSS scores. Baer recommended shortening the window between vulnerability discovery, triage, and patch deployment, while maintaining visibility into software bill of materials to promptly address vulnerabilities.
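
A minimal sketch of what ranking by runtime exploitability rather than by CVSS alone could look like follows. The Finding fields, the multipliers in priority, and the sample identifiers are all illustrative assumptions, not a scoring scheme described by Baer or by either vendor.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    ident: str               # hypothetical identifier, not a real CVE
    cvss: float              # base severity score, 0.0 to 10.0
    reachable: bool          # vulnerable code path is reachable at runtime
    internet_exposed: bool   # affected service accepts external traffic
    exploit_available: bool  # a working exploit is known to exist


def priority(f: Finding) -> float:
    """Adjust severity by runtime context. The weights are assumptions."""
    score = f.cvss
    if not f.reachable:
        score *= 0.2  # unreachable code is far less urgent
    if f.internet_exposed:
        score *= 1.5
    if f.exploit_available:
        score *= 1.5
    return score


findings = [
    Finding("FINDING-A", 9.8, reachable=False, internet_exposed=False, exploit_available=False),
    Finding("FINDING-B", 6.5, reachable=True, internet_exposed=True, exploit_available=True),
    Finding("FINDING-C", 7.5, reachable=True, internet_exposed=False, exploit_available=False),
]

# Highest adjusted priority first.
ranked = sorted(findings, key=priority, reverse=True)
order = [f.ident for f in ranked]
print(order)
```

With these assumed weights, the medium-severity but reachable, exposed, and exploitable FINDING-B ranks ahead of the critical-severity but unreachable FINDING-A. A ranking by CVSS alone would reverse that order.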

Despite employing distinct methodologies and scanning different codebases, Anthropic and OpenAI arrived at a common conclusion: pattern-matching SAST tools have structural limits, and LLM reasoning extends detection beyond them. The near-simultaneous arrival of these capabilities from competing labs has heightened the urgency for organizations to strengthen their security posture.

The insights provided by vendor responses

Snyk, a leading developer security platform, acknowledged the technical advances of reasoning-based scanners but highlighted the difficulty of fixing vulnerabilities at scale without causing disruptions. Citing research that AI-generated code is more likely than human-written code to introduce security vulnerabilities, Snyk underscored the importance of balancing innovation with security in the development process.

Cycode’s CTO, Ronen Slavin, recognized the significance of Claude Code Security’s static analysis capabilities but cautioned that AI models are inherently probabilistic. Slavin emphasized that security teams need consistent, auditable results and noted that security operations extend well beyond static analysis.

Baer noted that the availability of reasoning-based scanners at no cost to enterprise customers could commoditize traditional static code scanning. Baer predicted that security budgets will shift toward runtime and exploitability layers, AI governance and model security, and remediation automation that streamlines patching.

Seven essential steps before presenting to the board

Conduct comparative scans: Evaluate the findings of Claude Code Security and Codex Security against your existing SAST output on a representative codebase subset to identify blind spots.
Establish a governance framework: Treat the new tools as critical data processors and develop a governance model that addresses data processing agreements, submission pipeline segmentation, and internal classification policies.
Map coverage gaps: Identify areas not covered by reasoning-based scanners, such as software composition analysis, container scanning, infrastructure-as-code, DAST, and runtime detection and response.
Assess dual-use exposure: Recognize that vulnerabilities surfaced by reasoning models are akin to zero-day discoveries, necessitating swift remediation to mitigate the risk of exploitation by adversaries.
Prepare for board comparison: Anticipate questions regarding why existing security tools missed vulnerabilities detected by Anthropic and OpenAI, and emphasize the complementary nature of pattern-matching SAST and reasoning models.
Monitor competitive cycles: Stay informed about updates and enhancements from Anthropic and OpenAI, as well as upcoming IPOs, to align security strategies with evolving threat landscapes.
Conduct a 30-day pilot: Run both scanners against the same codebase for empirical data-driven insights to guide procurement decisions and enhance security posture.
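
The comparative scan and the 30-day pilot in the steps above both reduce to comparing sets of findings. The sketch below assumes each tool's output has already been normalized to (file path, CWE ID) pairs; that normalization, the function blind_spots, the scanner names, and the sample data are all hypothetical and do not reflect any vendor's export format.

```python
Finding = tuple[str, str]  # (file path, CWE ID), an assumed normalized form


def blind_spots(
    baseline: set[Finding], candidates: dict[str, set[Finding]]
) -> dict[str, list[Finding]]:
    """For each candidate scanner, list findings the baseline SAST tool missed."""
    return {name: sorted(found - baseline) for name, found in candidates.items()}


# Hypothetical results from scanning the same codebase subset.
existing_sast = {("api/auth.py", "CWE-89"), ("web/form.py", "CWE-79")}
scanner_a = {("api/auth.py", "CWE-89"), ("core/session.py", "CWE-639")}
scanner_b = {("core/session.py", "CWE-639"), ("jobs/export.py", "CWE-22")}

# What each reasoning-based scanner found that the existing tool did not.
gaps = blind_spots(existing_sast, {"scanner_a": scanner_a, "scanner_b": scanner_b})

# What only the existing SAST tool caught, which argues for keeping it.
only_existing = sorted(existing_sast - (scanner_a | scanner_b))

print(gaps)
print(only_existing)
```

Keeping both directions of the comparison matters for the board conversation: the first result shows what pattern matching missed, and the second shows what the reasoning-based scanners missed.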

Attackers are vigilant and vulnerabilities are ever-present, so tools like Claude Code Security and Codex Security can significantly strengthen an organization’s defenses against emerging threats. With a proactive approach to security governance, vulnerability management, and remediation, enterprises can handle a complex cybersecurity landscape with more confidence and resilience.

Remember, staying ahead of the curve in the realm of application security requires continuous vigilance, strategic planning, and a willingness to embrace innovative solutions that adapt to the evolving threat landscape.