Connect with us

Microsoft

AI Advancements: Microsoft’s Multi-Agent System Dominates Cybersecurity Benchmark

Published

on

Microsoft's multi-agent AI system tops Anthropic's Mythos on cybersecurity benchmark – GeekWire

Microsoft’s MDASH AI System Surpasses Rival on Cybersecurity Benchmark


CyberGym benchmark scores over time, showing the rapid improvement in AI vulnerability discovery capabilities. Microsoft’s multi-model MDASH system (top right) tops the leaderboard at 88.4%. (CyberGym / UC Berkeley)

Mythos has been MDASH’d.

Microsoft has introduced a new AI-powered system called MDASH that has outperformed a rival system from Anthropic on a major cybersecurity benchmark. MDASH utilizes over 100 specialized AI agents working across multiple models to detect real-world software vulnerabilities.

The MDASH system by Microsoft uncovered 16 new vulnerabilities in various Windows versions, including four critical remote code execution flaws fixed in the latest Patch Tuesday release. This move comes amid ongoing security concerns faced by the tech giant.

MDASH, short for “multi-model agentic scanning harness,” employs a unique approach by running specialized AI agents through a staged pipeline. These agents scan code for vulnerabilities, analyze findings, and create proof-of-concept attacks to confirm the existence of bugs.

In contrast, Anthropic’s Mythos, a single AI model within an agent framework, raised doubts over its ability to find and exploit software vulnerabilities. Anthropic released Mythos to a limited group of companies through Project Glasswing.

MDASH achieved an impressive score of 88.45% on the CyberGym benchmark, developed by UC Berkeley researchers. The benchmark assesses how well AI systems can replicate real-world vulnerabilities across tasks drawn from various open-source software projects.

Microsoft’s MDASH surpassed competitors like Anthropic’s Mythos and OpenAI’s GPT-5.5 on the benchmark leaderboard, showcasing the system’s efficiency in vulnerability discovery.

While the benchmark results are self-reported by the companies, the concern over AI’s potential misuse as a hacking tool is growing. Microsoft plans to use MDASH internally and with select customers to enhance security measures.

As AI accelerates vulnerability discovery, Microsoft warns of larger Patch Tuesdays in the future. The company emphasizes the importance of proactive security measures in the face of evolving cyber threats.

See also  Microsoft's Redmond Lease Renewal Boosts Eastside Office Market Amid Seattle Growth

Trending