Microsoft: AI-Powered Security System MDASH Tops Industry Benchmark

By Amit Chowdhry ● Yesterday at 11:12 PM

Microsoft announced a major advancement in AI-powered cybersecurity with the unveiling of its new multi-model agentic security system, codenamed MDASH, which helped researchers identify 16 previously undisclosed vulnerabilities across the Windows networking and authentication stack. The findings included four Critical remote code execution flaws affecting components such as the Windows kernel TCP/IP stack and the IKEv2 service.

Built by Microsoft’s Autonomous Code Security (ACS) team, MDASH orchestrates more than 100 specialized AI agents across multiple frontier and distilled AI models to autonomously discover, debate, validate, and prove exploitable vulnerabilities. Microsoft said the system represents a shift from experimental AI vulnerability research toward production-grade defensive security engineering at enterprise scale.

The company reported strong benchmark results for MDASH, including identifying all 21 intentionally planted vulnerabilities in a private Windows driver test environment with zero false positives. Microsoft also said the system achieved 96% recall across five years of confirmed Microsoft Security Response Center (MSRC) vulnerabilities in clfs.sys and 100% recall in tcpip.sys. On the public CyberGym benchmark of 1,507 real-world vulnerability reproduction tasks, MDASH achieved an industry-leading 88.45% success rate, roughly five percentage points ahead of the next-highest result.
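To make the reported figures concrete, here is a minimal sketch of the arithmetic behind them. It assumes the CyberGym success rate is simply tasks solved divided by total tasks; the article does not give the solved count, so the value computed below is an inference, not a number Microsoft reported.

```python
def recall(found: int, total: int) -> float:
    """Share of known vulnerabilities that were found."""
    return found / total


def precision(true_positives: int, false_positives: int) -> float:
    """Share of reported findings that were real."""
    return true_positives / (true_positives + false_positives)


# Private driver test: 21 of 21 planted bugs found, zero false positives.
planted_recall = recall(21, 21)
planted_precision = precision(21, 0)

# CyberGym: 88.45% of 1,507 tasks. The solved count is inferred here,
# assuming the rate is solved / total (the article does not state it).
cybergym_total = 1507
cybergym_solved = round(0.8845 * cybergym_total)
cybergym_rate = round(100 * cybergym_solved / cybergym_total, 2)
```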

Microsoft emphasized that MDASH’s advantage comes not from a single large language model, but from the orchestration layer around the models. The system combines specialized auditor, debater, prover, and validation agents into a structured pipeline designed to scale vulnerability research and remediation workflows.

The system’s pipeline runs in five stages:

  1. Prepare stage, where the system ingests source code, builds language-aware indexes, and maps attack surfaces and threat models.
  2. Scan stage, where specialized auditing agents identify candidate vulnerabilities.
  3. Validate stage, where separate debate-oriented agents challenge and verify findings.
  4. Dedup stage, where semantically similar findings are consolidated.
  5. Prove stage, where exploit-triggering inputs are generated and executed to validate exploitability.
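The staged flow described above can be sketched as a chain of plain functions. This is an illustrative toy, not Microsoft's implementation: every name here (`Finding`, `run_pipeline`, the auditor, debater, and prover callables) is hypothetical, the auditor is a trivial text check standing in for an AI agent, and exact-match deduplication stands in for semantic similarity.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Finding:
    path: str
    kind: str
    proven: bool = False


Index = dict[str, list[str]]
Auditor = Callable[[Index], list[Finding]]
Debater = Callable[[Index, Finding], bool]
Prover = Callable[[Index, Finding], bool]


def prepare(source: dict[str, str]) -> Index:
    # Toy stand-in for language-aware indexing: split each file into lines.
    return {path: text.splitlines() for path, text in source.items()}


def scan(index: Index, auditors: list[Auditor]) -> list[Finding]:
    # Every auditor contributes its own candidate findings.
    return [f for auditor in auditors for f in auditor(index)]


def validate(index: Index, findings: list[Finding],
             debaters: list[Debater]) -> list[Finding]:
    # Keep a finding only if a strict majority of debaters accept it.
    kept = []
    for finding in findings:
        votes = sum(1 for debater in debaters if debater(index, finding))
        if votes * 2 > len(debaters):
            kept.append(finding)
    return kept


def dedup(findings: list[Finding]) -> list[Finding]:
    # Exact-match dedup (order-preserving); real systems compare semantics.
    return list(dict.fromkeys(findings))


def prove(index: Index, findings: list[Finding], prover: Prover) -> list[Finding]:
    # Record whether the prover could demonstrate each finding.
    return [replace(f, proven=prover(index, f)) for f in findings]


def run_pipeline(source, auditors, debaters, prover) -> list[Finding]:
    index = prepare(source)
    candidates = scan(index, auditors)
    validated = validate(index, candidates, debaters)
    return prove(index, dedup(validated), prover)


def repeated_free_auditor(index: Index) -> list[Finding]:
    # Hypothetical auditor: flag files where the same free(...) line repeats.
    out = []
    for path, lines in index.items():
        frees = [l.strip() for l in lines if l.strip().startswith("free(")]
        if len(frees) != len(set(frees)):
            out.append(Finding(path, "double-free"))
    return out


def accept_all(index: Index, finding: Finding) -> bool:
    return True  # stub debater


def stub_prover(index: Index, finding: Finding) -> bool:
    return True  # stub: a real prover would build and run a triggering input


toy_source = {"a.c": "free(p);\nfree(p);\n", "b.c": "free(q);\n"}
results = run_pipeline(
    toy_source,
    auditors=[repeated_free_auditor, repeated_free_auditor],
    debaters=[accept_all],
    prover=stub_prover,
)
```

In this toy run, both auditors flag the same file, the Dedup step collapses the two candidates into one, and the stub prover marks it as proven.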

Microsoft said MDASH was specifically designed for complex proprietary environments such as Windows, Hyper-V, Azure, and associated drivers and services, where reasoning about kernel conventions, trust boundaries, and concurrency models requires deeper contextual analysis than pattern matching alone.

The company disclosed that the May 2026 Patch Tuesday release included 16 CVEs discovered using MDASH, including vulnerabilities in tcpip.sys, ikeext.dll, http.sys, dnsapi.dll, netlogon.dll, and telnet.exe. Several of the flaws were remotely exploitable without authentication.

Among the highlighted vulnerabilities was CVE-2026-33827, a remote unauthenticated use-after-free flaw in tcpip.sys involving Strict Source and Record Route (SSRR) packet handling. Microsoft said the flaw required sophisticated reasoning across concurrency conditions, object lifetime management, and multi-threaded race conditions that single-model systems failed to detect.

Another highlighted issue, CVE-2026-33824, involved a double-free vulnerability in the IKEEXT service triggered through crafted IKEv2 fragmentation packets. The flaw could allow pre-authentication remote code execution with LocalSystem privileges on systems configured as IKEv2 responders.

Microsoft noted that the ACS team includes members of Team Atlanta, the group that won the $29.5 million DARPA AI Cyber Challenge by building autonomous systems capable of identifying and patching vulnerabilities in open-source software projects.

The company said MDASH’s plugin architecture allows security researchers and domain experts to inject environment-specific context, such as filesystem invariants, kernel calling conventions, or CodeQL analysis data, enabling more accurate reasoning and proof generation for specialized systems.
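One way to picture context injection of this kind is a list of small plugin functions whose output is merged into the context handed to the agents. The sketch below is purely illustrative: the article does not describe MDASH's plugin interface, so `ContextPlugin`, `build_context`, the component name, and the placeholder strings are all hypothetical.

```python
from typing import Callable

# Hypothetical plugin shape: given a component name, return extra facts.
ContextPlugin = Callable[[str], dict[str, str]]


def filesystem_invariants(component: str) -> dict[str, str]:
    # Placeholder text only; a real plugin would carry expert-written invariants.
    if component == "example_fs":
        return {"invariant": "placeholder filesystem invariant"}
    return {}


def calling_conventions(component: str) -> dict[str, str]:
    # Placeholder text only; applies to every component in this toy.
    return {"convention": "placeholder kernel calling convention"}


def build_context(component: str, plugins: list[ContextPlugin]) -> dict[str, str]:
    # Merge plugin output in order; later plugins override earlier keys.
    merged: dict[str, str] = {}
    for plugin in plugins:
        merged.update(plugin(component))
    return merged


context = build_context("example_fs", [filesystem_invariants, calling_conventions])
```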

Microsoft also stressed that the framework was intentionally designed to remain model-agnostic. As new AI models emerge, the company said, organizations can swap them into the pipeline without rebuilding the broader validation, proof, and orchestration infrastructure.
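A common way to get this kind of model independence is to have agents depend on a narrow interface rather than a specific model. The following is a minimal sketch under that assumption; `Model`, `AuditAgent`, and the two toy model classes are hypothetical names and do not reflect MDASH's actual design.

```python
from typing import Protocol


class Model(Protocol):
    """Anything with a complete() method can serve as a backend."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    # Toy backend: returns the prompt with a prefix.
    def complete(self, prompt: str) -> str:
        return f"echo:{prompt}"


class UpperModel:
    # Second toy backend with different behavior.
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class AuditAgent:
    """Agent logic stays the same regardless of which model is plugged in."""

    def __init__(self, model: Model) -> None:
        self.model = model

    def review(self, snippet: str) -> str:
        return self.model.complete(f"Review: {snippet}")


first = AuditAgent(EchoModel()).review("x")
second = AuditAgent(UpperModel()).review("x")
```

Swapping `EchoModel` for `UpperModel` changes the output but requires no change to `AuditAgent`, which is the property the paragraph above describes.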

The system is currently being used internally by Microsoft security engineering teams and is being evaluated by a limited group of customers in a private preview program.

In discussing the broader implications of the technology, Microsoft said AI-powered vulnerability discovery has evolved into a practical engineering discipline rather than a speculative research exercise. The company argued that the future competitive advantage in AI security will depend less on individual models and more on the surrounding orchestration, validation, and proof systems that convert raw AI outputs into actionable, production-grade security findings.
