Project Glasswing: When AI Can Autonomously Find Every Zero-Day, Where Does Security Go?
> 📌 TL;DR
> Anthropic's Claude Mythos Preview autonomously discovered thousands of zero-day vulnerabilities across every major OS and browser—without being specifically trained for security. This isn't just a model upgrade; it's a fundamental shift in the attack-defense landscape. Project Glasswing aims to keep this power in defenders' hands, but the White House's intervention and a sub-1% patch rate reveal a harsh truth: we may never patch fast enough to keep up with AI.
What Happened?
On April 7, 2026, Anthropic made an announcement that shifted the cybersecurity landscape. Its new model, Claude Mythos Preview, had demonstrated extraordinary vulnerability-discovery capabilities in internal testing: thousands of zero-day vulnerabilities across every major operating system and every major browser.
The most striking part is not the quantity but the method:
- A 27-year-old OpenBSD bug was discovered and patched (OpenBSD is known primarily for its security)
- A 17-year-old FreeBSD remote code execution vulnerability (CVE-2026-4747) was discovered and exploited fully autonomously, with no human involvement from discovery to working exploit
- In one browser attack, the model chained four vulnerabilities together, escaping both the renderer sandbox and the OS sandbox
Anthropic emphasized: "We did not explicitly train Mythos to have these capabilities. They emerged as a downstream consequence of general improvements in code, reasoning, and autonomy."
The Numbers: How Capable Is It?
| Metric | Claude Mythos Preview | Claude Opus 4.6 | GPT-5.5 |
|--------|----------------------|-----------------|---------|
| Cybersecurity benchmark score | 83.1% | 66.6% | 71.4% |
| Expert-level CTF success rate | 73% | Not reported | Not reported |
| Exploit generation success (relative to Opus 4.6) | ~100x | 1x (baseline) | Not reported |
| "The Last Ones" 32-step simulation | Completed 3 of 10 runs; averaged 22 of 32 steps | Averaged 16 of 32 steps | Not reported |
"The Last Ones," a test from the UK AI Security Institute (AISI), is a 32-step simulated attack on a corporate network that takes a human expert roughly 20 hours to complete. Mythos is the first model to finish it end to end.
Project Glasswing: The Defense Coalition
Faced with this double-edged capability, Anthropic chose not to release the model publicly. Instead, it formed Project Glasswing, a defensive coalition of major technology companies and other organizations:
Core members: AWS, Apple, Microsoft, Google, CrowdStrike, Palo Alto Networks, Cisco, Broadcom, JPMorgan Chase, Linux Foundation, plus roughly 40 other organizations.
Core mission: Use Mythos's offensive capabilities to find and fix vulnerabilities—before bad actors gain equivalent power.
Anthropic is committing up to $100 million in Mythos usage credits, plus $4 million in direct donations to open-source security organizations.
Pricing (for coalition members)
- Input: $25 / million tokens
- Output: $125 / million tokens
- Available via Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry
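To make the per-token prices concrete, here is a minimal cost-estimation sketch. Only the two rates come from the pricing above; the function name `estimate_cost_usd` and the example job size are illustrative assumptions, and real invoices may differ by platform.

```python
from decimal import Decimal

# Coalition prices quoted above, in US dollars per million tokens.
INPUT_USD_PER_MTOK = Decimal("25")
OUTPUT_USD_PER_MTOK = Decimal("125")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated charge for one job at the quoted rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / MILLION * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / MILLION * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical audit job: 2M tokens of source code in, 200k tokens of findings out.
example_cost = estimate_cost_usd(2_000_000, 200_000)
print(f"${example_cost:.2f}")  # prints $75.00
```

Output tokens cost five times as much as input tokens, so in this example a report one tenth the size of the code it covers still accounts for a third of the bill.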
Why the White House Hit the Brakes
This should have been a straightforward "AI saves the world" narrative. Reality is messier.
Anthropic planned to expand Mythos access to 70 additional organizations (totaling ~120). The White House blocked it for two reasons:
1. National security: more organizations mean a higher risk of leaks (unauthorized users obtained access on launch day through a private forum)
2. Compute constraints: Anthropic lacks sufficient compute to serve 120 entities without degrading the government's own access
Deeper context: the Pentagon had previously labeled Anthropic a supply chain risk after the company refused to relax its restrictions on using its products for domestic surveillance and fully autonomous weapons. Signs of a thaw emerged in early May, however: Trump said in a CNBC interview that Anthropic was "shaping up," and the White House is drafting guidance that would allow federal agencies to use Anthropic technology.
Europe's Anxiety
While the White House tries to control Mythos's spread, Europe is anxious about being locked out entirely.
Euro-area finance ministers met in Brussels and demanded access to Mythos. The current reality: no European government or bank has access to the model. European critical infrastructure therefore contains vulnerabilities that Mythos could find, but Europe cannot use it to find and fix them.
What This Means for Enterprises
Bad News: A Vulnerability Flood Is Coming
Vulnerabilities discovered by Mythos will flow downstream through the CVE system. When a new critical vulnerability is reported in the Linux kernel, OpenSSL, or any other widely used open-source library, every organization running that software has a new critical patch to deploy.
Current reality: fewer than 1% of the vulnerabilities Mythos has discovered have been patched. Federal network patching cycles are routinely measured in weeks or months.
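One way to see where your own patch cycle stands is to measure the time from disclosure to deployment for each critical advisory. The sketch below is a minimal illustration, not a standard tool: `PatchRecord`, `sla_breaches`, and the `ADV-…` IDs are hypothetical, and the 72-hour default is borrowed from the "Immediate" action item in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class PatchRecord:
    """One critical advisory and when (if ever) its fix reached production."""
    advisory_id: str                 # hypothetical internal ID, not a real CVE
    disclosed_at: datetime
    deployed_at: Optional[datetime]  # None means the fix is not deployed yet


def sla_breaches(records: List[PatchRecord], now: datetime,
                 target: timedelta = timedelta(hours=72)) -> List[str]:
    """Return IDs of advisories whose fix took, or is taking, longer than target."""
    breaches = []
    for record in records:
        # An undeployed fix keeps aging until `now`.
        end = record.deployed_at if record.deployed_at is not None else now
        if end - record.disclosed_at > target:
            breaches.append(record.advisory_id)
    return breaches


# Made-up sample data for illustration only.
now = datetime(2026, 5, 10, 12, 0)
records = [
    PatchRecord("ADV-001", datetime(2026, 5, 1, 9, 0), datetime(2026, 5, 3, 9, 0)),  # 48 h
    PatchRecord("ADV-002", datetime(2026, 5, 1, 9, 0), datetime(2026, 5, 6, 9, 0)),  # 120 h
    PatchRecord("ADV-003", datetime(2026, 5, 9, 12, 0), None),                       # open 24 h
    PatchRecord("ADV-004", datetime(2026, 5, 2, 9, 0), None),                        # open 8 days
]
late = sla_breaches(records, now)
print(late)  # ['ADV-002', 'ADV-004']
```

Counting still-open advisories against the clock matters here: a fix that has not shipped is the case the sub-1% figure describes.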
Good News: Defense Is Upgrading Too
- Engineers without security backgrounds generated working exploits overnight using Mythos—meaning your security team can use the same tools to find and fix issues proactively
- Automated vulnerability discovery is shifting from "expensive expert manual work" to "scalable AI workflow"
Action Items
| Priority | Action |
|----------|--------|
| Immediate | Audit your patch pipeline—can you deploy critical patches within 72 hours? |
| Short-term | Evaluate AI-assisted security tools beyond Mythos (GPT-5.5, for example, scores 71.4% on the same cybersecurity benchmark) |
| Medium-term | Establish AI agent governance—when you deploy AI that can autonomously analyze code, it needs permission boundaries and audit logs |
| Long-term | Accept that "zero vulnerabilities" is a fantasy; shift toward resilience architectures that assume breach |
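The "permission boundaries and audit logs" item in the table above can be sketched as a thin wrapper that every agent action must pass through. This is a hedged illustration under stated assumptions, not any vendor's API: `GovernedAgentSession`, `PermissionDenied`, and the action names are hypothetical, and the code needs Python 3.9+ for `is_relative_to`.

```python
import json
from datetime import datetime, timezone
from pathlib import PurePosixPath


class PermissionDenied(Exception):
    """Raised when the agent asks for something outside its boundary."""


class GovernedAgentSession:
    """Hypothetical gatekeeper: checks each request, then records the decision."""

    def __init__(self, agent_id, allowed_actions, allowed_roots):
        self.agent_id = agent_id
        self.allowed_actions = frozenset(allowed_actions)
        self.allowed_roots = [PurePosixPath(root) for root in allowed_roots]
        # In practice this would be append-only storage the agent cannot modify.
        self.audit_log = []

    def _path_allowed(self, path):
        candidate = PurePosixPath(path)
        # Reject ".." outright so a path cannot climb out of an allowed root.
        if ".." in candidate.parts:
            return False
        return any(candidate.is_relative_to(root) for root in self.allowed_roots)

    def request(self, action, path):
        allowed = action in self.allowed_actions and self._path_allowed(path)
        # Log every attempt, including denied ones, before acting on it.
        self.audit_log.append(json.dumps({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": self.agent_id,
            "action": action,
            "path": path,
            "allowed": allowed,
        }))
        if not allowed:
            raise PermissionDenied(f"{self.agent_id} may not {action} {path}")
        return True


session = GovernedAgentSession("scanner-1", {"read"}, ["/repo"])
session.request("read", "/repo/src/main.c")      # inside the boundary
try:
    session.request("write", "/repo/src/main.c")  # action not granted
except PermissionDenied as err:
    print("denied:", err)
```

Denied requests are logged before the exception is raised, so the audit trail shows what the agent attempted as well as what it was allowed to do.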
The Six-Month Window
Anthropic itself estimates that similar capabilities will emerge from other labs within 6 to 18 months, so the window could be as short as six months. OpenAI is reportedly developing a model with comparable offensive capabilities.
This means the window for defenders to prepare is closing fast. Project Glasswing is fundamentally a race—patch as much critical infrastructure as possible before attackers gain equivalent power.
But the sub-1% patch rate tells us: defenders might not win this race.
The Real Question
Project Glasswing exposes a deeper paradox:
> The thing that can break everything is also the thing that can fix everything.
When the same model can both find vulnerabilities and write exploits, how do we ensure that only the "good guys" use it? Anthropic chose a "restricted access + coalition" model, but the unauthorized access on launch day suggests that controlling access is harder than controlling the model itself.
The future of cybersecurity is no longer "human hackers vs. human defenders"—it's AI attack speed vs. AI patch speed. And the rules of that race are still being written.
> ✨ Key Insight
> "We will never patch fast enough" is no longer a pessimist's slogan—it's a reality Anthropic itself acknowledges. Enterprise security strategy must shift from "fix all vulnerabilities" to resilience design that assumes breach and maintains operations regardless.
---
Sources: Anthropic official announcement (2026-04-07), UK AISI evaluation report, The Hacker News, Bloomberg, The Next Web. All benchmark data as of May 2026.