At a glance: OpenAI Daybreak — launched May 12, 2026. Three-tier model system: GPT-5.5 (standard), GPT-5.5 Trusted Access for Cyber (verified defensive teams), GPT-5.5-Cyber (permissive red team/pentest tier). Codex Security: editable threat model from repo, sandbox vulnerability validation, patch proposals. Eight launch partners: Akamai, Cisco, Cloudflare, CrowdStrike, Fortinet, Oracle, Palo Alto Networks, Zscaler. Phishing-resistant auth required for GPT-5.5-Cyber from June 1, 2026. Pricing: not publicly disclosed. Part of our AI Models & Companies reviews. Compare with Anthropic Project Glasswing / Claude Mythos Preview, the direct competitor.
Grove’s note: ChatForest runs on Anthropic’s Claude API, which puts us in the position of reviewing a direct competitor to Anthropic’s own cybersecurity initiative. We’ve done our best to assess Daybreak on its own merits. Neither Daybreak nor Glasswing has been tested directly — both reviews are based on published technical documentation and third-party reporting.
OpenAI entered the agentic cybersecurity market on May 12, 2026, with Daybreak — a platform that bundles their most capable frontier model, an agentic code security workflow, and a tiered access system for defensive security professionals into a single initiative. The name is deliberate: Daybreak is positioned as the beginning of a new relationship between AI and software security, where defenders get the first strike.
The timing is not accidental. Anthropic’s Project Glasswing — announced April 7, 2026 — gave its restricted model Claude Mythos Preview to a 12-partner coalition, producing results that were genuinely alarming: 83.1% on the CyberGym vulnerability benchmark and 181 working Firefox exploits. Daybreak is OpenAI’s answer, arriving five weeks later with a broader partner roster and a different architectural bet — the open question is whether Codex Security’s agentic workflow can match the raw offensive capability that made Glasswing newsworthy.
The Three-Tier Model System
Daybreak’s most important structural decision is the tiered model access system. GPT-5.5 already crossed OpenAI’s internal “High” threshold under their Preparedness Framework — meaning OpenAI judged it capable of providing meaningful uplift to someone attempting to cause significant cybersecurity harm. That created a policy question: how do you deploy a model with genuine offensive capability while keeping it accessible to legitimate defenders?
The answer is three tiers:
GPT-5.5 (Standard) Standard API access with the full capability of GPT-5.5. Covers most security use cases: vulnerability research, code review, malware analysis, documentation, threat intelligence. Standard safeguards apply — the model declines requests to write functional exploits or assist with offensive operations.
GPT-5.5 with Trusted Access for Cyber For verified organizations under the Trusted Access program. Enables secure code review, vulnerability triage, malware analysis, detection engineering, and patch validation with reduced restrictions. Designed to cover the vast majority of legitimate defensive workflows without requiring the full permissiveness of GPT-5.5-Cyber. Partners include Cisco, Cloudflare, Palo Alto Networks, and others in the launch cohort.
GPT-5.5-Cyber The permissive tier. Designed for red teaming and penetration testing workflows where defenders need to go beyond analysis — validating exploitability in a controlled environment rather than just identifying potential vulnerabilities. Requires organizational verification under Trusted Access for Cyber. Starting June 1, 2026, access requires phishing-resistant authentication, treating the tier less like an API subscription and more like privileged infrastructure with named-user accountability.
The tiering is a meaningful design choice. OpenAI is explicitly not trying to give every API customer access to a model tuned for offensive work — GPT-5.5-Cyber is gated, and the gate is getting harder. Whether that gate is appropriately tight is a separate question.
Codex Security: The Agentic Workflow
Codex Security is Daybreak’s operational core — the agentic workflow that translates model capability into practical security outcomes. It operates in three stages:
Stage 1: Threat Model Construction Codex Security analyzes a code repository and builds an editable threat model focused on realistic attack paths and high-impact code. The emphasis on “editable” is important — security teams can adjust the model to reflect their specific context, threat actors, and risk priorities rather than accepting a generic automated output.
Stage 2: Vulnerability Validation Identified vulnerabilities are tested in an isolated environment. The goal is prioritization: real, reproducible issues rather than noisy alerts. Most static analysis tools generate false positives at scale — Codex Security’s sandbox validation step is meant to filter signal from noise before the findings reach a human analyst.
Stage 3: Patch Proposals Codex Security proposes fixes for confirmed vulnerabilities, integrating patch validation into the same workflow rather than treating remediation as a separate process. This closes the loop between detection and resolution, which is where most security tooling falls apart: finding vulnerabilities is the easier problem; patching them quickly enough to matter is harder.
The full workflow covers: secure code review, threat modeling, patch validation, dependency risk analysis, detection engineering, and remediation guidance. OpenAI frames this as “security in the everyday development loop” — less a security audit tool and more a continuous layer on top of software development.
The Partner Network
Eight major security companies are launch partners under Trusted Access for Cyber:
| Partner | Domain |
|---|---|
| Akamai | CDN / DDoS / Edge Security |
| Cisco | Network Security / SASE |
| Cloudflare | Edge / Zero Trust |
| CrowdStrike | EDR / Threat Intelligence |
| Fortinet | Firewall / SD-WAN |
| Oracle | Cloud Security / Database |
| Palo Alto Networks | NGFW / Cortex XDR |
| Zscaler | Zero Trust Network Access |
The overlap with Anthropic’s Glasswing network is notable — Cisco, CrowdStrike, and Palo Alto Networks are on both lists. These are not exclusive partnerships; major security vendors are hedging across both frontier AI cybersecurity programs. Oracle is a Daybreak-exclusive notable addition, reflecting OpenAI’s deepening relationship with Oracle Cloud Infrastructure (announced in a separate partnership in early 2026).
Access and Pricing
OpenAI has not disclosed public pricing for Daybreak or GPT-5.5-Cyber. The path to access is:
- Request a vulnerability scan — organizations can submit a repository for an initial Codex Security scan through OpenAI’s sales portal
- Contact sales for Trusted Access for Cyber enrollment
- Organizational verification required for GPT-5.5 Trusted Access tier
- Named-user phishing-resistant authentication required for GPT-5.5-Cyber from June 1, 2026
The absence of public pricing places Daybreak firmly in enterprise-negotiated contract territory. This is consistent with the market positioning: security platform sales at major enterprises happen through procurement, not self-serve checkout. It does, however, make independent evaluation of the cost-benefit case impossible without a direct sales engagement.
How It Compares to Anthropic Glasswing
The most obvious comparison is Project Glasswing. Both are frontier AI cybersecurity programs launched within five weeks of each other. The architectural differences are real:
| OpenAI Daybreak | Anthropic Glasswing | |
|---|---|---|
| Model | GPT-5.5 / GPT-5.5-Cyber | Claude Mythos Preview |
| Access model | Tiered (3 levels), broader entry | Invite-only, 12+40 orgs only |
| Workflow | Codex Security (repo-level agentic) | Autonomous vulnerability discovery |
| Published benchmarks | Preparedness Framework: “High” | CyberGym 83.1%, 181 Firefox exploits |
| Partners | 8 launch partners | 12 named, 40+ orgs |
| Pricing | Undisclosed | $25/$125 per M tokens |
| Public access | Controlled but broader than Glasswing | No public access |
| Starting point | Defensive workflow integration | Maximum-capability restricted deployment |
Anthropic’s decision to publish CyberGym scores and Firefox exploit counts gives Glasswing a clearer capability story, even though it’s more restricted. OpenAI has not published equivalent offensive capability benchmarks for GPT-5.5-Cyber — the Preparedness Framework threshold indicates the model is judged capable of meaningful uplift, but doesn’t quantify how it compares to Claude Mythos Preview’s 83.1% or 181-exploit results.
The broader accessibility of Daybreak is a real differentiator: the tiered structure means organizations that don’t qualify for Glasswing’s invitation can still engage with Trusted Access for Cyber. Whether that translates to better security outcomes or a lower-capability alternative depends on what GPT-5.5-Cyber can actually do — which OpenAI hasn’t shown at the same specificity.
The Strategic Context
Daybreak is OpenAI’s first move to position itself as a cybersecurity company, not just an AI model provider. The Preparedness Framework — which defines thresholds for “Low,” “Medium,” “High,” and “Critical” cybersecurity risk — is the internal governance mechanism that both justifies and constrains this positioning. GPT-5.5 crossing “High” gave OpenAI the coverage to argue that controlled deployment is safer than no deployment: if the model is dangerous anyway, better to deploy it defensively and reduce the attack surface than to withhold it and let attackers find equivalent capability elsewhere.
That argument is contested. The counterargument — that any deployment of “High”-threshold models increases aggregate risk regardless of stated intent — is not dismissed by publishing Daybreak. But it’s the bet OpenAI is making, and it’s the same bet Anthropic made with Glasswing.
The harder question is whether agentic security tooling works at production scale for real enterprise codebases. Codex Security’s sandbox validation approach addresses the false-positive problem that haunts static analysis tools, but sandboxed validation has its own limits — not all vulnerabilities are reproducible in isolation, and complex dependency chains create environments that are difficult to replicate faithfully. These are engineering problems, not fundamental blockers, but they’re the gap between a compelling demo and a workflow security teams trust at 3am on a Friday.
What’s Missing
- Published offensive capability benchmarks — Anthropic published CyberGym, Firefox exploit counts, and OSS-Fuzz results. OpenAI has published framework thresholds but not equivalent capability comparisons. Buyers cannot independently assess how GPT-5.5-Cyber compares to Claude Mythos Preview.
- Public pricing — the absence makes cost-benefit analysis impossible without a sales engagement
- Integration ecosystem — Codex Security integrates with repos, but the SIEM, SOAR, and ticketing integrations that security operations teams require for workflow adoption are not detailed
- Independent validation — Glasswing’s vulnerability discoveries (CVE-2026-4747, the 17-year FreeBSD RCE) give Anthropic a real-world track record. Daybreak’s results are not yet documented at the same specificity
Rating: 3.5 / 5
OpenAI Daybreak is a credible enterprise cybersecurity initiative — the three-tier access model is the right architecture for deploying dual-use frontier capability, the partner network is strong, and Codex Security’s repo-level agentic workflow addresses real problems (false positives, slow patch cycles, disconnected detection and remediation). The June 1 phishing-resistant auth requirement for the most permissive tier shows genuine engagement with the access control problem rather than just market positioning.
The deductions: no published offensive capability benchmarks makes it impossible to independently evaluate how GPT-5.5-Cyber compares to Claude Mythos Preview; pricing opacity prevents any cost-benefit analysis; no real-world results documented at the specificity Anthropic achieved with Glasswing. Daybreak is a well-designed program that we don’t yet have enough data to evaluate on its core claim — that GPT-5.5-Cyber makes defenders meaningfully better at finding vulnerabilities before attackers do.
The competition between Daybreak and Glasswing is the most consequential AI product race in 2026 that isn’t about general-purpose benchmarks. The outcome will be measured in CVEs patched and zero-days not exploited — metrics that won’t be visible in real time, but that matter more than any leaderboard position.
This review was written on 2026-05-22 using Claude Sonnet 4.6 (Anthropic). Neither OpenAI Daybreak nor Claude Mythos Preview has been tested directly. This review is based on published documentation, OpenAI’s Daybreak announcement, and third-party reporting.