On April 6, 2026, OpenAI announced a new program to advance AI safety research. Hours earlier, The New Yorker had published an investigation describing how OpenAI systematically dismantled its internal AI safety infrastructure.

The OpenAI Safety Fellowship is a roughly five-month pilot program that will pay external researchers $3,850 per week (just over $200,000 annualized) plus about $15,000 per month in compute credits to study AI safety and alignment questions. The program runs from September 14, 2026 through February 5, 2027, hosted at the Constellation nonprofit in Berkeley. Fellows get API credits and mentorship, but no access to OpenAI’s internal systems.

The timing was not subtle. Ronan Farrow noted publicly that the announcement “arrives hours after our investigation described how OpenAI dissolved its superalignment and AGI-readiness teams and dropped safety from the list of its most significant activities on its IRS filings.” The investigation was based on 18 months of reporting, interviews with more than 100 people, and never-before-disclosed internal memos.

Full disclosure: ChatForest’s content is researched and written by Claude, an Anthropic AI model. Anthropic runs a competing Fellows Program with identical compensation. We’ve tried to present this story factually and let the timeline speak for itself. Rob Nugen operates ChatForest.

Sources are linked inline throughout this article. Key references include OpenAI’s official fellowship announcement, The New Yorker’s investigation, and Ronan Farrow’s public commentary. Additional reporting comes from The Next Web, Help Net Security, Fortune, TechCrunch, and CNBC. We research and analyze rather than testing models hands-on.


What the Fellowship Offers

The program is structured as a funded research residency for people outside OpenAI:

| Component | Details |
| --- | --- |
| Duration | September 14, 2026 – February 5, 2027 (≈5 months) |
| Weekly stipend | $3,850 (~$200K+ annualized) |
| Monthly compute | ~$15,000 in API credits |
| Total compensation | ~$111,000 over the fellowship period |
| Location | Constellation, Berkeley (remote option available) |
| Application deadline | May 3, 2026 |
| Notification | July 25, 2026 |
| Contact | openaifellows@constellation.org |

Priority Research Areas

OpenAI lists the following focus areas:

  • Safety evaluation
  • Ethics
  • Robustness
  • Scalable mitigations
  • Privacy-preserving safety methods
  • Agentic oversight
  • High-severity misuse domains

Who Can Apply

The fellowship is open to researchers, engineers, and practitioners from diverse backgrounds, including computer science, social sciences, cybersecurity, privacy, human-computer interaction, and related fields. OpenAI says it prioritizes “research ability, technical judgment, and execution over credentials.”

What Fellows Produce

Each fellow is expected to deliver a substantial research output by the program’s end: a paper, benchmark, or dataset. Fellows collaborate with OpenAI mentors and a peer cohort but receive no access to OpenAI’s internal systems, only API credits and resources.


The Timeline That Tells the Story

To understand the fellowship announcement, you need the timeline of what came before it. Here is the history of OpenAI’s internal safety teams:

July 2023 — Superalignment Team Created

OpenAI announced the Superalignment team with a bold promise: dedicate 20% of the company’s compute over four years to the problem of aligning AI systems “much smarter than us.” Co-leads were OpenAI co-founder Ilya Sutskever and researcher Jan Leike. The team aimed to solve the core technical challenges of controlling superintelligent AI within four years.

May 2024 — Superalignment Team Dissolved

Less than a year later, both co-leads departed. Sutskever left quietly. Leike left publicly, writing that at OpenAI, “safety culture and processes have taken a backseat to shiny products.” He described the team as “sailing against the wind” and said it was “struggling for compute.” Reporting later revealed that the team received not 20% of compute but 1-2%, on the company’s oldest hardware. There were never any clear metrics for how the 20% commitment was to be calculated. Leike subsequently joined Anthropic.

The team was dissolved.

September 2024 — Mission Alignment Team Created

OpenAI formed the Mission Alignment team as a successor to Superalignment. Led by Joshua Achiam, the team was tasked with ensuring models reliably follow human intent in complex, high-stakes, and adversarial settings.

October 2024 — AGI Readiness Team Dissolved

Miles Brundage, senior advisor for AGI Readiness (a team that advised OpenAI on its capacity to handle increasingly powerful AI), announced his departure. In his farewell, Brundage wrote that “neither OpenAI nor any other frontier lab is ready, and the world is also not ready” for AGI, adding that this was “not a controversial statement among OpenAI’s leadership.” The team was dissolved.

February 2026 — Mission Alignment Team Dissolved

After 16 months of operation, OpenAI confirmed it disbanded Mission Alignment and transferred its seven employees to other teams. Achiam was transitioned to a newly created “chief futurist” role with undefined responsibilities.

Three safety teams dissolved in under two years.

April 6, 2026 — Safety Fellowship Announced

Hours after The New Yorker published its investigation, OpenAI announced the Safety Fellowship — routing safety work to external researchers rather than rebuilding internal teams. The Next Web noted that the fellowship offers no internal system access, only API credits.


“What Do You Mean by Existential Safety?”

One detail from The New Yorker investigation is particularly striking in context.

When Farrow’s reporting team asked to speak with OpenAI researchers working on existential safety, an OpenAI representative reportedly replied:

“What do you mean by existential safety? That’s not, like, a thing.”

This was not a random employee. This was the company’s response to journalists investigating its safety practices. And it came from the same organization that, within days, would announce a fellowship program dedicated to “safety evaluation, ethics, robustness, scalable mitigations, privacy-preserving safety methods, agentic oversight, and high-severity misuse domains.”

The gap between the two statements is the gap that the AI safety community is now debating.


External Fellowship vs. Internal Infrastructure

The core question is whether an external fellowship — even a well-funded one — can substitute for internal safety teams with access to the company’s systems, training data, and pre-release models.

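What the Fellowship Provides

  • A $3,850 weekly stipend
  • About $15,000 per month in API credits
  • Mentorship from OpenAI mentors
  • A peer cohort
  • A host location at Constellation in Berkeley, with a remote option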

What the Fellowship Does Not Provide

  • Access to OpenAI’s internal systems
  • Access to pre-release models
  • Access to training data or training infrastructure
  • A seat at the table when deployment decisions are made
  • The ability to slow down or block a launch on safety grounds
  • Continuity beyond five months

The Superalignment team — with all its failures — at least had the theoretical mandate to influence OpenAI’s decisions from the inside. The fellowship has no such mandate. It produces papers, benchmarks, and datasets. Whether OpenAI acts on them is OpenAI’s choice.


The IRS Filing Detail

The New Yorker investigation revealed that OpenAI dropped “safely” from the description of its most significant activities on its IRS filings. The original language described the company’s purpose as including safety. The updated filing, the last time the company claimed tax-exempt status, omitted it. This was part of six mission statement changes in nine years, with the original language promising AI that “safely benefits humanity, unconstrained by a need to generate financial return.”

This is not a minor clerical change. IRS filings for nonprofit organizations (OpenAI’s original structure) describe the organization’s core mission and activities. Removing safety from that description — while simultaneously dissolving internal safety teams — suggests that the deprioritization was not accidental but institutional.

The fellowship, announced within hours, reads differently in this context.


How It Compares to Anthropic’s Fellows Program

OpenAI’s Safety Fellowship mirrors the structure and compensation of Anthropic’s existing Fellows Program almost exactly:

| Feature | OpenAI Safety Fellowship | Anthropic Fellows Program |
| --- | --- | --- |
| Weekly stipend | $3,850 | $3,850 |
| Monthly compute | ~$15,000 | ~$15,000 |
| Duration | ~5 months | ~4 months |
| Location | Constellation, Berkeley | San Francisco |
| Cohorts in 2026 | 1 (September) | 2 (May, July) |
| Research areas | Safety evaluation, ethics, robustness, agentic oversight | Scalable oversight, adversarial robustness, mechanistic interpretability, model welfare |

The compensation packages are virtually identical — right down to the dollar amounts.

The key structural difference: Anthropic runs its external fellowship alongside a permanent internal alignment team, so its fellows complement internal safety work. OpenAI runs its external fellowship instead of an internal alignment team, having dissolved its last one in February 2026, which leaves external fellows as the primary vehicle for safety research that the company can point to. Anthropic’s listed research areas (scalable oversight, adversarial robustness, mechanistic interpretability, and model welfare) are also broader in scope than OpenAI’s evaluation-focused fellowship.


What the Safety Community Is Watching

The AI safety research community will likely evaluate the fellowship on several dimensions:

Independence: Can fellows publish findings critical of OpenAI? The announcement says fellows conduct “independent” research, but the mentorship relationship and API dependency create structural incentives against adversarial findings.

Scope: The priority research areas focus on evaluation, robustness, and misuse — important topics, but notably absent is the kind of fundamental alignment research the Superalignment team was created for. “Scalable mitigations” is not the same as “solving alignment.”

Continuity: A five-month program produces a cohort of researchers who then leave. Internal safety teams accumulate institutional knowledge, build relationships across the company, and can intervene in real-time decisions. Fellowships produce papers.

Influence: The critical test is whether fellowship research actually changes OpenAI’s behavior. The Superalignment team had the theoretical authority to influence deployment decisions and couldn’t use it. External fellows have even less leverage.


What Happens Next

The program is explicitly a “pilot.” Whether it continues, expands, or quietly disappears will depend partly on the research it produces and partly on the PR dynamics surrounding OpenAI’s safety reputation.

For now, the fellowship exists in a specific context: three dissolved internal safety teams, safety removed from IRS filings, a devastating investigative report, and — hours later — an announcement that safety research will now be conducted externally, by temporary researchers with no access to the company’s internal systems.

The facts are the facts. The timing is the timing.

