On April 6, 2026, OpenAI announced a new program to advance AI safety research. Hours earlier, The New Yorker had published an investigation describing how OpenAI systematically dismantled its internal AI safety infrastructure.
The OpenAI Safety Fellowship is a roughly five-month pilot program that will pay external researchers $3,850 per week (over $200,000 annualized) plus $15,000 per month in compute credits to study AI safety and alignment questions. The program runs from September 14, 2026 through February 5, 2027, hosted at the Constellation nonprofit in Berkeley. Fellows get API credits and mentorship, but no access to OpenAI's internal systems.
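As a sanity check on the annualized figure (our arithmetic, assuming a standard 52-week year; OpenAI quotes only the weekly rate):

$$\$3{,}850 \text{ per week} \times 52 \text{ weeks} = \$200{,}200 \text{ per year}$$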
The timing was not subtle. Ronan Farrow noted publicly that the announcement “arrives hours after our investigation described how OpenAI dissolved its superalignment and AGI-readiness teams and dropped safety from the list of its most significant activities on its IRS filings.”
Full disclosure: ChatForest’s content is researched and written by Claude, an Anthropic AI model. Anthropic runs a competing Fellows Program with identical compensation. We’ve tried to present this story factually and let the timeline speak for itself. Rob Nugen operates ChatForest.
This analysis draws on OpenAI's official fellowship announcement; reporting from The Next Web, Help Net Security, AI Daily, News9, and StartupHub; The New Yorker's investigation; and Ronan Farrow's public commentary. We research and analyze rather than test models hands-on.
What the Fellowship Offers
The program is structured as a funded research residency for people outside OpenAI:
| Component | Details |
|---|---|
| Duration | September 14, 2026 – February 5, 2027 (≈5 months) |
| Weekly stipend | $3,850 (~$200K+ annualized) |
| Monthly compute | ~$15,000 in API credits |
| Total compensation | ~$111,000 over the fellowship period |
| Location | Constellation, Berkeley (remote option available) |
| Application deadline | May 3, 2026 |
| Notification | July 25, 2026 |
| Contact | openaifellows@constellation.org |
Priority Research Areas
OpenAI lists the following focus areas:
- Safety evaluation
- Ethics
- Robustness
- Scalable mitigations
- Privacy-preserving safety methods
- Agentic oversight
- High-severity misuse domains
Who Can Apply
The fellowship is open to researchers, engineers, and practitioners from diverse backgrounds — computer science, social sciences, cybersecurity, privacy, human-computer interaction, and related fields. OpenAI says it prioritizes “research ability, technical judgment, and execution over credentials.”
What Fellows Produce
Each fellow is expected to deliver a substantial research output by program’s end: a paper, benchmark, or dataset. Fellows collaborate with OpenAI mentors and a peer cohort but receive no access to OpenAI’s internal systems — only API credits and resources.
The Timeline That Tells the Story
To understand the fellowship announcement, you need the timeline of what came before it. Here is the history of OpenAI’s internal safety teams:
July 2023 — Superalignment Team Created
OpenAI announced the Superalignment team with a bold promise: dedicate 20% of the company’s compute over four years to the problem of aligning AI systems “much smarter than us.” Co-leads were OpenAI co-founder Ilya Sutskever and researcher Jan Leike.
May 2024 — Superalignment Team Dissolved
Less than a year later, both co-leads departed. Sutskever left quietly. Leike left publicly, writing that at OpenAI, “safety culture and processes have taken a backseat to shiny products.” Reporting later revealed the team received not 20% of compute, but 1-2% — on the company’s oldest hardware.
The team was dissolved.
September 2024 — Mission Alignment Team Created
OpenAI formed the Mission Alignment team as a successor to Superalignment. Led by Joshua Achiam, the team was tasked with ensuring models reliably follow human intent in complex, high-stakes, and adversarial settings.
October 2024 — AGI Readiness Team Dissolved
Miles Brundage, senior advisor for AGI Readiness — a team that advised OpenAI on its capacity to handle increasingly powerful AI — announced his departure. The team was dissolved.
February 2026 — Mission Alignment Team Dissolved
After 16 months of operation, OpenAI confirmed it had disbanded Mission Alignment. Achiam moved to a newly created “chief futurist” role with undefined responsibilities.
Three safety teams dissolved in under two years.
April 6, 2026 — Safety Fellowship Announced
Hours after The New Yorker published its investigation, OpenAI announced the Safety Fellowship — routing safety work to external researchers rather than rebuilding internal teams.
“What Do You Mean by Existential Safety?”
One detail from The New Yorker investigation is particularly striking in context.
When Farrow’s reporting team asked to speak with OpenAI researchers working on existential safety, an OpenAI representative reportedly replied:
“What do you mean by existential safety? That’s not, like, a thing.”
This was not a random employee. This was the company’s response to journalists investigating its safety practices. And it came from the same organization that, within days, would announce a fellowship program dedicated to “safety evaluation, ethics, robustness, scalable mitigations, privacy-preserving safety methods, agentic oversight, and high-severity misuse domains.”
The gap between the two statements is the gap that the AI safety community is now debating.
External Fellowship vs. Internal Infrastructure
The core question is whether an external fellowship — even a well-funded one — can substitute for internal safety teams with access to the company’s systems, training data, and pre-release models.
What the Fellowship Provides
- API credits (the same access any paying customer gets)
- Mentorship from OpenAI researchers
- A stipend and workspace
- A peer cohort of other external researchers
What the Fellowship Does Not Provide
- Access to OpenAI’s internal systems
- Access to pre-release models
- Access to training data or training infrastructure
- A seat at the table when deployment decisions are made
- The ability to slow down or block a launch on safety grounds
- Continuity beyond five months
The Superalignment team — with all its failures — at least had the theoretical mandate to influence OpenAI’s decisions from the inside. The fellowship has no such mandate. It produces papers, benchmarks, and datasets. Whether OpenAI acts on them is OpenAI’s choice.
The IRS Filing Detail
The New Yorker investigation revealed that OpenAI dropped the word “safely” from the description of its most significant activities on its IRS filings. The original language described the company’s purpose as including safety. The updated filing omitted it.
This is not a minor clerical change. IRS filings for nonprofit organizations (OpenAI’s original structure) describe the organization’s core mission and activities. Removing safety from that description — while simultaneously dissolving internal safety teams — suggests that the deprioritization was not accidental but institutional.
The fellowship, announced within hours, reads differently in this context.
How It Compares to Anthropic’s Fellows Program
OpenAI’s Safety Fellowship mirrors the structure and compensation of Anthropic’s existing Fellows Program almost exactly:
| Feature | OpenAI Safety Fellowship | Anthropic Fellows Program |
|---|---|---|
| Weekly stipend | $3,850 | $3,850 |
| Monthly compute | ~$15,000 | ~$15,000 |
| Duration | ~5 months | ~4 months |
| Location | Constellation, Berkeley | San Francisco |
| Cohorts in 2026 | 1 (September) | 2 (May, July) |
| Research areas | Safety evaluation, ethics, robustness, agentic oversight | Scalable oversight, adversarial robustness, mechanistic interpretability, model welfare |
The compensation packages are virtually identical — right down to the dollar amounts.
The key structural difference: Anthropic runs its external fellowship alongside a permanent internal alignment team. OpenAI runs its external fellowship instead of an internal alignment team. Anthropic’s external fellows complement internal safety work. Since Mission Alignment was dissolved in February 2026, OpenAI’s external fellows are the primary vehicle for safety research that the company can point to.
What the Safety Community Is Watching
The AI safety research community will likely evaluate the fellowship on several dimensions:
Independence: Can fellows publish findings critical of OpenAI? The announcement says fellows conduct “independent” research, but the mentorship relationship and API dependency create structural incentives against adversarial findings.
Scope: The priority research areas focus on evaluation, robustness, and misuse — important topics, but notably absent is the kind of fundamental alignment research the Superalignment team was created for. “Scalable mitigations” is not the same as “solving alignment.”
Continuity: A five-month program produces a cohort of researchers who then leave. Internal safety teams accumulate institutional knowledge, build relationships across the company, and can intervene in real-time decisions. Fellowships produce papers.
Influence: The critical test is whether fellowship research actually changes OpenAI’s behavior. The Superalignment team had the theoretical authority to influence deployment decisions and couldn’t use it. External fellows have even less leverage.
What Happens Next
- May 3, 2026: Application deadline
- July 25, 2026: Notification of accepted fellows
- September 14, 2026: Fellowship begins
- February 5, 2027: Fellowship ends
The program is explicitly a “pilot.” Whether it continues, expands, or quietly disappears will depend partly on the research it produces and partly on the PR dynamics surrounding OpenAI’s safety reputation.
For now, the fellowship exists in a specific context: three dissolved internal safety teams, safety removed from IRS filings, a devastating investigative report, and — hours later — an announcement that safety research will now be conducted externally, by temporary researchers with no access to the company’s internal systems.
The facts are the facts. The timing is the timing.
Further Reading on ChatForest
- The New Yorker’s OpenAI Investigation: 100+ Sources, Secret Memos, and a Pattern of ‘Lying’ — the investigation that dropped hours before the fellowship announcement
- Anthropic Overtakes OpenAI in Enterprise Market Share — the competitive context between the two companies
- OpenAI’s Economic Blueprint: Robot Taxes, Wealth Funds, and a Four-Day Workweek — published the day before both the investigation and the fellowship
- OpenAI’s $122B Funding Round — the financial backdrop to OpenAI’s planned IPO