At a glance: Cognition AI
Product: Devin (autonomous AI software engineer)
Founded: 2022
Funding: $400M at a $10.2B valuation (September 2025); in talks for a new round at $25B (April 2026)
ARR: $1M (Sep 2024) → $73M (Jun 2025) → $100M+ combined with Windsurf
Key acquisition: Windsurf / Codeium (announced July 2025, closed December 2025, $250M)
Notable customer: Goldman Sachs (pilot)
Founders: Scott Wu, Walden Yan, Steven Hao, all three IOI gold medalists
Part of our AI Models & Companies reviews.


In January 2025, researchers at Answer.AI published a careful evaluation of Devin across twenty real-world software tasks. Devin failed fourteen of them. Three succeeded. Three were unclear. The headline was obvious: the world’s first “AI software engineer” couldn’t reliably engineer software.

Four months earlier, in September 2024, Devin’s annualized recurring revenue had reached $1 million. Five months after the Answer.AI evaluation came out, in June 2025, it stood at $73 million.

The critics were right about the benchmark. The market didn’t care.


Who Built This

Cognition AI was founded in 2022 by three people who, on paper, could have gone anywhere.

Scott Wu, CEO, won three gold medals at the International Olympiad in Informatics — placing first globally in 2014. He won MATHCOUNTS nationally in 2011. He attended Harvard, studied for two years, and dropped out. His first company, Lunchclub, was an AI networking platform he left in 2022. He is 28 years old.

Walden Yan and Steven Hao, the other two co-founders, are also IOI gold medalists. The three competed together as students. The thesis at Cognition was that the same skills that make a great competitive programmer — precise problem decomposition, systematic debugging, multi-step planning — are the skills needed to build agents that can do programming autonomously.

That’s not a generic AI bet. It’s a specific one: that the people who understand the structure of software problems at a competition level are better positioned to teach AI systems to solve those problems than people working from the outside in.


What Devin Is

Devin is not a code completion tool. It’s not a chat interface where you paste code and ask for improvements. It’s a cloud agent that runs in its own virtual machine — with a browser, a terminal, and computer-use capabilities — and executes software engineering tasks autonomously.

You hand it a task: “fix this bug,” “implement this feature,” “write tests for this module,” “set up this deployment pipeline.” Devin goes away, works on it, and comes back with a pull request. It keeps working after you close your laptop.

The target users are not solo developers who want autocomplete. They are engineering teams that want to offload defined, bounded work to an agent that doesn’t need supervision. The pitch is closer to “hire a contractor” than “buy a tool.”

This changes the economics. The pitch is that you pay for completed work rather than for tokens consumed, even though the published price list still includes a per-user plan.


The ARR Trajectory

Cognition launched Devin in March 2024 to significant attention — the demo of a coding agent handling a complete engineering task, autonomously, went viral. The initial raise was $21 million from Founders Fund.

The attention was real. Converting it to revenue took until September 2024, when annualized recurring revenue reached $1 million. The company’s first dollar of scale.

By June 2025, that figure was $73 million. Nine months. 73x growth.
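To put the 73x figure in monthly terms, here is a minimal sketch in Python. It assumes smooth compounding between the two reported data points ($1 million in September 2024 and $73 million in June 2025), which is a simplification; real revenue almost certainly arrived in lumpier steps. The function name is ours, chosen for illustration.

```python
def implied_monthly_growth(start_arr: float, end_arr: float, months: int) -> float:
    """Constant month-over-month growth rate that turns start_arr into end_arr."""
    if start_arr <= 0 or end_arr <= 0 or months <= 0:
        raise ValueError("inputs must be positive")
    return (end_arr / start_arr) ** (1 / months) - 1


# ARR figures from the article; smooth compounding is an assumption.
rate = implied_monthly_growth(1_000_000, 73_000_000, 9)
print(f"Implied growth: {rate:.1%} per month")
```

Under that assumption, the implied pace is roughly 61% growth every month for nine consecutive months.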

This is not a number that emerges from a product with broad adoption and low prices. It’s a number from enterprise contracts — large engineering teams paying for Devin Sessions at prices that add up fast. (Teams pricing is $40 per user per month; the usage-based Sessions pricing starts at $2.25 per session and scales from there.)
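A rough way to see why list prices alone do not explain the figure: the sketch below asks how many seats, or how many sessions, $73 million of ARR would require if all of it came from a single price point. That is a deliberately unrealistic assumption, since actual contract terms are not public, and the helper names are hypothetical.

```python
TEAMS_PRICE_PER_USER_MONTH = 40.00  # Teams price quoted in the article
SESSION_STARTING_PRICE = 2.25       # starting Sessions price quoted in the article
TARGET_ARR = 73_000_000             # June 2025 ARR quoted in the article


def seats_needed(target_arr: float, price_per_user_month: float) -> float:
    """Seats required if all revenue came from per-seat subscriptions."""
    return target_arr / (price_per_user_month * 12)


def sessions_needed(target_arr: float, price_per_session: float) -> float:
    """Sessions per year required if all revenue came from one flat session price."""
    return target_arr / price_per_session


seats = seats_needed(TARGET_ARR, TEAMS_PRICE_PER_USER_MONTH)
sessions = sessions_needed(TARGET_ARR, SESSION_STARTING_PRICE)
print(f"Seats at $40/user/month:    {seats:,.0f}")
print(f"Sessions per year at $2.25: {sessions:,.0f}")
```

At those prices the total works out to roughly 152,000 seats or about 32 million sessions a year, which is why heavy usage inside large contracts is the more plausible explanation.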

In July 2025, the company agreed to acquire Windsurf, formerly Codeium, the AI coding IDE with 1M+ active users and $82M of ARR at the time of the deal. The opening came after Google signed a licensing deal that pulled Windsurf’s CEO and key employees to Google, leaving the rest of the company and its assets available. Cognition moved fast.

The December 2025 close of the acquisition pushed combined ARR above $100 million. In the seven weeks after the Windsurf close, enterprise ARR was growing at a pace that would double it every quarter.


The Enterprise Bet

The clearest signal of where Cognition thinks this market goes: Goldman Sachs.

In July 2025, Goldman announced a pilot of Devin alongside its 12,000 human developers. Goldman’s CIO described the vision as a “hybrid workforce”: a combination of human and AI engineers targeting 20% efficiency gains. At 12,000 developers, a 20% efficiency gain is equivalent to getting the output of roughly 14,400 people from a team of 12,000. That’s not a small number.
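The headcount arithmetic can be sketched in a few lines of Python. This assumes the 20% gain applies uniformly to every developer, which is a simplification of how any real pilot would play out; the function name is illustrative.

```python
def effective_headcount(developers: int, efficiency_gain: float) -> float:
    """Headcount-equivalent output after a uniform efficiency gain."""
    if developers < 0 or efficiency_gain < -1:
        raise ValueError("developers must be >= 0 and efficiency_gain >= -1")
    return developers * (1 + efficiency_gain)


# Figures from the article: 12,000 developers, 20% targeted gain.
equivalent = effective_headcount(12_000, 0.20)
added = equivalent - 12_000
print(f"Equivalent output: {equivalent:,.0f} developers")
print(f"Added capacity:    {added:,.0f} developer-equivalents")
```

Seen this way, the target amounts to about 2,400 developer-equivalents of added capacity without a single new hire.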

This is the enterprise thesis that Cognition is selling: not “AI assistance” that makes individual developers marginally faster, but “AI labor” that changes the effective size of an engineering organization.

Whether Devin delivers on that thesis in practice depends on task scope and definition — the tool works well on bounded, well-specified work and struggles on open-ended complexity. But the Goldman pilot is a real bet from a real institution, and it is exactly the category of customer Cognition built the product to serve.


The Funding History

After the initial $21M from Founders Fund, Cognition closed a $400 million round at a $10.2 billion valuation in September 2025, led by Peter Thiel’s Founders Fund, with participation from Lux Capital, 8VC, and Bain Capital Ventures.

That round closed two months after the Windsurf deal was announced, a fast follow-on that signaled investor conviction in the combined company.

Now, as of April 23, 2026, Bloomberg reports Cognition is in early talks to raise hundreds of millions at a $25 billion valuation — more than double the September 2025 figure. As of this writing, the round has not been confirmed as closed.

The investor logic is not subtle. If SpaceX is willing to pay $60 billion for Cursor — which makes an IDE, not an autonomous agent — then a company with a cloud agent, 73x ARR growth, and Goldman Sachs as a customer is plausibly worth multiple times its last valuation in a compressed timeframe.


The SpaceX Catalyst

On April 21, 2026, SpaceX announced an agreement to acquire Cursor (Anysphere) for $60 billion. Cursor’s product is an IDE with an AI layer: a developer tool, not an autonomous coding agent. The announcement implied that the market for AI-enhanced software development infrastructure is large enough to justify eleven-figure acquisition prices, and it implied it loudly.

Two days later, Bloomberg reported Cognition’s $25 billion talks. Cognition noted publicly that its fundraising process had started before the Cursor news. The timing, regardless, was telling. When one company in the AI coding space gets acquired for $60 billion, every investor with exposure to the adjacent companies immediately recalibrates their models.

Cognition’s management has positioned Devin as categorically different from Cursor — a cloud agent that runs autonomously rather than a developer tool that assists in real time. That distinction is meaningful in product terms. Whether investors pay for it as a distinct premium or fold it into a general “AI coding” category depends on how the market evolves.


The Honest Critique

Here is the number that critics emphasize: 13.86%.

That is Devin’s self-reported resolution rate on SWE-Bench — the benchmark that tests AI models on real GitHub issues from open-source repositories. It’s the score Cognition published when Devin launched in March 2024, and it was presented at the time as a milestone (no agent had scored that high before). By January 2025, the Answer.AI independent evaluation found Devin failing 14 of 20 real-world tasks.
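For a sense of scale, the Answer.AI tally reported earlier (14 failures, 3 successes, 3 unclear) can be turned into rates and set beside the 13.86% figure. The sketch below does only that arithmetic. The two numbers come from different task sets and different scoring methods, so the comparison is indicative at best.

```python
# Answer.AI tally as reported in this article: 20 tasks in total.
outcomes = {"failed": 14, "succeeded": 3, "unclear": 3}
total = sum(outcomes.values())

# Share of all tasks that ended in each outcome.
rates = {label: count / total for label, count in outcomes.items()}

for label, rate in rates.items():
    print(f"{label:>9}: {rate:.0%}")

# Cognition's self-reported SWE-Bench resolution rate at launch.
SWE_BENCH_RATE = 0.1386
gap = rates["succeeded"] - SWE_BENCH_RATE
print(f"Success rate minus SWE-Bench rate: {gap * 100:+.2f} percentage points")
```

A 15% clear success rate on hand-picked real-world tasks sits within about a percentage point of the self-reported benchmark score, which suggests the two critiques describe the same limitation from different angles.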

By early 2026, models from Anthropic, OpenAI, and Google had substantially outpaced Devin on SWE-Bench; Anthropic’s upgraded Claude 3.5 Sonnet had already reported roughly 49% on the SWE-bench Verified subset in late 2024. Devin 2.0 has improved, but Cognition has not published updated benchmark figures.

One framing of this gap: benchmark performance on open-source repositories is not the same as enterprise contract completion on well-defined, bounded tasks with human oversight at key decision points. The Goldman Sachs pilot is not asking Devin to autonomously resolve GitHub issues from public repos. It’s asking Devin to complete specific, scoped tasks inside a managed workflow.

Another framing: the gap is real, and Cognition’s revenue growth is driven by early-market demand in a space where most competitors were slower to ship. As more capable models from Anthropic, OpenAI, and Google become the underlying engines of competing agents, the question is whether Cognition’s product layer — the agent orchestration, the IDE distribution via Windsurf, the enterprise workflow tooling — is defensible on its own.

That’s the question the $25 billion round is implicitly answering yes to.


What to Watch

The round closing. The talks were reported in late April 2026. If the round closes at or near $25B, it will be one of the highest private valuations in AI history for a company with this revenue profile.

Devin 2.0 benchmark figures. Cognition has been quiet on benchmarks since the original Devin launch. Updated SWE-Bench scores for Devin 2.0 would either validate or challenge the “benchmark critics don’t understand enterprise” narrative.

Goldman Sachs results. A pilot running alongside 12,000 developers should produce measurable efficiency data. If Goldman discloses those results (in a press release, at an investor day, or in a CIO interview), they will become the most important third-party validation, or repudiation, of Cognition’s enterprise pitch.

Windsurf 2.0 adoption. Cognition shipped Windsurf 2.0 in April 2026 with Devin built directly into the IDE, an Agent Command Center for managing multiple agent sessions, and SWE-1.5 at 950 tokens/second. The question is whether IDE users convert to Devin Sessions at the rate Cognition’s ARR model requires.

The SpaceX/Cursor close. If the Cursor acquisition closes at $60 billion, it sets a pricing anchor for the category that benefits every independent AI coding company.


Three IOI gold medalists built an AI engineer, watched critics document its 14 failures, and grew it to $73 million ARR in nine months. The critics documented real limitations. The market documented something else.

The $25 billion valuation isn’t a bet that Devin is a perfect coding agent. It’s a bet that the company that figured out how to sell AI labor — at scale, to Goldman Sachs, in the middle of the highest-valued startup decade in history — knows something about where this market is going that the SWE-Bench leaderboard doesn’t capture.

That’s either a very sharp insight or a very expensive one. The Goldman Sachs results will be the first real signal of which.


See also: Our Windsurf 2.0 review — Devin integrated into the IDE, Agent Command Center, and SWE-1.5 at 950 tok/s. And our Cursor 3 review — the competitor that SpaceX agreed to buy for $60B.