Category: AI & ML Tools

The Sequential Thinking MCP server (@modelcontextprotocol/server-sequential-thinking) is Anthropic’s official reference implementation for structured AI reasoning. It provides a single tool — sequential_thinking — that lets agents break complex problems into numbered steps, revise earlier reasoning, and branch into alternative paths. The idea is to make thinking visible and controllable: instead of an agent jumping straight to an answer, it works through the problem step by step, and you can see (and steer) each step.

At a glance: 84,350+ stars (monorepo), ~72K weekly npm downloads (down ~30% from April’s ~103K peak), v2025.12.18, 1 tool, Apache 2.0 license, ~4.7M+ all-time PulseMCP visitors.

It lives in the main modelcontextprotocol/servers monorepo (active, not archived) alongside Memory, Filesystem, Fetch, and the other official reference servers. Those star, download, and visitor numbers are serious adoption for a tool that, on the surface, doesn’t connect to any external service. It just helps agents think.

But this server exists in an increasingly awkward position. When it launched in December 2024, giving models a structured space to reason was genuinely novel. Since then, Claude has gained extended thinking, GPT models have added chain-of-thought reasoning, and Anthropic themselves published a “think” tool pattern that achieves similar goals without an MCP server. The question isn’t whether structured thinking helps — it does. The question is whether you need a separate MCP server for it in 2026.

What’s New (May 2026 Update)

Downloads dropped ~30% from April’s peak. Weekly npm downloads fell from ~103K (April 22) to ~72K as of May 2026 — a notable pullback after the March/April rebound. Monthly totals were 411K in March (peak) and ~390K in April; May is tracking lower. Whether this is noise or trend is unclear, but the April “stabilization” narrative now looks more like a temporary plateau. The MCP SDK crossed 97 million cumulative downloads ecosystem-wide, suggesting the total MCP pie is growing even as this server’s share shifts.

Still no new npm release — 5+ months and counting. The latest published version remains v2025.12.18, published December 18, 2025. Fixes from March 2026 (type coercion, tool annotations) and the April docs expansion still haven’t reached npm. The gap between what’s in the repo and what’s published continues to widen.

Memory leak fix still unmerged. PR #3321 (opened February 11, 2026) — proposing configurable history limits via SEQUENTIAL_THINKING_MAX_HISTORY and a clearHistory() method — has now sat open for over 3 months despite passing all 17 tests. In sessions running 6-8+ hours, RAM can hit 10GB+. For long-running agent workflows, this remains an unresolved production risk.

Anthropic still recommends extended thinking over the think tool. Their December 2025 update stated: “We recommend using [extended thinking] instead of a dedicated think tool in most cases.” The think tool (and by extension, Sequential Thinking) remains recommended only for complex tool chains, policy-heavy environments, and sequential multi-step decisions.

Prior updates (March–April 2026): Tool annotations added (PR #3534), better type coercion for LLM-sent parameters (PR #3533), license transitioned to Apache 2.0. Documentation expanded April 8 with practical usage examples and MCP host integration guidance. PR #4005 proposed ANSI rendering fixes for box-drawing bugs in diagnostic output. Thought count constraint request closed as “not planned” on April 20.

What It Does

The server exposes exactly one tool:

sequential_thinking — Process a single thought in a sequence. The agent calls this tool repeatedly, once per thought step, building up a chain of reasoning.

Required parameters:

  • thought (string) — The content of the current thinking step
  • thoughtNumber (integer) — Current position in the sequence (1, 2, 3…)
  • totalThoughts (integer) — Estimated total steps needed (can be adjusted dynamically)
  • nextThoughtNeeded (boolean) — Whether another step follows

Optional parameters for advanced reasoning:

  • isRevision (boolean) — Marks this thought as a revision of earlier reasoning
  • revisesThought (integer) — Which earlier thought is being reconsidered
  • branchFromThought (integer) — Starting point for an alternative reasoning path
  • branchId (string) — Identifier for the alternative branch
  • needsMoreThoughts (boolean) — Signals that the initial estimate was too low
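
To make the parameter lists above concrete, here is a minimal Python sketch of the argument payloads an agent might send over four calls: two linear thoughts, one revision, and one branch. Only the field names come from the lists above. The helper make_thought, the thought text, and the branch label "alt-retry" are illustrative assumptions, not part of the server.

```python
import json

REQUIRED = ("thought", "thoughtNumber", "totalThoughts", "nextThoughtNeeded")

def make_thought(thought, number, total, next_needed, **optional):
    """Build one argument payload (hypothetical helper, for illustration only)."""
    payload = {
        "thought": thought,
        "thoughtNumber": number,
        "totalThoughts": total,
        "nextThoughtNeeded": next_needed,
    }
    # Optional fields: isRevision, revisesThought, branchFromThought, branchId, ...
    payload.update(optional)
    return payload

calls = [
    make_thought("List the failing requests.", 1, 3, True),
    make_thought("Suspect the cache layer.", 2, 3, True),
    # Revision: reconsider thought 2 and raise the estimate from 3 to 4.
    make_thought("Thought 2 was wrong: the cache is bypassed here.", 3, 4, True,
                 isRevision=True, revisesThought=2),
    # Branch: explore an alternative path starting from thought 2.
    make_thought("Alternative: check the retry policy.", 4, 4, False,
                 branchFromThought=2, branchId="alt-retry"),
]

for call in calls:
    assert all(key in call for key in REQUIRED)
    print(json.dumps(call))
```

Note that the numeric fields are real integers here, which matters for the validation issues discussed later in this review.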

The server tracks all thoughts, revisions, and branches in memory. When the agent sets nextThoughtNeeded to false, the server returns a summary of the complete reasoning chain.
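
The bookkeeping described above can be pictured with a small in-memory tracker. This is a Python sketch under assumptions: the class ThoughtTracker, its methods, and the shape of the returned status are hypothetical, and the actual server is a Node.js package whose internals and responses may differ.

```python
from collections import defaultdict

class ThoughtTracker:
    """Hypothetical in-memory store for thoughts, revisions, and branches."""

    def __init__(self):
        self.history = []                  # every thought, in arrival order
        self.branches = defaultdict(list)  # branchId -> thoughts on that branch

    def add(self, call):
        """Record one thought payload and return a small status dict."""
        self.history.append(call)
        branch_id = call.get("branchId")
        if call.get("branchFromThought") is not None and branch_id:
            self.branches[branch_id].append(call)
        return {
            "thoughtNumber": call["thoughtNumber"],
            "nextThoughtNeeded": call["nextThoughtNeeded"],
            "historyLength": len(self.history),
            "branches": sorted(self.branches),
        }

    def revisions(self):
        """Return only the thoughts flagged as revisions."""
        return [c for c in self.history if c.get("isRevision")]

tracker = ThoughtTracker()
tracker.add({"thought": "Outline the plan.", "thoughtNumber": 1,
             "totalThoughts": 2, "nextThoughtNeeded": True})
status = tracker.add({"thought": "Try a different route.", "thoughtNumber": 2,
                      "totalThoughts": 2, "nextThoughtNeeded": False,
                      "branchFromThought": 1, "branchId": "route-b"})
print(status)
```

Everything lives in ordinary Python objects, which mirrors the point made above: nothing survives once the process exits.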

Setup

For Claude Desktop:

{
  "mcpServers": {
    "sequential-thinking": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"]
    }
  }
}

Docker:

{
  "mcpServers": {
    "sequential-thinking": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "mcp/sequentialthinking"]
    }
  }
}

Optional environment variables:

  • DISABLE_THOUGHT_LOGGING — Set to true to suppress thought logging to stderr
  • MAX_TOTAL_THOUGHTS — Cap the maximum number of thoughts (unlimited by default)
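
MCP clients such as Claude Desktop usually pass environment variables through an env block on the server entry. The Python sketch below builds such an entry and prints it as JSON. The variable names are the two listed above; the value "25" is an arbitrary illustration, and whether a given published version honors each variable should be checked against the package itself.

```python
import json

config = {
    "mcpServers": {
        "sequential-thinking": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-sequential-thinking"],
            # Environment variable values are passed as strings.
            "env": {
                "DISABLE_THOUGHT_LOGGING": "true",
                "MAX_TOTAL_THOUGHTS": "25",  # illustrative value, not a recommendation
            },
        }
    }
}

# Paste the printed JSON into the client's MCP configuration file.
print(json.dumps(config, indent=2))
```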

Requirements: Node.js 18+ or Docker. No API keys, no accounts, no external services.

Setup difficulty: Very easy. One server entry in your MCP config, with nothing to configure beyond the optional environment variables.

What Works Well

The branching and revision model is genuinely elegant. Unlike simple chain-of-thought prompting, the server tracks branching and revision as first-class concepts. An agent can explore one approach for five steps, realize it’s wrong, revise step 3, or branch from step 2 into an entirely different line of reasoning. The server maintains the full tree of thoughts — main line, branches, and revisions — giving you a complete record of how the agent reasoned through the problem.

Dynamic thought adjustment prevents premature stopping. The totalThoughts parameter is an estimate, not a limit. If the agent realizes mid-reasoning that a problem is more complex than expected, it can increase the count or set needsMoreThoughts to true. This prevents the common problem where an agent commits to “I’ll solve this in 5 steps” and then rushes through an inadequate answer. Combined with MAX_TOTAL_THOUGHTS, you get flexibility with a safety net.
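
The "estimate, not a limit" behavior can be sketched as a small normalization step. This Python example is an assumption about how such logic could look, not the server's actual code: the function normalize and the way the cap is applied are hypothetical.

```python
def normalize(thought_number, total_thoughts, max_total=None):
    """Return (total_thoughts, allowed) after adjusting the estimate.

    Hypothetical logic: the estimate grows to match the current step,
    and an optional cap (max_total) decides whether the step is allowed.
    """
    if thought_number > total_thoughts:
        total_thoughts = thought_number  # the estimate was too low, so raise it
    allowed = max_total is None or thought_number <= max_total
    if max_total is not None:
        total_thoughts = min(total_thoughts, max_total)
    return total_thoughts, allowed

print(normalize(6, 5))                 # (6, True): estimate raised, no cap
print(normalize(6, 5, max_total=10))   # (6, True): still under the cap
print(normalize(11, 5, max_total=10))  # (10, False): step exceeds the cap
```

The design point is that the agent's estimate only ever steers pacing; the hard stop, if you want one, has to come from a separate cap.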

The reasoning trace is visible and auditable. Every thought, revision, and branch is tracked and returned. For debugging agent behavior — understanding why an agent made a particular decision — this is valuable. You can see exactly where reasoning went wrong, which branches were explored, and what was revised. This matters for complex planning tasks where the final answer alone isn’t enough.

Zero-dependency simplicity. No API keys, no accounts, no cloud services, no persistent storage. The server runs locally, processes thoughts in memory, and produces a reasoning trace. In a landscape where most MCP servers require authentication, billing accounts, or external services, this simplicity is refreshing.

Still maintained in the official monorepo. Unlike Puppeteer and SQLite (moved to servers-archived), Sequential Thinking remains in the active modelcontextprotocol/servers repo (84,350+ stars). It received two fixes in March 2026 (type coercion, tool annotations) and a docs expansion in April, and it has an open PR for a rendering fix. The license transitioned to Apache 2.0 in January 2026. Version 2025.12.18 is the latest npm release.

What Doesn’t Work Well

Anthropic now explicitly recommends against it for most use cases. In December 2025, Anthropic updated their think tool blog post to state: “We recommend using [extended thinking] instead of a dedicated think tool in most cases.” That’s a stronger position than the original March 2025 post, which presented the think tool and extended thinking as complementary. Downloads dipped after that update, rebounded in March 2026, and fell again in May. The sequential thinking server was pioneering in late 2024; in 2026, its creators are steering users elsewhere even though weekly downloads are still in the tens of thousands.

Memory leak in long-running sessions. PR #3321 (February 2026, still open) documents that thoughtHistory and branches arrays grow without bound. In sessions running 6-8+ hours, RAM consumption can hit 10GB+. A fix proposing configurable limits and a clearHistory() method has passed all 17 tests but remains unmerged. This primarily affects long-running agent workflows, not short interactive sessions.
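
The general idea behind a fix like this is to bound the history so old entries are evicted. The Python sketch below illustrates that idea with a fixed-size deque. It is not the code from PR #3321: the class BoundedHistory is hypothetical, the environment variable name is taken from the PR description above, and the default of 1000 is an assumption.

```python
import os
from collections import deque

class BoundedHistory:
    """Hypothetical history store that drops the oldest thoughts past a limit."""

    def __init__(self, max_history=None):
        if max_history is None:
            # Name from the PR description; the fallback value is an assumption.
            max_history = int(os.environ.get("SEQUENTIAL_THINKING_MAX_HISTORY", "1000"))
        self.thoughts = deque(maxlen=max_history)

    def add(self, thought):
        self.thoughts.append(thought)  # deque evicts the oldest entry when full

    def clear_history(self):
        self.thoughts.clear()

history = BoundedHistory(max_history=3)
for n in range(1, 6):
    history.add(f"thought {n}")
print(list(history.thoughts))  # ['thought 3', 'thought 4', 'thought 5']
```

The trade-off is that a bounded history loses early thoughts, so revisions or branches that point at evicted steps would need separate handling.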

Every thought step is a separate tool call. A 10-step reasoning chain means 10 tool calls. Each call carries serialization overhead, schema validation, and round-trip latency. Community analysis suggests this makes token consumption 3-6x higher than letting the model reason in its context window. For agents paying per token, this overhead adds up quickly.

LLMs struggle with the type constraints. The thoughtNumber and totalThoughts parameters require integers, but LLMs frequently send strings like "1" instead of 1. This caused enough validation failures (#2598 — closed via improved parameter descriptions, #2905 — closed as duplicate) that the team had to add string-to-number coercion. The tool description itself has been flagged as too long for OpenAI models (2,780 characters vs. 1,024-character limit — still open). When the tool is hard for its target users (LLMs) to use correctly, that’s a fundamental design friction.
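
String-to-number coercion of this kind is simple to picture. The Python sketch below shows one way to accept both 1 and "1" while still rejecting non-numeric input. The function coerce_int is a hypothetical illustration, not the server's implementation.

```python
def coerce_int(value, name):
    """Accept 1 or "1"; reject anything that is not a whole number."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so rule it out explicitly.
        raise TypeError(f"{name} must be an integer, got a boolean")
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        try:
            return int(value.strip())
        except ValueError:
            pass  # fall through to the error below
    raise TypeError(f"{name} must be an integer, got {value!r}")

args = {"thoughtNumber": "1", "totalThoughts": 5}
clean = {key: coerce_int(val, key) for key, val in args.items()}
print(clean)  # {'thoughtNumber': 1, 'totalThoughts': 5}
```

Being lenient on input this way trades strict schema validation for fewer failed tool calls, which is the compromise the paragraph above describes.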

Agents underutilize the advanced features. Community analysis (#2332 — closed after fork implemented fixes) found that AI systems use the tool effectively for linear step-by-step reasoning but rarely use isRevision, branchFromThought, or branchId. The elegant branching model exists in theory; in practice, agents mostly just count from 1 to N linearly. The features that differentiate this from simple chain-of-thought prompting are the same features that agents don’t use.
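
If you capture the tool calls from your own sessions, a quick tally shows how much of the branching and revision machinery your agent actually uses. The Python sketch below assumes you already have the call arguments as a list of dicts. The functions classify and usage_profile and the sample trace are made up for illustration.

```python
from collections import Counter

def classify(call):
    """Label one thought payload as a revision, a branch, or a linear step."""
    if call.get("isRevision"):
        return "revision"
    if call.get("branchFromThought") is not None:
        return "branch"
    return "linear"

def usage_profile(calls):
    """Count how many thoughts fall into each category."""
    return Counter(classify(c) for c in calls)

# Made-up sample trace: three linear steps and one revision.
trace = [
    {"thoughtNumber": 1},
    {"thoughtNumber": 2},
    {"thoughtNumber": 3},
    {"thoughtNumber": 4, "isRevision": True, "revisesThought": 2},
]
profile = usage_profile(trace)
print(dict(profile))  # {'linear': 3, 'revision': 1}
```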

Session stickiness in Claude Code. Users reported (#713 — closed April 2025) that once sequential thinking is invoked in a Claude Code session, the agent continues using it for every subsequent action — even when it’s unnecessary. You end up with 8-step reasoning chains for simple tasks that should take one step. Clearing the context is the only workaround.

No persistence of reasoning chains. Thoughts exist only in server memory for the duration of the session. Once the server stops, all reasoning traces are gone. If you want to review or compare reasoning across sessions, you need to capture the output yourself. For a tool whose main value is making reasoning visible, the inability to save that reasoning is a gap.
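
Capturing the output yourself can be as simple as appending each thought to a JSON Lines file on the client side. The Python sketch below shows that approach. The helpers append_thought and load_thoughts and the file name are hypothetical, and how you intercept the tool calls depends on your MCP client.

```python
import json
import tempfile
from pathlib import Path

def append_thought(path, call):
    """Append one thought payload as a JSON line (hypothetical helper)."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(call, ensure_ascii=False) + "\n")

def load_thoughts(path):
    """Read back every saved thought, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

# A temporary directory keeps the demo self-contained; use a real path in practice.
log_path = Path(tempfile.mkdtemp()) / "reasoning-trace.jsonl"
append_thought(log_path, {"thought": "Check the config.", "thoughtNumber": 1,
                          "totalThoughts": 2, "nextThoughtNeeded": True})
append_thought(log_path, {"thought": "Config is fine; check DNS.", "thoughtNumber": 2,
                          "totalThoughts": 2, "nextThoughtNeeded": False})
print(len(load_thoughts(log_path)))  # 2
```

An append-only file per session keeps traces comparable across runs without changing the server at all.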

The totalThoughts parameter is uncontrollable by default. Users tried command-line arguments, environment variables, and configuration options to constrain thought counts (#2226, closed April 20 as “not planned”). The MAX_TOTAL_THOUGHTS environment variable was eventually added as a workaround, but by default there’s no limit: an agent can generate dozens of thought steps, consuming tokens and time without constraint. The maintainers closing the configuration request as “not planned” signals this is by design, not a gap they intend to fill.

Compared to Alternatives

vs. Claude’s Extended Thinking: Extended thinking is built into the model — no MCP server, no tool call overhead, no type validation issues. It activates before response generation and provides deep reasoning natively. As of December 2025, Anthropic explicitly recommends extended thinking over external thinking tools for most use cases. Sequential Thinking’s remaining advantage is visibility (you see each step as a tool call) and branching (you can explore alternatives). But for raw reasoning quality and efficiency, extended thinking wins — and Anthropic says so.

vs. Anthropic’s “Think” Tool Pattern: The think tool is a simpler approach: a tool that accepts a single thought string, with no step counting and no branching. It’s designed for pausing during complex tool chains, not for structured multi-step reasoning. Anthropic’s benchmark data shows a 54% relative improvement in Claude’s performance on complex airline customer service tasks. Anthropic’s December 2025 update narrows the think tool’s recommended use cases to complex tool chains, policy-heavy environments, and sequential decisions where each step builds on previous ones. For everything else, they now recommend extended thinking.
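
For contrast with the nine-parameter tool reviewed here, a think-style tool can be very small. The Python sketch below follows the general pattern of a tool definition with a single string field plus a handler that only logs the thought. The description text, the handler handle_think, and the sample thought are illustrative assumptions, not copied from Anthropic’s post.

```python
# Tool definition in the shape used for Anthropic API tool use (name,
# description, input_schema). The wording of the description is illustrative.
think_tool = {
    "name": "think",
    "description": "Record a thought. Does not fetch data or change state.",
    "input_schema": {
        "type": "object",
        "properties": {"thought": {"type": "string"}},
        "required": ["thought"],
    },
}

thought_log = []

def handle_think(arguments):
    """Hypothetical handler: store the thought and acknowledge it."""
    thought_log.append(arguments["thought"])
    return "ok"

handle_think({"thought": "The refund policy applies only to unused tickets."})
print(len(thought_log))  # 1
```

There is no counter, estimate, or branch identifier for the model to get wrong, which is the main practical difference from sequential_thinking.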

vs. Community Forks and Alternatives: Multiple community alternatives have emerged — Clear Thought for structured reasoning, MCP Feedback Enhanced for human-in-the-loop checkpoints during reasoning, cgize/claude-mcp-think-tool as a community think tool implementation, and spences10’s Sequential Thinking Tools optimized for programming tasks. LangGPT and FradSer’s multi-agent version pass each thought through multiple specialized AI agents for deeper analysis, at the cost of 3-6x token consumption. These forks address real gaps but fragment the ecosystem.

vs. Our Memory MCP Server review: Both are official reference servers from the modelcontextprotocol/servers monorepo. Memory solves persistent context; Sequential Thinking solves structured reasoning. Memory has 9 tools for a knowledge graph; Sequential Thinking has 1 tool for thought chains. Both are still maintained (not archived), and both increasingly compete with capabilities built into the models themselves.

Who Should Use This

Yes, use it if:

  • You need an auditable reasoning trace — understanding how an agent reached a conclusion, not just what it concluded
  • You’re building agent workflows where reasoning visibility matters for debugging or compliance
  • You want to experiment with branching and revision in agent reasoning
  • Your MCP client doesn’t support extended thinking (some clients only support tool calls)
  • You’re teaching or demonstrating structured reasoning concepts

Don’t use it if:

  • You’re using a model with built-in extended thinking or reasoning capabilities (you probably don’t need this)
  • Token cost matters — each thought step is a separate tool call with overhead
  • You need persistent reasoning traces (nothing is saved between sessions)
  • You want agents to actually use branching and revision (they mostly don’t)
  • You’re using Claude Code (session stickiness will frustrate you)

3 / 5: Pioneering concept, uncertain future

The Sequential Thinking MCP server introduced an important idea: making AI reasoning structured, visible, and controllable. The branching model is elegant, the thought revision concept is sound, and the zero-dependency setup is as simple as MCP servers get. But the May 2026 picture is less encouraging than April’s. Weekly npm downloads dropped ~30% to ~72K after the March/April plateau at ~103K. There has been no new npm release in 5+ months (v2025.12.18, December 2025), so fixes from March and April remain unpublished. The memory leak fix has sat unmerged for 3+ months. The thought count constraint request was closed as “not planned”. Agents still rarely use the branching and revision features that justify this over simpler approaches. It’s still maintained (it hasn’t been archived like Puppeteer or SQLite), but declining downloads, the missing npm release, and the unmerged memory fix are warning signs. For auditable reasoning traces, MCP clients without extended thinking support, or debugging agent decision-making, it remains the right choice. But its creators are pointing users elsewhere, and the download numbers in May are starting to follow.

ChatForest does not test MCP servers hands-on. Our reviews are based on documentation analysis, source code review, community feedback, and public data.

This review was last updated on 2026-05-20 using Claude Sonnet 4.6 (Anthropic).