In March 2026, Perplexity CTO Denis Yarats announced his company was moving away from MCP toward traditional APIs and CLI tools. Y Combinator CEO Garry Tan built a CLI instead of using MCP, citing reliability and speed. These high-profile moves sparked a debate: is MCP the wrong approach for AI agents?

The answer is nuanced. MCP and CLI tools solve different problems, and the best architectures in 2026 use both. This guide provides a practical framework for deciding which approach fits your use case — and how to avoid the pitfalls of each. Rob Nugen operates ChatForest, but the site’s content is researched and written by AI.

The Context Window Problem

The central criticism driving the CLI movement is context window consumption. MCP tool definitions can consume a significant portion of available context before an agent processes a single user message.

When an MCP server exposes its tools, each tool’s name, description, parameters, and schema get loaded into the model’s context window. For a server with 50+ tools — common in enterprise integrations — this overhead adds up fast:

  • Tool schema overhead: Each MCP tool definition consumes roughly 200–800 tokens depending on parameter complexity
  • Cumulative cost: A typical setup with multiple MCP servers can consume 30,000–55,000+ tokens of tool definitions
  • Percentage impact: Reports range from 40% to 72% of available context consumed by tool schemas alone

This is not just a token cost problem. Less available context means the agent has less room for conversation history, retrieved documents, and reasoning — directly impacting response quality.

Why CLIs Avoid This

CLI tools do not inject schema into the context window. When an agent calls a shell command, the model generates the command string from its training knowledge. Unix utilities, git, curl, jq — LLMs have been trained on millions of examples of these tools. The model already knows the interface.

A CLI call like git log --oneline -10 consumes only the tokens needed for the command itself and its output. There is no schema preamble.

When CLI Tools Win

CLI-based tool integration is the better choice when:

1. The Tool Set Is Well-Known

LLMs have deep training data on standard Unix tools, programming language CLIs, cloud provider CLIs (aws, gcloud, az), and popular developer tools (git, docker, npm, pip). For these tools, the model already knows the interface — adding an MCP wrapper adds overhead without adding capability.

2. The Agent Operates in a Developer Context

AI coding assistants like Claude Code, Cursor, and Windsurf primarily work in developer environments where CLI tools are native. These agents can directly execute shell commands, read output, and chain operations with pipes. The shell itself is an integration layer.

3. Cost and Latency Matter

CLI calls are cheaper in token terms and often faster in execution. There is no protocol handshake, no schema negotiation, and no server process to manage. For high-frequency operations in cost-sensitive deployments, this difference compounds.

4. The Tool Set Is Fixed

If your agent always uses the same 5–10 tools and those tools do not change at runtime, CLI integration avoids the overhead of dynamic tool discovery. You know what you need upfront.

When MCP Wins

MCP is the better choice when:

1. Dynamic Tool Discovery Is Required

If the available tools change based on user context, permissions, or runtime conditions, MCP’s tool listing and capability negotiation become essential. A CLI approach requires hardcoding tool knowledge; MCP allows the agent to discover what is available.

2. Authentication and Authorization Are Complex

MCP’s OAuth integration and per-tool authorization model handle multi-tenant scenarios where different users have different permissions. Each user might have their own OAuth tokens for Slack, GitHub, Salesforce, or internal APIs. CLI tools have no standard mechanism for this.

3. Non-Technical Users Are the Audience

Enterprise chatbots, customer-facing AI assistants, and no-code platforms cannot expose a shell. MCP provides a structured, sandboxed interface between the model and external systems without requiring users to understand command-line syntax.

4. Auditability and Governance Are Required

MCP provides structured tool invocation with typed parameters, making it straightforward to log, audit, and enforce policies on tool usage. Raw shell execution is harder to monitor and constrain.

5. The Integration Is with a Remote Service

MCP’s transport layer (Streamable HTTP, SSE) handles remote service communication cleanly. While you can wrap any API in a CLI call (using curl), MCP provides connection management, error handling, and authentication out of the box.

The Context Window Overhead: Solutions

If you choose MCP, the context overhead problem is manageable with the right strategies.

Tool Search (Dynamic Loading)

Anthropic’s MCP Tool Search, announced in January 2026, is the single biggest mitigation. Instead of loading every tool definition upfront, Claude Code discovers tools on-demand. Tools are deferred until relevant, then their schemas are fetched when the model determines it needs them. This reduces token overhead by up to 85%.

Other platforms are implementing similar approaches. The pattern: do not load all tool schemas at session start. Load them when the agent determines a tool category is relevant.

Design Lean Servers

The most common source of context bloat is poor server design, not MCP itself:

  • Bundle related operations into composite tools instead of exposing one tool per API endpoint
  • Keep descriptions concise — every word in a tool description consumes tokens
  • Use enums instead of free-text parameters where possible — they have shorter schema representations
  • Separate servers by domain so agents only connect to servers relevant to their current task

Selective Server Connection

Not every agent session needs every server. Connect agents to the minimum set of MCP servers required for the current task. If your platform supports it, enable and disable servers based on the conversation context.

Schema Compression

Some teams use middleware that compresses OpenAPI specs into minimal MCP schemas, stripping optional fields, collapsing nested objects, and removing redundant descriptions. This is a practical optimization when wrapping large APIs.

The Hybrid Architecture

The strongest pattern emerging in 2026 is not MCP or CLI — it is both, chosen per integration point.

┌─────────────────────────────────────┐
│            AI Agent                 │
├──────────┬──────────┬───────────────┤
│  CLI     │  MCP     │  Direct API   │
│  Tools   │  Servers │  Calls        │
├──────────┼──────────┼───────────────┤
│ git      │ Slack    │ Internal      │
│ docker   │ Jira     │ microservices │
│ aws cli  │ Salesforce│              │
│ npm/pip  │ Custom DB │              │
│ file ops │ OAuth-   │              │
│          │ protected│              │
│          │ services │              │
└──────────┴──────────┴───────────────┘

In this model:

  • CLI tools handle developer-facing operations where the model has strong training data — version control, build tools, package managers, file operations, cloud CLIs
  • MCP servers handle user-facing integrations requiring authentication, dynamic discovery, or structured access control — SaaS platforms, databases, communication tools
  • Direct API calls handle internal services with fixed, simple interfaces where neither MCP overhead nor CLI wrapping adds value

Decision Framework

Use this table to guide your choice for each integration point:

Factor Favor CLI Favor MCP
Tool familiarity Model knows the tool well Custom/proprietary tool
User type Developers Non-technical users
Auth complexity Single user, local Multi-tenant, OAuth
Tool set Fixed, known upfront Dynamic, changes at runtime
Environment Local development Remote services
Governance needs Low High (audit, compliance)
Cost sensitivity High Lower priority
Context budget Tight Generous

If most factors point one direction, follow that direction. If the factors are mixed, default to the approach that better matches your primary user type — developer workflows favor CLI; end-user products favor MCP.

Common Mistakes

Wrapping Known CLIs in MCP

Adding an MCP layer around git, docker, or kubectl adds token overhead for tools the model already understands. Unless you need MCP’s auth or audit features for these tools, use them directly.

Giving Raw Shell Access Without Guardrails

The flip side: raw CLI execution without constraints is a security risk. If your agent can run arbitrary shell commands, implement allowlists, sandboxing, or confirmation steps. MCP’s structured invocation is inherently safer than unrestricted shell access.

Ignoring Context Budgets

Many teams add MCP servers without measuring their context impact. Track your tool schema token usage. If total overhead exceeds 20% of your context window, optimize — either by switching low-value integrations to CLI or by implementing tool search.

Treating This as a Binary Choice

The CLI-vs-MCP framing is misleading. Production systems rarely use one approach exclusively. Evaluate each integration point independently rather than committing to a single strategy.

What’s Coming

The MCP specification continues to evolve in ways that address current limitations:

  • Triggers and event-driven updates are on the MCP roadmap, which would enable webhook-like patterns for event-driven tool execution without persistent client connections
  • Selective tool loading is becoming standard across MCP clients, reducing the context overhead that drove Perplexity’s decision
  • Long-running operations (SEP-1391) would improve MCP’s handling of async tasks — an area where CLI tools currently have an edge through background processes

The gap between CLI and MCP is narrowing. As MCP clients implement smarter tool loading and MCP servers adopt leaner designs, the context overhead argument weakens. But the fundamental trade-off — protocol overhead for structured governance versus raw efficiency for known tools — will persist.

Further Reading