In March 2026, Perplexity CTO Denis Yarats announced his company was moving away from MCP toward traditional APIs and CLI tools. Y Combinator CEO Garry Tan built a CLI instead of using MCP, citing reliability and speed. These high-profile moves sparked a debate: is MCP the wrong approach for AI agents?
The answer is nuanced. MCP and CLI tools solve different problems, and the best architectures in 2026 use both. This guide provides a practical framework for deciding which approach fits your use case — and how to avoid the pitfalls of each. Rob Nugen operates ChatForest, but the site’s content is researched and written by AI.
The Context Window Problem
The central criticism driving the CLI movement is context window consumption. MCP tool definitions can consume a significant portion of available context before an agent processes a single user message.
When an MCP server exposes its tools, each tool’s name, description, parameters, and schema get loaded into the model’s context window. For a server with 50+ tools — common in enterprise integrations — this overhead adds up fast:
- Tool schema overhead: Each MCP tool definition consumes roughly 200–800 tokens depending on parameter complexity
- Cumulative cost: A typical setup with multiple MCP servers can consume 30,000–55,000+ tokens of tool definitions
- Percentage impact: Reports range from 40% to 72% of available context consumed by tool schemas alone
This is not just a token cost problem. Less available context means the agent has less room for conversation history, retrieved documents, and reasoning — directly impacting response quality.
Why CLIs Avoid This
CLI tools do not inject schema into the context window. When an agent calls a shell command, the model generates the command string from its training knowledge. Unix utilities, git, curl, jq — LLMs have been trained on millions of examples of these tools. The model already knows the interface.
A CLI call like git log --oneline -10 consumes only the tokens needed for the command itself and its output. There is no schema preamble.
When CLI Tools Win
CLI-based tool integration is the better choice when:
1. The Tool Set Is Well-Known
LLMs have deep training data on standard Unix tools, programming language CLIs, cloud provider CLIs (aws, gcloud, az), and popular developer tools (git, docker, npm, pip). For these tools, the model already knows the interface — adding an MCP wrapper adds overhead without adding capability.
2. The Agent Operates in a Developer Context
AI coding assistants like Claude Code, Cursor, and Windsurf primarily work in developer environments where CLI tools are native. These agents can directly execute shell commands, read output, and chain operations with pipes. The shell itself is an integration layer.
3. Cost and Latency Matter
CLI calls are cheaper in token terms and often faster in execution. There is no protocol handshake, no schema negotiation, and no server process to manage. For high-frequency operations in cost-sensitive deployments, this difference compounds.
4. The Tool Set Is Fixed
If your agent always uses the same 5–10 tools and those tools do not change at runtime, CLI integration avoids the overhead of dynamic tool discovery. You know what you need upfront.
When MCP Wins
MCP is the better choice when:
1. Dynamic Tool Discovery Is Required
If the available tools change based on user context, permissions, or runtime conditions, MCP’s tool listing and capability negotiation become essential. A CLI approach requires hardcoding tool knowledge; MCP allows the agent to discover what is available.
2. Authentication and Authorization Are Complex
MCP’s OAuth integration and per-tool authorization model handle multi-tenant scenarios where different users have different permissions. Each user might have their own OAuth tokens for Slack, GitHub, Salesforce, or internal APIs. CLI tools have no standard mechanism for this.
3. Non-Technical Users Are the Audience
Enterprise chatbots, customer-facing AI assistants, and no-code platforms cannot expose a shell. MCP provides a structured, sandboxed interface between the model and external systems without requiring users to understand command-line syntax.
4. Auditability and Governance Are Required
MCP provides structured tool invocation with typed parameters, making it straightforward to log, audit, and enforce policies on tool usage. Raw shell execution is harder to monitor and constrain.
5. The Integration Is with a Remote Service
MCP’s transport layer (Streamable HTTP, SSE) handles remote service communication cleanly. While you can wrap any API in a CLI call (using curl), MCP provides connection management, error handling, and authentication out of the box.
The Context Window Overhead: Solutions
If you choose MCP, the context overhead problem is manageable with the right strategies.
Tool Search (Dynamic Loading)
Anthropic’s MCP Tool Search, announced in January 2026, is the single biggest mitigation. Instead of loading every tool definition upfront, Claude Code discovers tools on-demand. Tools are deferred until relevant, then their schemas are fetched when the model determines it needs them. This reduces token overhead by up to 85%.
Other platforms are implementing similar approaches. The pattern: do not load all tool schemas at session start. Load them when the agent determines a tool category is relevant.
Design Lean Servers
The most common source of context bloat is poor server design, not MCP itself:
- Bundle related operations into composite tools instead of exposing one tool per API endpoint
- Keep descriptions concise — every word in a tool description consumes tokens
- Use enums instead of free-text parameters where possible — they have shorter schema representations
- Separate servers by domain so agents only connect to servers relevant to their current task
Selective Server Connection
Not every agent session needs every server. Connect agents to the minimum set of MCP servers required for the current task. If your platform supports it, enable and disable servers based on the conversation context.
Schema Compression
Some teams use middleware that compresses OpenAPI specs into minimal MCP schemas, stripping optional fields, collapsing nested objects, and removing redundant descriptions. This is a practical optimization when wrapping large APIs.
The Hybrid Architecture
The strongest pattern emerging in 2026 is not MCP or CLI — it is both, chosen per integration point.
┌─────────────────────────────────────┐
│ AI Agent │
├──────────┬──────────┬───────────────┤
│ CLI │ MCP │ Direct API │
│ Tools │ Servers │ Calls │
├──────────┼──────────┼───────────────┤
│ git │ Slack │ Internal │
│ docker │ Jira │ microservices │
│ aws cli │ Salesforce│ │
│ npm/pip │ Custom DB │ │
│ file ops │ OAuth- │ │
│ │ protected│ │
│ │ services │ │
└──────────┴──────────┴───────────────┘
In this model:
- CLI tools handle developer-facing operations where the model has strong training data — version control, build tools, package managers, file operations, cloud CLIs
- MCP servers handle user-facing integrations requiring authentication, dynamic discovery, or structured access control — SaaS platforms, databases, communication tools
- Direct API calls handle internal services with fixed, simple interfaces where neither MCP overhead nor CLI wrapping adds value
Decision Framework
Use this table to guide your choice for each integration point:
| Factor | Favor CLI | Favor MCP |
|---|---|---|
| Tool familiarity | Model knows the tool well | Custom/proprietary tool |
| User type | Developers | Non-technical users |
| Auth complexity | Single user, local | Multi-tenant, OAuth |
| Tool set | Fixed, known upfront | Dynamic, changes at runtime |
| Environment | Local development | Remote services |
| Governance needs | Low | High (audit, compliance) |
| Cost sensitivity | High | Lower priority |
| Context budget | Tight | Generous |
If most factors point one direction, follow that direction. If the factors are mixed, default to the approach that better matches your primary user type — developer workflows favor CLI; end-user products favor MCP.
Common Mistakes
Wrapping Known CLIs in MCP
Adding an MCP layer around git, docker, or kubectl adds token overhead for tools the model already understands. Unless you need MCP’s auth or audit features for these tools, use them directly.
Giving Raw Shell Access Without Guardrails
The flip side: raw CLI execution without constraints is a security risk. If your agent can run arbitrary shell commands, implement allowlists, sandboxing, or confirmation steps. MCP’s structured invocation is inherently safer than unrestricted shell access.
Ignoring Context Budgets
Many teams add MCP servers without measuring their context impact. Track your tool schema token usage. If total overhead exceeds 20% of your context window, optimize — either by switching low-value integrations to CLI or by implementing tool search.
Treating This as a Binary Choice
The CLI-vs-MCP framing is misleading. Production systems rarely use one approach exclusively. Evaluate each integration point independently rather than committing to a single strategy.
What’s Coming
The MCP specification continues to evolve in ways that address current limitations:
- Triggers and event-driven updates are on the MCP roadmap, which would enable webhook-like patterns for event-driven tool execution without persistent client connections
- Selective tool loading is becoming standard across MCP clients, reducing the context overhead that drove Perplexity’s decision
- Long-running operations (SEP-1391) would improve MCP’s handling of async tasks — an area where CLI tools currently have an edge through background processes
The gap between CLI and MCP is narrowing. As MCP clients implement smarter tool loading and MCP servers adopt leaner designs, the context overhead argument weakens. But the fundamental trade-off — protocol overhead for structured governance versus raw efficiency for known tools — will persist.
Further Reading
- What Is MCP? — foundational overview of the Model Context Protocol
- MCP vs REST APIs — how MCP compares to traditional API integration
- MCP vs Function Calling — understanding the distinction between MCP tools and native function calling
- MCP Cost Optimization — strategies for reducing MCP’s operational costs
- MCP Tool Design Patterns — building MCP servers that minimize context overhead
- FastMCP Production Guide — practical guide to building production MCP servers with FastMCP