Most MCP deployments start the same way: one AI client, one MCP server, one connection. Then the catalog grows. Three servers, then ten, then thirty. Each with its own auth model, transport config, and tool namespace. Each connection managed individually by every client. Each team maintaining its own copy of the configuration. This is the scaling problem IBM ContextForge was built to solve.
ContextForge is an open source AI gateway, registry, and proxy that sits in front of any MCP server, A2A agent, or REST/gRPC API and exposes a single unified endpoint to AI clients. It adds centralized governance, tool discovery, access control, rate limiting, observability, and multi-tenancy. As of v1.0.0 GA (May 2026), it is the highest-starred IBM open source project in the MCP ecosystem at roughly 3,655 GitHub stars, and one of the most architecturally complete MCP gateways available.
This is not an IBM product connector. It does not require IBM Cloud, watsonx, or any IBM software. ContextForge is protocol-agnostic infrastructure for any MCP deployment.
At a Glance
| Repo | IBM/mcp-context-forge |
| Stars | ~3,655 |
| License | Apache-2.0 |
| Language | Python (with Rust-powered JSON components) |
| Install | pip install mcp-contextforge-gateway |
| Docs | ibm.github.io/mcp-context-forge |
| Latest version | v1.0.0 GA (May 2026) |
| First stable | v0.5.0 (August 2025) |
| Multi-arch | x86_64, IBM Z (s390x), IBM POWER (ppc64le) |
What ContextForge Does
ContextForge operates as three gateways in one:
Tools Gateway — Aggregates multiple MCP servers behind a single endpoint. AI clients configure one connection; ContextForge manages upstream MCP connections, handles tool routing, and applies policies. REST and gRPC APIs are translated to MCP tools automatically, so existing services become callable by any MCP-compatible agent without code changes. TOON compression reduces LLM token usage by 30-70% by compressing tool schemas and descriptions before sending them to the model context.
Agent Gateway — Routes agent-to-agent (A2A protocol) communication alongside MCP tool calls. Supports OpenAI-compatible and Anthropic agent routing, making ContextForge a neutral protocol bridge across agent frameworks.
API Gateway — Provides rate limiting, authentication, retries, and reverse proxy for REST services, turning ContextForge into the policy enforcement layer for all AI agent-to-service traffic in an organization.
These three modes can run simultaneously or independently, depending on deployment requirements.
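To make the REST-to-MCP translation concrete, here is a minimal sketch of how an existing REST operation could be described as an MCP tool definition (`name`, `description`, `inputSchema`). The `RestEndpoint` class, the `rest_to_mcp_tool` function, and the naming scheme are hypothetical illustrations, not ContextForge's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class RestEndpoint:
    """Hypothetical description of an existing REST operation."""
    method: str
    path: str
    summary: str
    # Parameter name -> JSON Schema type (assumed flat, all required).
    params: dict[str, str] = field(default_factory=dict)


def rest_to_mcp_tool(endpoint: RestEndpoint) -> dict:
    """Build an MCP-style tool definition from a REST endpoint description."""
    # Derive a tool name such as "get_orders_order_id" from method and path.
    parts = [p.strip("{}") for p in endpoint.path.strip("/").split("/") if p]
    name = "_".join([endpoint.method.lower(), *parts])
    return {
        "name": name,
        "description": endpoint.summary,
        "inputSchema": {
            "type": "object",
            "properties": {k: {"type": t} for k, t in endpoint.params.items()},
            "required": sorted(endpoint.params),
        },
    }


tool = rest_to_mcp_tool(
    RestEndpoint("GET", "/orders/{order_id}", "Fetch one order", {"order_id": "string"})
)
print(tool["name"])  # get_orders_order_id
```

A real gateway would also need to carry the upstream URL, authentication, and response mapping; the sketch covers only the schema side of the translation.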
Key Features
Virtual Servers
Virtual servers are the core organizational primitive in ContextForge. Each virtual server exposes a curated subset of tools from the upstream catalog with its own access policies and API keys. A typical enterprise deployment might have:
- A Developer virtual server — all tools, for authenticated internal users
- A Customer service virtual server — only CRM and order management tools
- A Read-only audit virtual server — all tools, but write operations blocked by policy
Clients connect to the virtual server endpoint, not to individual MCP servers. Tool namespacing handles collisions when multiple upstream servers expose similarly named tools.
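The idea of a curated, collision-free tool subset can be sketched in a few lines. This is an illustration only: the `Tool` class, the `build_virtual_server` function, and the `server.tool` naming convention are assumptions made for the example, and ContextForge's real data model and separator may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    server: str  # upstream MCP server the tool comes from
    name: str    # tool name as exposed by that server

    @property
    def qualified_name(self) -> str:
        # Prefixing with the server name is one way to avoid collisions.
        return f"{self.server}.{self.name}"


def build_virtual_server(catalog: list[Tool], allowed_servers: set[str]) -> dict[str, Tool]:
    """Return the curated subset, keyed by qualified name."""
    return {t.qualified_name: t for t in catalog if t.server in allowed_servers}


catalog = [
    Tool("crm", "search"),
    Tool("orders", "search"),  # same tool name, different upstream server
    Tool("billing", "refund"),
]
customer_service = build_virtual_server(catalog, {"crm", "orders"})
print(sorted(customer_service))  # ['crm.search', 'orders.search']
```

Both `search` tools survive because the key includes the upstream server, while the `billing` tool is simply absent from this virtual server's view.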
Multi-Tenant RBAC
Full multi-tenancy has been available since v0.7.0. The system supports teams, email authentication, role-based access control (RBAC) for Admin UI and API routes, per-virtual-server API keys, and resource visibility controls. Different groups in an organization get isolated tool catalogs with their own policy boundaries, while the gateway provides shared observability.
TOON Compression
TOON (Token-Oriented Object Notation) compression reduces the token footprint of tool schemas sent to the LLM context window. The 30-70% reduction translates directly into lower API costs and less context-window pressure for agents that load large tool catalogs. No other major MCP gateway has an equivalent feature.
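The general idea behind compact encodings of this kind is to state field names once instead of repeating them in every record. The sketch below is a simplified tabular encoding written for illustration; it is not the TOON specification and not ContextForge's implementation, and `encode_table` is a hypothetical helper. Character counts are used as a rough proxy for tokens.

```python
import json


def encode_table(key: str, rows: list[dict]) -> str:
    """Encode a uniform list of flat dicts as one header line plus one row per dict."""
    fields = list(rows[0])
    if any(list(r) != fields for r in rows):
        raise ValueError("rows must share the same fields in the same order")
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    # Values are written as-is; a real encoder would need to escape commas.
    body = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header, *body])


rows = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "viewer"},
]
as_json = json.dumps({"users": rows})
as_table = encode_table("users", rows)
print(as_table)
print(len(as_json), "->", len(as_table), "characters")
```

The saving grows with the number of rows, because the per-record key overhead of JSON disappears. Actual token savings depend on the tokenizer and on how uniform the data is.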
40+ Plugins
The plugin ecosystem covers:
- Security — PII detection, content filtering, prompt injection detection
- Rate limiting — per-client, per-tool, per-virtual-server quotas
- Protocol translation — REST-to-MCP, gRPC-to-MCP, custom transports
- Caching — L1 (in-memory) and L2 (Redis) for tool call deduplication and response caching
- Pre/post hooks — custom logic at every stage of the tool call pipeline
- Authentication — JWT, OAuth, API key, and custom auth adapters
Plugins apply as pre- and post-hooks on every request, turning the gateway into a policy engine that operates across all MCP traffic centrally rather than per-server.
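A pre/post hook pipeline of the kind described above can be sketched as follows. All names here (`PolicyError`, `block_writes`, `redact_email`, `call_tool`) are hypothetical and chosen for the example; the actual ContextForge plugin interface is not shown in this article and may look quite different.

```python
from typing import Any, Callable

Request = dict[str, Any]
PreHook = Callable[[Request], Request]
PostHook = Callable[[Request, Any], Any]


class PolicyError(Exception):
    """Raised by a pre-hook to reject a request before it reaches the tool."""


def block_writes(request: Request) -> Request:
    # Hypothetical policy: reject tools whose name marks them as mutating.
    if request["tool"].startswith(("create_", "update_", "delete_")):
        raise PolicyError(f"write tool blocked: {request['tool']}")
    return request


def redact_email(request: Request, result: Any) -> Any:
    # Hypothetical post-hook: a crude stand-in for PII filtering.
    if isinstance(result, dict) and "email" in result:
        return {**result, "email": "[redacted]"}
    return result


def call_tool(
    request: Request,
    tool_fn: Callable[[Request], Any],
    pre_hooks: list[PreHook],
    post_hooks: list[PostHook],
) -> Any:
    """Run pre-hooks, then the tool, then post-hooks, in order."""
    for hook in pre_hooks:
        request = hook(request)
    result = tool_fn(request)
    for hook in post_hooks:
        result = hook(request, result)
    return result


def get_customer(request: Request) -> dict:
    # Stand-in for an upstream MCP tool call.
    return {"id": request["args"]["id"], "email": "a@example.com"}


result = call_tool(
    {"tool": "get_customer", "args": {"id": 7}},
    get_customer,
    pre_hooks=[block_writes],
    post_hooks=[redact_email],
)
print(result)  # {'id': 7, 'email': '[redacted]'}
```

Because the hooks live in the gateway rather than in each server, the same `block_writes` policy would apply to every upstream tool without touching the servers themselves.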
Observability
OpenTelemetry tracing is built in with support for Phoenix (for LLM-native traces), Jaeger, Zipkin, and any OTLP-compatible backend. Every tool call generates a trace span with latency, status, and policy decision data. For organizations already running distributed tracing infrastructure, ContextForge plugs directly into the existing observability stack.
Compression
Network response compression is automatic: Brotli (best compression), Zstd (fastest), and GZip (universal fallback), negotiated per client via Accept-Encoding. Text-based responses (JSON, tool output) compress 30-70% in transit, separate from TOON’s LLM-side compression.
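Per-client negotiation via `Accept-Encoding` can be sketched with the standard library alone. This is a simplified illustration, not ContextForge's code: the function names are hypothetical, and the server preference order (Brotli, then Zstd, then GZip) is an assumption based on the order in which the article lists them.

```python
def parse_accept_encoding(header: str) -> dict[str, float]:
    """Parse an Accept-Encoding header into {coding: q-value}."""
    weights: dict[str, float] = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        weights[coding.strip().lower()] = q
    return weights


# Assumed server-side preference order.
SERVER_PREFERENCE = ("br", "zstd", "gzip")


def choose_encoding(header: str) -> str:
    """Pick the best mutually supported coding, or "identity" for no compression."""
    weights = parse_accept_encoding(header)
    wildcard = weights.get("*", 0.0)
    best, best_q = "identity", 0.0
    for coding in SERVER_PREFERENCE:
        q = weights.get(coding, wildcard)
        if q > best_q:  # strict ">" keeps the server's order on ties
            best, best_q = coding, q
    return best


print(choose_encoding("gzip, br"))              # br
print(choose_encoding("gzip;q=1.0, br;q=0.5"))  # gzip
print(choose_encoding(""))                      # identity
```

The client's q-values take priority, and the server's preference order only breaks ties, which is why a client that sends plain `gzip, br` gets Brotli.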
Deployment
ContextForge ships four packaging options:
| Method | Use case |
|---|---|
| pip install mcp-contextforge-gateway | Development, quick start |
| Docker / container images | Staging, single-instance production |
| Binaries | Air-gapped environments |
| Helm charts | Kubernetes HA production |
Database backends: SQLite for development and single-instance deployments; PostgreSQL (with PgBouncer connection pooling) for production.
Kubernetes HA: Production deployments run multiple ContextForge instances with Redis-backed state sharing, federation, load balancing, and automatic failover. The Helm chart includes auto-scaling configuration for handling variable tool call load.
Air-gapped: Since v1.0.0-BETA-1, ContextForge supports fully air-gapped deployments with no external connectivity required — critical for regulated industries and classified environments.
Multi-architecture: Docker images ship for x86_64, IBM Z (s390x), and IBM POWER (ppc64le) — covering both cloud-native deployments and IBM mainframe/POWER environments that have no equivalent support in competing gateways.
Performance
Performance engineering became a priority from v1.0.0-BETA-2 onward:
- N+1 query elimination — database query patterns rebuilt for bulk operations
- PgBouncer connection pooling — eliminates per-request PostgreSQL connection overhead
- L1/L2 caching — in-memory cache for hot tool calls, Redis for shared cache across instances
- Granian HTTP server — Rust-powered ASGI server replacing Uvicorn in production
- orjson serialization — Rust-powered JSON parsing and serialization delivering 5-6x faster serialization and 1.5-2x faster deserialization versus Python’s standard library
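The L1/L2 caching pattern from the list above can be sketched as a small two-tier lookup. This is an illustration under stated assumptions: `TwoTierCache` is a hypothetical class, a plain dict stands in for Redis, and the TTL handling is simplified (only the L1 tier expires).

```python
import time
from typing import Any, Callable


class TwoTierCache:
    """Per-instance in-memory cache (L1) in front of a shared store (L2)."""

    def __init__(self, l2: dict, l1_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic):
        self._l1: dict[str, tuple[float, Any]] = {}  # key -> (expiry, value)
        self._l2 = l2          # stand-in for Redis shared across instances
        self._l1_ttl = l1_ttl
        self._clock = clock

    def get(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._l1.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]              # L1 hit: no network round trip
        if key in self._l2:
            value = self._l2[key]      # L2 hit: shared across instances
        else:
            value = loader()           # miss in both tiers: do the real work
            self._l2[key] = value
        self._l1[key] = (now + self._l1_ttl, value)
        return value


shared: dict = {}
calls: list[int] = []


def slow_lookup() -> dict:
    calls.append(1)
    return {"rate": 1.08}


a = TwoTierCache(shared)
b = TwoTierCache(shared)
a.get("fx:EURUSD", slow_lookup)  # miss in both tiers, runs the loader
a.get("fx:EURUSD", slow_lookup)  # L1 hit on instance a
b.get("fx:EURUSD", slow_lookup)  # L1 miss on instance b, L2 hit
print(len(calls))  # 1
```

The loader runs once even though two gateway instances served three requests, which is the deduplication effect the shared tier is meant to provide.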
The v1.0.0-BETA-2 release notes cited “100+ performance optimizations” and 80+ bug fixes. The pattern: early releases (v0.4.0–v0.7.0) built enterprise features; later releases (the v1.0.0-BETA series) hardened them for production throughput.
Version History
| Version | Date | Highlights |
|---|---|---|
| v0.5.0 | August 2025 | Enterprise auth, configuration management, observability |
| v0.7.0 | Late 2025 | Multi-tenant RBAC, teams, email auth |
| v1.0.0-BETA-1 | December 16, 2025 | Multi-arch containers, gRPC-to-MCP, air-gapped deployment |
| v1.0.0-BETA-2 | January 24, 2026 | 100+ performance optimizations, PgBouncer, Granian, orjson |
| v1.0.0-RC-3 | Early 2026 | Auth/RBAC hardening, experimental Rust MCP runtime, s390x/ppc64le |
| v1.0.0 GA | May 2026 | Production release — CVE tracking begins |
CVE identifiers are formally assigned starting with v1.0.0 GA, signaling that IBM treats this as enterprise software with a security support commitment.
Context: MCP Gateway Landscape
ContextForge is not alone in the MCP gateway space, but it occupies a distinctive position:
| Gateway | Stars | Key differentiator |
|---|---|---|
| IBM ContextForge | ~3,655 | TOON compression, A2A+MCP+REST+gRPC, 40+ plugins, IBM Z/POWER |
| AgentGateway (Solo.io / Linux Foundation) | ~2,457 | Covered in API Gateway roundup |
| Microsoft Kubernetes MCP Gateway | ~595 | Azure-native, Kubernetes-native |
| Kong Agent Gateway | Enterprise | MCP+A2A+LLM unified governance, covered in API Gateway roundup |
| Higress (Alibaba) | ~8,300 | Native MCP server hosting in a production API gateway |
ContextForge is the only gateway in this group that supports A2A agent routing alongside MCP federation in a vendor-neutral, self-hosted package. Higress is larger but serves a different pattern (hosting MCP servers inside a traditional API gateway rather than federating them). AgentGateway has fewer features for the multi-tenant enterprise use case.
Limitations
Python runtime. Most MCP infrastructure tools use TypeScript or Go for lower overhead. Python carries a higher memory footprint and startup cost per instance, partially mitigated by the Granian HTTP server and orjson. The experimental Rust MCP runtime (introduced in RC-3) suggests IBM is aware of this, but it has not yet shipped in a stable release.
Complexity budget. ContextForge solves a coordination problem that only appears at scale. For teams running 3-5 MCP servers, the gateway adds operational complexity (another service to deploy, monitor, and maintain) without proportional benefit. The tool is well-suited to organizations with 10+ MCP servers, multiple teams, and governance requirements.
IBM branding vs. content. Although ContextForge is protocol-agnostic, the IBM association creates perception friction for non-IBM shops. The documentation and workshop are both high quality, but the repo name and GitHub organization may cause teams to filter it out during evaluation.
Workshop dependency. The ContextForge Workshop is comprehensive and well-maintained, but it lives in a separate GitHub organization (contextforge-org), which makes it slightly harder to discover from the main repo.
Who Should Use This
Good fit:
- Organizations running 10+ MCP servers that need unified access control and discovery
- Teams with compliance requirements for audit logging and policy enforcement on all AI tool calls
- Enterprise environments needing multi-tenant isolation (different teams, different tool subsets)
- Deployments on IBM Z or POWER infrastructure
- Cost-sensitive deployments where TOON token compression meaningfully reduces LLM API bills
Not the right tool:
- Small deployments with 2-3 MCP servers and no governance requirements
- Organizations that need a hosted/managed gateway (ContextForge is self-hosted only)
- Teams without Kubernetes expertise (for production HA deployments)
Rating: 4.0/5
ContextForge is the most complete self-hosted MCP gateway in the open source ecosystem. TOON compression, A2A federation, 40+ plugins, and multi-architecture container support solve real enterprise problems with no direct equivalent in competing tools. The documentation is extensive, the workshop is hands-on, and the v1.0.0 GA release (with formal CVE tracking) marks this as production software, not a side project.
The deduction: Python runtime carries inherent overhead that matters at scale (partially mitigated, not solved), and the tool’s value proposition only materializes for organizations with significant MCP catalog complexity. For smaller deployments, simpler point solutions cost less operationally.
For enterprises already managing multi-team AI agent deployments with governance requirements, ContextForge is the clearest architectural fit in the MCP gateway category.
Related:
- IBM MCP Servers — IBM product connectors: watsonx.data, IBM i, QRadar, FileNet, and more
- API Gateway MCP Servers — Kong, Cloudflare, AgentGateway, and Microsoft’s MCP Gateway
- API Gateway & Management MCP Servers — Higress, Postman, OpenAPI-to-MCP tooling
- Agent Orchestration MCP Servers — A2A, CrewAI, LangChain MCP integrations