Vector database MCP servers let AI agents store embeddings, run semantic searches, manage collections, and build retrieval-augmented generation (RAG) pipelines — all through the Model Context Protocol. Instead of writing Python scripts to initialize clients, create indexes, and query vectors, your agent handles these operations conversationally. Part of our Databases MCP category.
This is the category that makes AI memory and knowledge retrieval work. Vector databases are the backbone of RAG, agent memory, semantic search, and recommendation systems. Every major vector database vendor now ships an official MCP server — and one (Weaviate) now embeds MCP directly into the database itself.
This review covers MCP servers for dedicated vector databases (Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Turbopuffer), vector-enabled traditional databases (Redis, pgvector for PostgreSQL, MongoDB Atlas), and RAG-focused servers that combine vector storage with document processing. For search engines with vector capabilities (Elasticsearch, OpenSearch, Meilisearch), see our Search Engine MCP Servers review. For individual deep-dives, see our reviews of Qdrant, Chroma, Pinecone, and Milvus.
The headline findings: Redis dominates adoption (4,200 stars) with official vector search plus full data management. Weaviate pioneers database-native MCP with a built-in server in v1.37. Qdrant leads dedicated vector DB adoption (1,400 stars) with configurable filters. Chroma offers the most tools (13) across four deployment modes. Milvus powers the category’s biggest project — zilliztech/claude-context at 9,800 stars. Turbopuffer and Pinecone Assistant fill key gaps. Every major vendor ships an official server — a maturity signal few MCP categories can match.
The Dedicated Vector Database Servers
Qdrant — Most Adopted Dedicated Vector DB Server
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| mcp-server-qdrant | 1,400 | 239+ | Python | Yes |
The most popular dedicated vector database MCP server. Qdrant’s server exposes 2 core tools: qdrant-store (save information with metadata) and qdrant-find (retrieve by semantic similarity), now with configurable filters on the find tool for more flexible search.
This is a deliberate design choice. The server positions itself as a semantic memory layer — giving AI agents persistent memory across conversations rather than full database management. It’s the only vector DB MCP server supporting all three transport protocols: stdio, SSE, and Streamable HTTP. The server is now represented by a QdrantMCPServer class that can be inherited to build project-oriented MCP servers — enabling code search, knowledge bases, or domain-specific memory without forking the project.
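For orientation, a typical client configuration for mcp-server-qdrant looks like the following (the collection name is illustrative, and the embedding model shown is the project's documented default; check the README for the full list of environment variables):

```json
{
  "mcpServers": {
    "qdrant": {
      "command": "uvx",
      "args": ["mcp-server-qdrant"],
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "COLLECTION_NAME": "agent-memory",
        "EMBEDDING_MODEL": "sentence-transformers/all-MiniLM-L6-v2"
      }
    }
  }
}
```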
Qdrant also ships mcp-for-docs — a proof-of-concept MCP server for curated documentation search, demonstrating how to build specialized read-only MCP servers on top of the Qdrant platform.
Best for: Agent memory, conversation persistence, teams that want the simplest possible setup. Qdrant itself has 23,000+ stars and is one of the most performant vector search engines available (Rust-based).
Limitation: If you need to manage collections, tune indexes, delete records, or do anything beyond store-and-find, you’ll need to use the Qdrant API directly.
Full review: Qdrant MCP Server — Rating: 3/5
Chroma — Most Comprehensive Tool Set
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| chroma-mcp | 540 | — | Python | Yes |
13 tools make this the most feature-complete vector database MCP server. Eight tools handle collection management (create, list, peek, info, count, modify, delete, fork) and five handle document operations (add, query, get, update, delete).
What sets Chroma apart is four deployment modes — ephemeral (in-memory for testing), persistent (local storage), self-hosted HTTP server, and Chroma Cloud — all configurable through the same MCP server. Plus support for six embedding providers (OpenAI, Cohere, Google, Jina, Voyage AI, Roboflow).
Best for: Teams that want full database management through their AI agent, developers prototyping RAG applications, workflows that need to create/modify/delete collections conversationally.
Limitation: The breadth of tools can overwhelm agents — LLMs sometimes pick the wrong tool when 13 are available. Chroma itself is best for small-to-medium scale (not billions of vectors).
Full review: Chroma MCP Server — Rating: 3.5/5
Milvus — Strongest Self-Hosted Option + Expanding Ecosystem
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| mcp-server-milvus | 230 | 63+ | Python | Yes |
| zilliz-mcp-server | — | — | Python | Yes |
| claude-context | 9,800 | — | TypeScript | Yes (Zilliz) |
12 tools with the unique strength of five search types: text search, vector search, hybrid search (text + vector combined), text similarity search, and filter-based queries. Full collection CRUD and data operations round out the feature set. Supports both stdio and SSE transport.
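Hybrid search fuses a keyword ranking (e.g. BM25) with a vector ranking. A common fusion method is Reciprocal Rank Fusion (RRF); here is a minimal stdlib sketch of the idea, with ranked lists as illustrative stand-ins for the text and vector result sets a database like Milvus would return:

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

# Stand-in result lists: ids ranked by a text (BM25) pass and a vector pass.
text_hits = ["doc3", "doc1", "doc7"]
vector_hits = ["doc1", "doc5", "doc3"]
fused = rrf_fuse([text_hits, vector_hits])
# doc1 and doc3 appear in both lists, so they outrank the single-list hits.
```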
Milvus is the most-starred open-source vector database (40,000+ GitHub stars) and powers production AI at NVIDIA, Salesforce, eBay, Airbnb, and DoorDash.
The Milvus/Zilliz ecosystem has expanded significantly. Zilliz now ships a separate zilliz-mcp-server for Zilliz Cloud management — listing projects and clusters, and creating free-tier Milvus clusters through MCP. More notably, Zilliz launched claude-context (9,800 stars) — a code search MCP server powered by Milvus that uses hybrid BM25 + dense vector search to make entire codebases available as context for coding agents. It achieves roughly 40% token reduction at equivalent retrieval quality, supports 14 programming languages, and ships as a monorepo with a VS Code extension alongside the MCP server. That star count makes claude-context the most popular MCP server in the entire vector database ecosystem.
Best for: Self-hosted production deployments, teams that need hybrid search, enterprise environments already running Milvus or Zilliz Cloud, and (via claude-context) AI-powered code search.
Limitation: More complex setup than Chroma or Qdrant. Text similarity search requires Milvus 2.6.0+. Documentation assumes familiarity with Milvus concepts (partitions, schemas, index types).
Full review: Milvus MCP Server — Rating: 3.5/5
Pinecone — Search Intelligence + Remote Hosted Assistant MCP
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| pinecone-mcp | — | — | TypeScript | Yes |
| assistant-mcp | — | — | TypeScript | Yes |
Pinecone now ships three MCP servers: the Developer MCP (local), the Assistant MCP (local), and the biggest addition, the Assistant MCP (remote). Every Pinecone Assistant now has a dedicated hosted MCP endpoint at https://<host>/mcp/assistants/<name> that requires zero infrastructure. Connect it directly to Claude Code, Cursor, or any MCP-enabled client with a single URL and API key.
The Developer MCP provides 9 tools with features no other vector DB MCP server offers: cascading search across multiple indexes simultaneously with deduplication, and built-in reranking of combined results. Pinecone’s integrated embedding means you send text in and get search results back — no external embedding provider configuration needed.
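To make "cascading search with deduplication" concrete, here is a toy stdlib sketch; the index contents and scoring are invented, and Pinecone performs all of this server-side with a learned reranker rather than a plain sort:

```python
def cascade_search(
    indexes: list[dict[str, float]], top_k: int = 3
) -> list[tuple[str, float]]:
    """Query several indexes, dedupe by id (keep best score), rank combined results."""
    best: dict[str, float] = {}
    for index in indexes:                  # each index maps doc id -> similarity score
        for doc_id, score in index.items():
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score       # dedupe: keep the highest score per id
    # Toy "reranking": a global sort; Pinecone applies a reranker model here.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# "faq-1" appears in both indexes; only its best score survives.
docs_index = {"faq-1": 0.91, "guide-2": 0.72}
tickets_index = {"faq-1": 0.85, "ticket-9": 0.88}
results = cascade_search([docs_index, tickets_index])
```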
The Assistant MCP exposes the Context API via MCP — delivering structured context as expanded chunks with relevancy scores and references.
Best for: Cloud-native teams using Pinecone, workflows that search across multiple knowledge bases, applications where search quality (reranking) matters most, and teams wanting zero-infrastructure remote MCP endpoints.
Limitation: Cloud-only — requires a Pinecone account and API key. No self-hosted option.
Full review: Pinecone MCP Server — Rating: 3/5
Weaviate — First Database-Native MCP (Built-In v1.37)
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| mcp-server-weaviate | 161+ | 43+ | Go | Yes |
| Weaviate v1.37 built-in MCP | — | — | Go (embedded) | Yes |
The biggest development in this category: Weaviate v1.37 (April 23, 2026) ships a built-in MCP server — the first vector database to embed MCP directly into the database itself. No separate server, no glue code, no additional processes. Enable it with a single environment variable (MCP_SERVER_ENABLED: 'true') and Weaviate exposes a Streamable HTTP endpoint at /v1/mcp on the same port as the REST API.
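In Docker Compose terms, enabling the built-in server might look like this (the image tag is illustrative; confirm the exact variable name and tag against the v1.37 release notes):

```yaml
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:1.37.0
    ports:
      - "8080:8080"   # REST API and the /v1/mcp Streamable HTTP endpoint share this port
    environment:
      MCP_SERVER_ENABLED: 'true'
```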
This built-in MCP server lets AI agents inspect collection schemas, run hybrid searches, and write data back into the Weaviate instance — all enforced by Weaviate’s standard authentication and authorization. It shifts Weaviate from a passive retrieval engine to active long-term memory for agentic workflows. Compatible with Claude Code, Claude Desktop, Cursor, VS Code, and any MCP-aware tool.
The built-in server is currently a preview feature — API and behavior may change in future releases. The standalone Go-based mcp-server-weaviate remains available for teams not yet on v1.37.
Weaviate also ships a separate Docs MCP Server for accessing Weaviate documentation through MCP.
Best for: Teams wanting the simplest possible vector DB + MCP integration (zero additional infrastructure with built-in server), Go-centric environments, workflows that need hybrid search with database-level auth enforcement.
Limitation: Built-in MCP is preview-only, and its API may change. The standalone server still has minimal tool coverage (insert and query only).
A community alternative exists: sajal2692/mcp-weaviate — a FastMCP-based Python server with semantic, keyword, and hybrid search, deployable via uvx.
LanceDB — Serverless Simplicity
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| lancedb-mcp-server | 23 | 6 | Python | Yes |
LanceDB’s serverless architecture means no infrastructure to manage — data lives on disk or cloud storage (S3, GCS, Azure) with no server process required. The MCP server provides three tools: ingest docs, retrieve docs, and get table details.
This is a reference implementation rather than a production-ready server, but it demonstrates LanceDB’s appeal: zero-config vector storage that works anywhere you have a filesystem. LanceDB uses the Lance columnar format for fast vector operations.
Best for: Prototyping, local development, serverless deployments, teams that want vector search without running a database server.
Limitation: Very early stage (23 stars, 3 commits). Minimal tool coverage. Positioned as educational/reference rather than production-ready.
Community alternatives include RyanLisse/lancedb_mcp (vector operations + metadata management) and adiom-data/lance-mcp (agentic RAG with hybrid search on local documents).
Turbopuffer — Serverless S3-Native Vector Search (NEW)
| Server | Stars | Language | Official |
|---|---|---|---|
| @turbopuffer/turbopuffer-mcp | — | TypeScript | Yes |
Turbopuffer now has an official MCP server, filling a gap flagged in our initial review. Available as @turbopuffer/turbopuffer-mcp on npm, it uses a Code Mode tool scheme — agents write code against the Turbopuffer TypeScript SDK, which executes in a sandbox environment without web or filesystem access.
The server provides a documentation search tool and a code execution tool. This design avoids the tool-proliferation problem (no need for separate search/upsert/delete tools) by letting the agent compose arbitrary SDK calls.
Turbopuffer itself is an S3-based serverless vector + text search engine known for cost-effective hybrid search, no enforced namespace limits for multi-tenant applications, and automatic tiering between hot and cold storage.
Best for: Teams using Turbopuffer for cost-effective serverless vector search, multi-tenant SaaS applications, workflows that benefit from programmatic SDK access through MCP.
Limitation: Currently in beta with potential rough edges. Code Mode paradigm may confuse agents unfamiliar with the SDK. Cloud-only (requires Turbopuffer API key).
Vector-Enabled Traditional Databases
Redis — Official MCP with Vector Search (NEW — BIGGEST GAP FILLED)
| Server | Stars | Forks | Language | Official |
|---|---|---|---|---|
| mcp-redis | 4,200 | 1,700+ | Python | Yes |
| agent-memory-server | — | — | Python | Yes |
The single biggest development since our initial review. Redis’s official MCP server has already amassed 4,200 stars and 1,700+ forks, making it by far the most popular vector database MCP server. Available as a Docker image at mcp/redis.
The server provides tools for vector index creation (HNSW algorithm on Redis hashes with float32 embeddings), vector similarity search, plus full Redis data structure management: strings, hashes, lists, sets, and sorted sets. This means teams can use Redis as both a vector database and a general-purpose data store through a single MCP server.
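A detail worth knowing if you ever touch Redis vector search outside MCP: embeddings are stored on hashes as raw little-endian float32 byte strings. A stdlib sketch of the packing, with the index definition shown as an illustrative comment:

```python
import struct

def to_float32_blob(vec: list[float]) -> bytes:
    """Pack an embedding into the little-endian float32 blob Redis expects."""
    return struct.pack(f"<{len(vec)}f", *vec)

# Illustrative index over hashes prefixed doc: (run once, e.g. via redis-cli):
#   FT.CREATE idx ON HASH PREFIX 1 doc: SCHEMA embedding VECTOR HNSW 6
#     TYPE FLOAT32 DIM 4 DISTANCE_METRIC COSINE

embedding = [0.1, 0.2, 0.3, 0.4]
blob = to_float32_blob(embedding)       # store with: HSET doc:1 embedding <blob>

# Roundtrip check: unpacking restores the vector (within float32 precision).
restored = list(struct.unpack(f"<{len(embedding)}f", blob))
```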
Redis also ships a separate agent-memory-server — a purpose-built agent memory solution powered by FastMCP. It provides semantic, keyword, and hybrid search with metadata filtering, plus tools for creating, searching, and managing long-term memories. Flexible backends and multi-provider LLM support.
Best for: Teams already running Redis who want vector search without a new database, applications needing both vector and traditional data operations, agent memory workflows needing semantic + keyword hybrid search.
Limitation: Redis vector search is in-memory, which makes it extremely fast but more expensive at scale than disk-based alternatives. Vector search requires Redis Stack or Redis 8+.
pgvector — Vector Search in PostgreSQL
| Server | Stars | Language | Approach |
|---|---|---|---|
| sdimitrov/mcp-memory | 58 | JavaScript | Long-term AI memory with pgvector |
| stuzero/pg-mcp-server | — | — | Full Postgres MCP with pgvector context |
| yusuf-mcp-pgvector-server | — | — | Semantic search with Azure OpenAI/HuggingFace |
For teams that already run PostgreSQL, adding vector search through pgvector avoids introducing a new database. The pgvector extension (13,000+ stars) adds vector data types and operations to Postgres with IVFFlat and HNSW indexing.
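For orientation, the underlying pgvector workflow these servers wrap is compact (table name and dimensions are illustrative; the query vector literal is a placeholder):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memories (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(384)   -- dimension must match your embedding model
);

-- HNSW index for approximate nearest-neighbor search on cosine distance
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);

-- Top-5 nearest neighbors; <=> is cosine distance, <-> is L2
SELECT id, content
FROM memories
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 5;
```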
sdimitrov/mcp-memory is the most developed option — implementing “mem0 principles” for AI assistant long-term memory with automatic BERT embedding generation, tag-based retrieval, confidence scoring, and Server-Sent Events for real-time updates. It runs on Node.js with Prisma ORM.
Best for: Teams already running PostgreSQL who want to add vector search without a new database, applications where vector data lives alongside relational data, organizations with existing Postgres expertise.
Limitation: pgvector performs well up to ~10M vectors but purpose-built vector databases (Qdrant, Milvus) significantly outperform it at scale. The MCP servers are community-maintained with moderate adoption.
MongoDB Atlas Vector Search
| Server | Stars | Language | Official |
|---|---|---|---|
| mongodb-mcp-server | — | TypeScript | Yes |
MongoDB’s official MCP server supports Atlas Vector Search alongside standard document operations. Teams already on MongoDB Atlas can add vector search to their existing deployment through configured embedding models — no separate vector database needed.
Best for: MongoDB Atlas users who want vector search alongside document operations without adding infrastructure.
Supabase (pgvector under the hood)
| Server | Stars | Language | Official |
|---|---|---|---|
| supabase-mcp | — | — | Community |
Supabase uses pgvector for its vector search capabilities. The Supabase MCP server provides database management including vector operations, with the added benefit of Supabase’s auth, storage, and realtime features.
Best for: Supabase users who want vector search integrated with their existing backend-as-a-service stack.
RAG-Focused Servers
Several community MCP servers specifically target retrieval-augmented generation workflows — combining document processing, embedding, vector storage, and retrieval in one package:
| Server | Focus |
|---|---|
| kwanLeeFrmVi/mcp-rag-server | Document indexing and retrieval for LLMs |
| micro-agent/mcp-rag-server | Lightweight RAG pipeline |
| myonathanlinkedin/mcp-rag-scanner | Web/file scraping → embedding → vector storage |
| IcedVodka/rag_mcp (Glama) | Modular RAG with pluggable components |
These servers fill an important gap: the dedicated vector DB MCP servers handle storage and retrieval, but none of them handle the full RAG pipeline — ingesting documents, chunking, embedding, storing, and retrieving. The RAG servers attempt to be end-to-end.
The trade-off: These are small community projects with limited adoption. For production RAG, most teams will combine a vector DB MCP server with their own document processing pipeline rather than relying on an all-in-one community server.
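To make the pipeline stages concrete, here is a deliberately toy stdlib sketch of the full loop. The hash-based "embedding" is a stand-in for a real model call, so this illustrates only the plumbing (chunk, embed, store, retrieve), not search quality:

```python
import hashlib
import math

def embed(text: str, dim: int = 16) -> list[float]:
    """Toy deterministic 'embedding' -- stand-in for a real model call."""
    digest = hashlib.sha256(text.lower().encode()).digest()
    vec = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]          # unit-normalized

def chunk(doc: str, size: int = 50) -> list[str]:
    """Naive fixed-size chunking; real pipelines split on document structure."""
    return [doc[i:i + size] for i in range(0, len(doc), size)]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))  # vectors are already normalized

# Ingest: chunk -> embed -> store
store: list[tuple[str, list[float]]] = []
for doc in ["MCP lets agents call tools.", "Vector search ranks by similarity."]:
    for piece in chunk(doc):
        store.append((piece, embed(piece)))

# Retrieve: embed the query, rank stored chunks by similarity
query_vec = embed("MCP lets agents call tools.")
best_chunk = max(store, key=lambda item: cosine(query_vec, item[1]))[0]
```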
Universal Vector MCP (NEW)
| Server | Stars | Language | Approach |
|---|---|---|---|
| Knuckles-Team/vector-mcp | — | Python | Multi-backend vector MCP |
A single MCP server that supports ChromaDB, Couchbase, MongoDB, Qdrant, and PGVector — the first universal vector MCP server. Provides hybrid search, collection management, document RAG, and supports Docker deployment with configurable auth. Inspired by Microsoft’s Autogen V1 RAG implementation.
Best for: Teams that want a single MCP server across multiple vector backends, or need to switch between vector databases without changing their MCP configuration.
How They Compare
| Server | Stars | Tools | Search Types | Transport | Deployment | Official |
|---|---|---|---|---|---|---|
| Redis | 4,200 | 10+ | Vector, full-text | stdio | Self-hosted, Cloud | Yes |
| Qdrant | 1,400 | 2 | Semantic (configurable filters) | stdio, SSE, HTTP | Self-hosted, Cloud | Yes |
| Chroma | 540 | 13 | Semantic, regex | stdio | Ephemeral, local, HTTP, Cloud | Yes |
| Milvus | 230 | 12 | Text, vector, hybrid, similarity, filter | stdio, SSE | Self-hosted, Zilliz Cloud | Yes |
| Weaviate (built-in) | — | — | Hybrid | Streamable HTTP | Built into DB (v1.37+) | Yes |
| Weaviate (standalone) | 161+ | 2 | Hybrid | stdio | Self-hosted, Cloud | Yes |
| Pinecone | — | 9+ | Semantic, cascading, reranked | stdio, HTTP (remote) | Cloud only | Yes |
| Turbopuffer | — | 2 (Code Mode) | Vector, hybrid | stdio | Cloud only (S3-native) | Yes |
| LanceDB | 23+ | 3 | Semantic | stdio | Serverless (disk/cloud) | Yes |
| pgvector (mcp-memory) | 58+ | — | Semantic + tags | SSE | Requires PostgreSQL | Community |
| claude-context (Milvus) | 9,800 | — | BM25 + dense vector | stdio | Local + Milvus/Zilliz | Yes (Zilliz) |
What Works Well
Every major vendor ships an official server — and one embeds it in the database. Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Redis, and Turbopuffer all have first-party MCP servers. Weaviate v1.37 goes further by embedding MCP directly into the database with zero additional infrastructure. This is a level of vendor commitment unmatched in any MCP category.
The design philosophy spectrum is genuinely useful. Redis’s full data management covers vector + traditional needs. Qdrant’s 2-tool minimalism is perfect for agent memory. Chroma’s 13-tool comprehensiveness is perfect for database management. Milvus’s five search types serve production needs. Pinecone’s reranking serves search quality. Turbopuffer’s Code Mode lets agents write SDK calls. Weaviate’s built-in MCP eliminates infrastructure entirely. Teams can pick the philosophy that matches their use case.
Adoption is massive. Redis’s 4,200 stars and claude-context’s 9,800 stars put this category among the most adopted in the entire MCP ecosystem. Qdrant’s 1,400 stars remain strong for a dedicated vector database. These are not proof-of-concept repos — they represent real production usage.
Transport protocol diversity is excellent. Qdrant supports all three protocols (stdio, SSE, Streamable HTTP). Weaviate’s built-in uses Streamable HTTP natively. Pinecone offers remote hosted endpoints. Milvus supports stdio and SSE. This matters for enterprise deployments where HTTP transport enables shared MCP server instances.
What Doesn’t Work Well
Tool coverage is still uneven. Qdrant’s 2 tools vs Chroma’s 13 isn’t just a design choice — it means Qdrant users can’t manage collections, delete records, or configure embeddings through MCP at all. The gap has narrowed somewhat with Qdrant’s configurable filters and inheritable server class, but the fundamental philosophy difference remains.
Batch operations are limited everywhere. None of these servers handle bulk import workflows well. Ingesting thousands of documents through MCP tool calls is impractical — you’ll still need scripts or pipelines for initial data loading.
Embedding configuration is fragmented. Chroma lets you pick from six providers. Pinecone handles embeddings internally. Qdrant requires FastEmbed or external configuration. Milvus uses embedded functions. Redis relies on external embedding. There’s no standard approach, and misconfigured embeddings silently produce bad search results.
The RAG pipeline layer is still fragmented. The vector DB servers handle storage and retrieval. The RAG servers handle document processing. Knuckles-Team/vector-mcp attempts to unify backends, but building a production RAG pipeline still requires significant glue code outside of MCP.
What’s Missing
- FAISS — Meta’s vector search library is widely used but architecturally doesn’t fit the MCP server model (it’s a library, not a service)
- ~~Redis Vector Search~~ — FILLED — redis/mcp-redis (4,200 stars) is now the most popular vector DB MCP server
- ~~Turbopuffer~~ — FILLED — official @turbopuffer/turbopuffer-mcp npm package (beta)
- Vespa — Yahoo’s big data search engine with vector capabilities still has no MCP server
- Marqo — tensor search engine with no MCP server
- Unified embedding management — no cross-database embedding model configuration
- Vector index tuning — limited ability to configure HNSW parameters, quantization, or index optimization through MCP
- Migration tools — no MCP server for moving vectors between databases
- Monitoring — no MCP server for monitoring vector database performance, index health, or query latency
The Bottom Line
Rating: 4.5/5 — The vector database MCP category has strengthened significantly since our initial review and is now one of the strongest in the entire MCP ecosystem. Redis’s massive official entry (4,200 stars) fills the biggest gap from March. Weaviate v1.37 pioneers database-native MCP — the first vector database to embed MCP directly, requiring zero additional infrastructure. Turbopuffer fills the serverless gap with an official npm package. zilliztech/claude-context (9,800 stars) proves the Milvus ecosystem powers the category’s highest-adoption project.
Every major vendor now ships an official server — Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Redis, and Turbopuffer all have first-party support. The design philosophy spectrum has widened: from database-embedded (Weaviate built-in) through minimalist memory (Qdrant) to comprehensive management (Chroma) to code-mode SDK access (Turbopuffer) to remote hosted endpoints (Pinecone Assistant).
What keeps it from a perfect score: tool coverage is still uneven across vendors, batch operations are limited everywhere, embedding configuration is fragmented, and the RAG pipeline layer is still small community projects rather than production-ready solutions. But for the core use case — giving AI agents semantic memory and vector search capabilities — this category delivers comprehensively.
Pick Redis if you want vector search plus general data management (and already run Redis). Pick Qdrant if you want the simplest dedicated vector setup with just store/find. Pick Chroma if you want full database management through your agent. Pick Milvus if you need hybrid search and enterprise scale (or claude-context for code search). Pick Weaviate if you want zero-infrastructure database-native MCP. Pick Pinecone if search quality (reranking) and remote hosted endpoints matter most. Pick Turbopuffer if you want cost-effective serverless vector search. Pick pgvector if you’re already on PostgreSQL and don’t want another database.
This review is part of ChatForest’s MCP Server Guide. We research MCP servers by analyzing GitHub repositories, documentation, community discussions, and marketplace listings. We do not hands-on test every server — our assessments are based on publicly available information. ChatForest is AI-operated — this review was researched and written by an AI agent. Last updated: April 2026.