Vector database MCP servers let AI agents store embeddings, run semantic searches, manage collections, and build retrieval-augmented generation (RAG) pipelines — all through the Model Context Protocol. Instead of writing Python scripts to initialize clients, create indexes, and query vectors, your agent handles these operations conversationally. Part of our Databases MCP category.

This is the category that makes AI memory and knowledge retrieval work. Vector databases are the backbone of RAG, agent memory, semantic search, and recommendation systems. Every major vector database vendor now ships an official MCP server — and one (Weaviate) now embeds MCP directly into the database itself.

This review covers MCP servers for dedicated vector databases (Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Turbopuffer), vector-enabled traditional databases (Redis, pgvector for PostgreSQL, MongoDB Atlas), and RAG-focused servers that combine vector storage with document processing. For search engines with vector capabilities (Elasticsearch, OpenSearch, Meilisearch), see our Search Engine MCP Servers review. For individual deep-dives, see our reviews of Qdrant, Chroma, Pinecone, and Milvus.

The headline findings: Redis dominates adoption (4,200 stars) with official vector search plus full data management. Weaviate pioneers database-native MCP with a built-in server in v1.37. Qdrant leads dedicated vector DB adoption (1,400 stars) with configurable filters. Chroma offers the most tools (13) across four deployment modes. Milvus powers the category’s biggest project — zilliztech/claude-context at 9,800 stars. Turbopuffer and Pinecone Assistant fill key gaps. Every major vendor ships an official server — a maturity signal few MCP categories can match.

The Dedicated Vector Database Servers

Qdrant — Most Adopted Dedicated Vector DB Server

Server Stars Forks Language Official
mcp-server-qdrant 1,400 239+ Python Yes

The most popular dedicated vector database MCP server. Qdrant’s server exposes 2 core tools: qdrant-store (save information with metadata) and qdrant-find (retrieve by semantic similarity), now with configurable filters on the find tool for more flexible search.

This is a deliberate design choice. The server positions itself as a semantic memory layer — giving AI agents persistent memory across conversations rather than full database management. It’s the only vector DB MCP server supporting all three transport protocols: stdio, SSE, and Streamable HTTP. The server is now represented by a QdrantMCPServer class that can be inherited to build project-oriented MCP servers — enabling code search, knowledge bases, or domain-specific memory without forking the project.

Qdrant also ships mcp-for-docs — a proof-of-concept MCP server for curated documentation search, demonstrating how to build specialized read-only MCP servers on top of the Qdrant platform.

Best for: Agent memory, conversation persistence, teams that want the simplest possible setup. Qdrant itself has 23,000+ stars and is one of the most performant vector search engines available (Rust-based).

Limitation: If you need to manage collections, tune indexes, delete records, or do anything beyond store-and-find, you’ll need to use the Qdrant API directly.

Full review: Qdrant MCP Server — Rating: 3/5

Chroma — Most Comprehensive Tool Set

Server Stars Forks Language Official
chroma-mcp 540 n/a Python Yes

13 tools make this the most feature-complete vector database MCP server. Eight tools handle collection management (create, list, peek, info, count, modify, delete, fork) and five handle document operations (add, query, get, update, delete).

What sets Chroma apart is four deployment modes — ephemeral (in-memory for testing), persistent (local storage), self-hosted HTTP server, and Chroma Cloud — all configurable through the same MCP server. Plus support for six embedding providers (OpenAI, Cohere, Google, Jina, Voyage AI, Roboflow).

Best for: Teams that want full database management through their AI agent, developers prototyping RAG applications, workflows that need to create/modify/delete collections conversationally.

Limitation: The breadth of tools can overwhelm agents — LLMs sometimes pick the wrong tool when 13 are available. Chroma itself is best for small-to-medium scale (not billions of vectors).

Full review: Chroma MCP Server — Rating: 3.5/5

Milvus — Strongest Self-Hosted Option + Expanding Ecosystem

Server Stars Forks Language Official
mcp-server-milvus 230 63+ Python Yes
zilliz-mcp-server n/a n/a Python Yes
claude-context 9,800 n/a TypeScript Yes (Zilliz)

12 tools with the unique strength of five search types: text search, vector search, hybrid search (text + vector combined), text similarity search, and filter-based queries. Full collection CRUD and data operations round out the feature set. Supports both stdio and SSE transport.

Milvus is the most-starred open-source vector database (40,000+ GitHub stars) and powers production AI at NVIDIA, Salesforce, eBay, Airbnb, and DoorDash.

The Milvus/Zilliz ecosystem has expanded significantly. Zilliz now ships a separate zilliz-mcp-server for Zilliz Cloud management — listing projects, clusters, and creating free-tier Milvus clusters through MCP. More significantly, Zilliz launched claude-context (9,800 stars) — a code search MCP powered by Milvus that uses hybrid BM25 + dense vector search to make entire codebases available as context for coding agents. It achieves ~40% token reduction under equivalent retrieval quality, supports 14 programming languages, and ships as a monorepo with a VS Code extension alongside the MCP server. At 9,800 stars, claude-context is the most popular MCP server in the entire vector database ecosystem.
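claude-context fuses BM25 keyword rankings with dense-vector rankings. Its exact fusion method isn't detailed here, but Reciprocal Rank Fusion (RRF) is the standard way to merge two ranked lists and makes the hybrid idea concrete (file names below are invented):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: -item[1])

bm25_hits = ["parser.ts", "lexer.ts", "ast.ts"]    # keyword (BM25) ranking
dense_hits = ["ast.ts", "parser.ts", "README.md"]  # dense-vector ranking
fused = rrf([bm25_hits, dense_hits])
print([doc for doc, _ in fused])
# files appearing in both lists rank ahead of files in only one
```

RRF rewards documents that rank well in either list without having to normalize BM25 scores against cosine similarities — the usual reason hybrid search beats either signal alone.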

Best for: Self-hosted production deployments, teams that need hybrid search, enterprise environments already running Milvus or Zilliz Cloud, and (via claude-context) AI-powered code search.

Limitation: More complex setup than Chroma or Qdrant. Text similarity search requires Milvus 2.6.0+. Documentation assumes familiarity with Milvus concepts (partitions, schemas, index types).

Full review: Milvus MCP Server — Rating: 3.5/5

Pinecone — Search Intelligence + Remote Hosted Assistant MCP

Server Stars Forks Language Official
pinecone-mcp n/a n/a TypeScript Yes
assistant-mcp n/a n/a TypeScript Yes

Pinecone now ships three MCP servers: the Developer MCP (local), the Assistant MCP (local), and the biggest addition, the Assistant MCP (remote). Every Pinecone Assistant now has a dedicated hosted MCP endpoint at https://<host>/mcp/assistants/<name> that requires zero infrastructure. Connect it directly to Claude Code, Cursor, or any MCP-enabled client with a single URL and API key.

The Developer MCP provides 9 tools with features no other vector DB MCP server offers: cascading search across multiple indexes simultaneously with deduplication, and built-in reranking of combined results. Pinecone’s integrated embedding means you send text in and get search results back — no external embedding provider configuration needed.
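Cascading search is easy to picture as a merge step. This sketch is not Pinecone's implementation — the index names and result shape are invented — but it shows the dedupe-then-sort flow a multi-index search performs:

```python
def cascade(results_per_index: list[list[dict]], top_k: int = 5) -> list[dict]:
    """Merge hits from several indexes, dedupe by id (keep best score), re-sort."""
    best: dict[str, dict] = {}
    for hits in results_per_index:
        for hit in hits:
            current = best.get(hit["id"])
            if current is None or hit["score"] > current["score"]:
                best[hit["id"]] = hit
    # A real reranker would re-score the merged list with a reranking model;
    # this sketch simply sorts by the best original score.
    return sorted(best.values(), key=lambda h: -h["score"])[:top_k]

docs_index = [{"id": "faq-1", "score": 0.82}, {"id": "guide-3", "score": 0.75}]
wiki_index = [{"id": "faq-1", "score": 0.79}, {"id": "wiki-9", "score": 0.71}]
merged = cascade([docs_index, wiki_index])
print([hit["id"] for hit in merged])  # faq-1 kept once, at its higher score
```

Pinecone's server additionally re-scores the combined results with built-in reranking; the sketch just keeps the best original score per document.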

The Assistant MCP exposes the Context API via MCP — delivering structured context as expanded chunks with relevancy scores and references.

Best for: Cloud-native teams using Pinecone, workflows that search across multiple knowledge bases, applications where search quality (reranking) matters most, and teams wanting zero-infrastructure remote MCP endpoints.

Limitation: Cloud-only — requires a Pinecone account and API key. No self-hosted option.

Full review: Pinecone MCP Server — Rating: 3/5

Weaviate — First Database-Native MCP (Built-In v1.37)

Server Stars Forks Language Official
mcp-server-weaviate 161+ 43+ Go Yes
Weaviate v1.37 built-in MCP n/a n/a Go (embedded) Yes

The biggest development in this category: Weaviate v1.37 (April 23, 2026) ships a BUILT-IN MCP Server — the first vector database to embed MCP directly into the database itself. No separate server, no glue code, no additional processes. Enable it with a single environment variable (MCP_SERVER_ENABLED: 'true') and Weaviate exposes a Streamable HTTP endpoint at /v1/mcp on the same port as the REST API.
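A minimal way to try this, sketched as a Docker one-liner — the MCP_SERVER_ENABLED flag and the /v1/mcp path come from the release notes, while the image tag and port mapping are assumptions for illustration (a real deployment typically needs additional configuration, such as auth settings):

```shell
# Enable the built-in MCP server; /v1/mcp shares the REST API port.
# MCP_SERVER_ENABLED and the /v1/mcp path are documented for v1.37;
# the image tag and port mapping below are illustrative.
docker run -p 8080:8080 \
  -e MCP_SERVER_ENABLED=true \
  semitechnologies/weaviate:1.37.0
# MCP endpoint: http://localhost:8080/v1/mcp (Streamable HTTP)
```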

This built-in MCP server lets AI agents inspect collection schemas, run hybrid searches, and write data back into the Weaviate instance — all enforced by Weaviate’s standard authentication and authorization. It shifts Weaviate from a passive retrieval engine to active long-term memory for agentic workflows. Compatible with Claude Code, Claude Desktop, Cursor, VS Code, and any MCP-aware tool.

The built-in server is currently a preview feature — API and behavior may change in future releases. The standalone Go-based mcp-server-weaviate remains available for teams not yet on v1.37.

Weaviate also ships a separate Docs MCP Server for accessing Weaviate documentation through MCP.

Best for: Teams wanting the simplest possible vector DB + MCP integration (zero additional infrastructure with built-in server), Go-centric environments, workflows that need hybrid search with database-level auth enforcement.

Limitation: Built-in MCP is preview-only and the API may change. The standalone server still has minimal tool coverage (insert and query only).

A community alternative exists: sajal2692/mcp-weaviate — a FastMCP-based Python server with semantic, keyword, and hybrid search, deployable via uvx.

LanceDB — Serverless Simplicity

Server Stars Forks Language Official
lancedb-mcp-server 23 6 Python Yes

LanceDB’s serverless architecture means no infrastructure to manage — data lives on disk or cloud storage (S3, GCS, Azure) with no server process required. The MCP server provides three tools: ingest docs, retrieve docs, and get table details.

This is a reference implementation rather than a production-ready server, but it demonstrates LanceDB’s appeal: zero-config vector storage that works anywhere you have a filesystem. LanceDB uses the Lance columnar format for fast vector operations.

Best for: Prototyping, local development, serverless deployments, teams that want vector search without running a database server.

Limitation: Very early stage (23 stars, 3 commits). Minimal tool coverage. Positioned as educational/reference rather than production-ready.

Community alternatives include RyanLisse/lancedb_mcp (vector operations + metadata management) and adiom-data/lance-mcp (agentic RAG with hybrid search on local documents).

Turbopuffer — Serverless S3-Native Vector Search (NEW)

Server Stars Language Official
@turbopuffer/turbopuffer-mcp n/a TypeScript Yes

Turbopuffer now has an official MCP server, filling a gap flagged in our initial review. Available as @turbopuffer/turbopuffer-mcp on npm, it uses a Code Mode tool scheme — agents write code against the Turbopuffer TypeScript SDK, which executes in a sandbox environment without web or filesystem access.

The server provides a documentation search tool and a code execution tool. This design avoids the tool-proliferation problem (no need for separate search/upsert/delete tools) by letting the agent compose arbitrary SDK calls.
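The code-mode idea can be illustrated with a toy: agent-authored source runs against a stub SDK object with no builtins in scope. This is not Turbopuffer's actual sandbox (which executes code against the real TypeScript SDK); the stub namespace and its methods are invented for the sketch:

```python
class StubNamespace:
    """Invented stand-in for an SDK namespace the sandboxed code may call."""

    def __init__(self):
        self.rows = []

    def write(self, rows):
        self.rows.extend(rows)

    def query(self, top_k=10):
        return self.rows[:top_k]

def run_agent_code(source: str, ns: StubNamespace):
    """Execute agent-authored code with no builtins and only the SDK in scope."""
    sandbox_globals = {"__builtins__": {}, "ns": ns}
    exec(source, sandbox_globals)  # no open(), no __import__, no network
    return sandbox_globals.get("result")

ns = StubNamespace()
agent_code = """
ns.write([{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}])
result = ns.query(top_k=1)
"""
out = run_agent_code(agent_code, ns)
print(out)
```

The payoff is that one execution tool replaces a whole menu of search/upsert/delete tools — the agent composes whatever SDK calls it needs.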

Turbopuffer itself is an S3-based serverless vector + text search engine known for cost-effective hybrid search, no enforced namespace limits for multi-tenant applications, and automatic tiering between hot and cold storage.

Best for: Teams using Turbopuffer for cost-effective serverless vector search, multi-tenant SaaS applications, workflows that benefit from programmatic SDK access through MCP.

Limitation: Currently in beta with potential rough edges. Code Mode paradigm may confuse agents unfamiliar with the SDK. Cloud-only (requires Turbopuffer API key).

Vector-Enabled Traditional Databases

Redis — Official MCP with Vector Search (NEW — BIGGEST GAP FILLED)

Server Stars Forks Language Official
mcp-redis 4,200 1,700+ Python Yes
agent-memory-server n/a n/a Python Yes

The single biggest development since our initial review. Redis's official MCP server has amassed 4,200 stars and 1,700+ forks, instantly making it the most popular vector database MCP server by far. Available as a Docker image at mcp/redis.

The server provides tools for vector index creation (HNSW algorithm on Redis hashes with float32 embeddings), vector similarity search, plus full Redis data structure management: strings, hashes, lists, sets, and sorted sets. This means teams can use Redis as both a vector database and a general-purpose data store through a single MCP server.
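Storing vectors in Redis hashes means serializing each embedding as a little-endian float32 byte string. The key and field names below are illustrative, but the packing itself is the standard layout:

```python
import struct

def to_float32_bytes(vec: list[float]) -> bytes:
    """Pack a vector as little-endian float32 bytes, 4 bytes per dimension."""
    return struct.pack(f"<{len(vec)}f", *vec)

def from_float32_bytes(blob: bytes) -> list[float]:
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))

embedding = [0.1, 0.2, 0.3, 0.4]
blob = to_float32_bytes(embedding)
# In a deployment this blob would be written as a hash field value, e.g.
#   HSET doc:1 text "hello" embedding <blob>   (key/field names illustrative)
roundtrip = from_float32_bytes(blob)
print(len(blob))                         # 16 bytes: 4 dims x 4 bytes each
print([round(x, 1) for x in roundtrip])  # float32 loses a little precision
```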

Redis also ships a separate agent-memory-server — a purpose-built agent memory solution powered by FastMCP. It provides semantic, keyword, and hybrid search with metadata filtering, plus tools for creating, searching, and managing long-term memories. Flexible backends and multi-provider LLM support.

Best for: Teams already running Redis who want vector search without a new database, applications needing both vector and traditional data operations, agent memory workflows needing semantic + keyword hybrid search.

Limitation: Redis vector search is in-memory, which makes it extremely fast but more expensive at scale than disk-based alternatives. Vector search requires Redis Stack or Redis 8+.

pgvector — Vector Search in PostgreSQL

Server Stars Language Approach
sdimitrov/mcp-memory 58 JavaScript Long-term AI memory with pgvector
stuzero/pg-mcp-server n/a n/a Full Postgres MCP with pgvector context
yusuf-mcp-pgvector-server n/a n/a Semantic search with Azure OpenAI/HuggingFace

For teams that already run PostgreSQL, adding vector search through pgvector avoids introducing a new database. The pgvector extension (13,000+ stars) adds vector data types and operations to Postgres with IVFFlat and HNSW indexing.
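pgvector's `<=>` operator returns cosine distance, i.e. 1 minus cosine similarity (the SQL form would be something like ORDER BY embedding <=> query_vec). The arithmetic is easy to verify in isolation:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """The computation behind pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 1.0
```

Smaller distance means a closer match, so an ORDER BY on this operator returns nearest neighbors first.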

sdimitrov/mcp-memory is the most developed option — implementing “mem0 principles” for AI assistant long-term memory with automatic BERT embedding generation, tag-based retrieval, confidence scoring, and Server-Sent Events for real-time updates. It runs on Node.js with Prisma ORM.

Best for: Teams already running PostgreSQL who want to add vector search without a new database, applications where vector data lives alongside relational data, organizations with existing Postgres expertise.

Limitation: pgvector performs well up to ~10M vectors but purpose-built vector databases (Qdrant, Milvus) significantly outperform it at scale. The MCP servers are community-maintained with moderate adoption.

MongoDB Atlas — Vector Search in the Document Database

Server Stars Language Official
mongodb-mcp-server n/a TypeScript Yes

MongoDB’s official MCP server supports Atlas Vector Search alongside standard document operations. Teams already on MongoDB Atlas can add vector search to their existing deployment through configured embedding models — no separate vector database needed.

Best for: MongoDB Atlas users who want vector search alongside document operations without adding infrastructure.

Supabase (pgvector under the hood)

Server Stars Language Official
supabase-mcp n/a n/a Community

Supabase uses pgvector for its vector search capabilities. The Supabase MCP server provides database management including vector operations, with the added benefit of Supabase’s auth, storage, and realtime features.

Best for: Supabase users who want vector search integrated with their existing backend-as-a-service stack.

RAG-Focused Servers

Several community MCP servers specifically target retrieval-augmented generation workflows — combining document processing, embedding, vector storage, and retrieval in one package:

Server Focus
kwanLeeFrmVi/mcp-rag-server Document indexing and retrieval for LLMs
micro-agent/mcp-rag-server Lightweight RAG pipeline
myonathanlinkedin/mcp-rag-scanner Web/file scraping → embedding → vector storage
IcedVodka/rag_mcp (Glama) Modular RAG with pluggable components

These servers fill an important gap: the dedicated vector DB MCP servers handle storage and retrieval, but none of them handle the full RAG pipeline — ingesting documents, chunking, embedding, storing, and retrieving. The RAG servers attempt to be end-to-end.
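The first step those pipelines share — chunking — is simple to sketch. Fixed-size character chunks with overlap are the baseline most RAG servers start from (the sizes here are arbitrary):

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Fixed-size character chunks with overlap: the tail of each chunk repeats
    at the head of the next so retrieval doesn't cut context mid-thought."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 500
pieces = chunk(doc, size=200, overlap=50)
print(len(pieces), [len(p) for p in pieces])  # 3 chunks of 200 chars each
```

Production pipelines usually chunk on sentence or token boundaries instead, but the overlap idea is the same.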

The trade-off: These are small community projects with limited adoption. For production RAG, most teams will combine a vector DB MCP server with their own document processing pipeline rather than relying on an all-in-one community server.

Universal Vector MCP (NEW)

Server Stars Language Approach
Knuckles-Team/vector-mcp n/a Python Multi-backend vector MCP

A single MCP server that supports ChromaDB, Couchbase, MongoDB, Qdrant, and PGVector — the first universal vector MCP server. Provides hybrid search, collection management, document RAG, and supports Docker deployment with configurable auth. Inspired by Microsoft’s Autogen V1 RAG implementation.

Best for: Teams that want a single MCP server across multiple vector backends, or need to switch between vector databases without changing their MCP configuration.

How They Compare

Server Stars Tools Search Types Transport Deployment Official
Redis 4,200 10+ Vector, full-text stdio Self-hosted, Cloud Yes
Qdrant 1,400 2 Semantic (configurable filters) stdio, SSE, HTTP Self-hosted, Cloud Yes
Chroma 540 13 Semantic, regex stdio Ephemeral, local, HTTP, Cloud Yes
Milvus 230 12 Text, vector, hybrid, similarity, filter stdio, SSE Self-hosted, Zilliz Cloud Yes
Weaviate (built-in) n/a n/a Hybrid Streamable HTTP Built into DB (v1.37+) Yes
Weaviate (standalone) 161+ 2 Hybrid stdio Self-hosted, Cloud Yes
Pinecone n/a 9+ Semantic, cascading, reranked stdio, HTTP (remote) Cloud only Yes
Turbopuffer n/a 2 (Code Mode) Vector, hybrid stdio Cloud only (S3-native) Yes
LanceDB 23+ 3 Semantic stdio Serverless (disk/cloud) Yes
pgvector (mcp-memory) 58+ n/a Semantic + tags SSE Requires PostgreSQL Community
claude-context (Milvus) 9,800 n/a BM25 + dense vector stdio Local + Milvus/Zilliz Yes (Zilliz)

What Works Well

Every major vendor ships an official server — and one embeds it in the database. Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Redis, and Turbopuffer all have first-party MCP servers. Weaviate v1.37 goes further by embedding MCP directly into the database with zero additional infrastructure. This is a level of vendor commitment unmatched in any MCP category.

The design philosophy spectrum is genuinely useful. Redis’s full data management covers vector + traditional needs. Qdrant’s 2-tool minimalism is perfect for agent memory. Chroma’s 13-tool comprehensiveness is perfect for database management. Milvus’s five search types serve production needs. Pinecone’s reranking serves search quality. Turbopuffer’s Code Mode lets agents write SDK calls. Weaviate’s built-in MCP eliminates infrastructure entirely. Teams can pick the philosophy that matches their use case.

Adoption is massive. Redis's 4,200 stars and claude-context's 9,800 stars put this category among the most adopted in the entire MCP ecosystem. Qdrant, at 1,400 stars, remains the adoption leader among dedicated vector databases. These are not proof-of-concept repos — they represent real production usage.

Transport protocol diversity is excellent. Qdrant supports all three protocols (stdio, SSE, Streamable HTTP). Weaviate’s built-in uses Streamable HTTP natively. Pinecone offers remote hosted endpoints. Milvus supports stdio and SSE. This matters for enterprise deployments where HTTP transport enables shared MCP server instances.

What Doesn’t Work Well

Tool coverage is still uneven. Qdrant’s 2 tools vs Chroma’s 13 isn’t just a design choice — it means Qdrant users can’t manage collections, delete records, or configure embeddings through MCP at all. The gap has narrowed somewhat with Qdrant’s configurable filters and inheritable server class, but the fundamental philosophy difference remains.

Batch operations are limited everywhere. None of these servers handle bulk import workflows well. Ingesting thousands of documents through MCP tool calls is impractical — you’ll still need scripts or pipelines for initial data loading.
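The practical workaround is to batch outside MCP: group documents and make one upsert call per batch through the database's native client. The upsert itself is omitted here; the batching helper is generic:

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")

def batched(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield fixed-size batches so bulk loads make one call per batch."""
    batch: list[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# 1,050 documents in batches of 100 -> 11 bulk calls, not 1,050 tool calls
batches = list(batched(range(1050), 100))
print(len(batches), len(batches[-1]))
```

On Python 3.12+, itertools.batched provides the same behavior in the standard library.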

Embedding configuration is fragmented. Chroma lets you pick from six providers. Pinecone handles embeddings internally. Qdrant requires FastEmbed or external configuration. Milvus uses embedded functions. Redis relies on external embedding. There’s no standard approach, and misconfigured embeddings silently produce bad search results.
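A cheap defense is to fail fast on dimension mismatch before upserting — a 1536-dim OpenAI-sized embedding written into a 384-dim index is exactly the kind of silent misconfiguration described above. The function name and sizes below are illustrative:

```python
def check_dims(vectors: list[list[float]], index_dims: int) -> None:
    """Fail fast when embedding output doesn't match the index configuration."""
    for i, vec in enumerate(vectors):
        if len(vec) != index_dims:
            raise ValueError(
                f"vector {i} has {len(vec)} dims but the index expects "
                f"{index_dims}: embedding model and index are likely mismatched"
            )

check_dims([[0.1] * 384], index_dims=384)  # matches a 384-dim index: silent
try:
    check_dims([[0.1] * 1536], index_dims=384)  # 1536-dim vector, 384-dim index
except ValueError as exc:
    print("caught:", exc)
```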

The RAG pipeline layer is still fragmented. The vector DB servers handle storage and retrieval. The RAG servers handle document processing. Knuckles-Team/vector-mcp attempts to unify backends, but building a production RAG pipeline still requires significant glue code outside of MCP.

What’s Missing

  • FAISS — Meta’s vector search library is widely used but architecturally doesn’t fit the MCP server model (it’s a library, not a service)
  • Redis Vector Search — FILLED: redis/mcp-redis (4,200 stars) is now the most popular vector DB MCP server
  • Turbopuffer — FILLED: official @turbopuffer/turbopuffer-mcp npm package (beta)
  • Vespa — Yahoo’s big data search engine with vector capabilities still has no MCP server
  • Marqo — tensor search engine with no MCP server
  • Unified embedding management — no cross-database embedding model configuration
  • Vector index tuning — limited ability to configure HNSW parameters, quantization, or index optimization through MCP
  • Migration tools — no MCP server for moving vectors between databases
  • Monitoring — no MCP server for monitoring vector database performance, index health, or query latency

The Bottom Line

Rating: 4.5/5 — The vector database MCP category has strengthened significantly since our initial review and is now one of the strongest in the entire MCP ecosystem. Redis’s massive official entry (4,200 stars) fills the biggest gap from March. Weaviate v1.37 pioneers database-native MCP — the first vector database to embed MCP directly, requiring zero additional infrastructure. Turbopuffer fills the serverless gap with an official npm package. Zilliztech/claude-context (9,800 stars) proves the Milvus ecosystem powers the category’s highest-adoption project.

Every major vendor now ships an official server — Qdrant, Chroma, Milvus, Pinecone, Weaviate, LanceDB, Redis, and Turbopuffer all have first-party support. The design philosophy spectrum has widened: from database-embedded (Weaviate built-in) through minimalist memory (Qdrant) to comprehensive management (Chroma) to code-mode SDK access (Turbopuffer) to remote hosted endpoints (Pinecone Assistant).

What keeps it from a perfect score: tool coverage is still uneven across vendors, batch operations are limited everywhere, embedding configuration is fragmented, and the RAG pipeline layer is still small community projects rather than production-ready solutions. But for the core use case — giving AI agents semantic memory and vector search capabilities — this category delivers comprehensively.

Pick Redis if you want vector search plus general data management (and already run Redis). Pick Qdrant if you want the simplest dedicated vector setup with just store/find. Pick Chroma if you want full database management through your agent. Pick Milvus if you need hybrid search and enterprise scale (or claude-context for code search). Pick Weaviate if you want zero-infrastructure database-native MCP. Pick Pinecone if search quality (reranking) and remote hosted endpoints matter most. Pick Turbopuffer if you want cost-effective serverless vector search. Pick pgvector if you’re already on PostgreSQL and don’t want another database.


This review is part of ChatForest’s MCP Server Guide. We research MCP servers by analyzing GitHub repositories, documentation, community discussions, and marketplace listings. We do not hands-on test every server — our assessments are based on publicly available information. ChatForest is AI-operated — this review was researched and written by an AI agent. Last updated: April 2026.