@@ -320,6 +321,7 @@ NOTE: atlas tools are only available when you set credentials on [configuration]
- `collection-storage-size` - Get the size of a collection in MB
- `db-stats` - Return statistics about a MongoDB database
- `export` - Export query or aggregation results to EJSON format. Creates a uniquely named export accessible via the `exported-data` resource.
- `vector-search` - Execute a vector similarity search (`$vectorSearch`) over a collection. See [Vector Search and Embeddings](#vector-search-and-embeddings).

## 📄 Supported Resources
@@ -361,6 +363,13 @@ The MongoDB MCP Server can be configured using multiple methods, with the follow
| `exportTimeoutMs` | `MDB_MCP_EXPORT_TIMEOUT_MS` | 300000 | Time in milliseconds after which an export is considered expired and eligible for cleanup. |
| `exportCleanupIntervalMs` | `MDB_MCP_EXPORT_CLEANUP_INTERVAL_MS` | 120000 | Time in milliseconds between export cleanup cycles that remove expired export files. |
| `atlasTemporaryDatabaseUserLifetimeMs` | `MDB_MCP_ATLAS_TEMPORARY_DATABASE_USER_LIFETIME_MS` | 14400000 | Time in milliseconds that temporary database users created when connecting to MongoDB Atlas clusters will remain active before being automatically deleted. |
| `vectorSearchPath` | `MDB_MCP_VECTOR_SEARCH_PATH` | <not set> | Default vector field path used by `vector-search` (V2 mode). If set together with `vectorSearchIndex`, the V2 vector search tool variant is enabled. |
| `vectorSearchIndex` | `MDB_MCP_VECTOR_SEARCH_INDEX` | <not set> | Default vector search index name used by `vector-search` (V2 mode). Must be set with `vectorSearchPath` to enable V2 mode. |
| `embeddingModelProvider` | `MDB_MCP_EMBEDDING_MODEL_PROVIDER` | azure-ai-inference | Embedding model provider identifier. Currently only `azure-ai-inference` is supported. |
| `embeddingModelEndpoint` | `MDB_MCP_EMBEDDING_MODEL_ENDPOINT` | <not set> | Endpoint for the embedding model provider. Required for vector search. |
| `embeddingModelApikey` | `MDB_MCP_EMBEDDING_MODEL_APIKEY` | <not set> | API key/credential for the embedding model provider. Required for vector search. |
| `embeddingModelDeploymentName` | `MDB_MCP_EMBEDDING_MODEL_DEPLOYMENT_NAME` | <not set> | Deployment/model name to use when requesting embeddings. Required for vector search. |
@@ -482,6 +491,140 @@ You can disable telemetry using:
> **💡 Platform Note:** For Windows users, see [Environment Variables](#environment-variables) for platform-specific instructions.

### Vector Search and Embeddings

The `vector-search` tool runs semantic similarity queries against a MongoDB collection using the `$vectorSearch` aggregation stage. The tool is registered only when a complete embedding configuration is supplied (see below).

#### Overview

One of two variants of the `vector-search` tool registers, depending on configuration:

1. V1 (argument-driven): You supply `path`, and optionally `index`, as tool arguments on each call.
2. V2 (config-driven): You preconfigure both `vectorSearchPath` and `vectorSearchIndex` in the server config; the tool omits those arguments and always searches that path and index.

Variant selection rules:

- If BOTH `MDB_MCP_VECTOR_SEARCH_PATH` and `MDB_MCP_VECTOR_SEARCH_INDEX` are set at startup, V2 registers.
- If NEITHER (or only one) of those is set, V1 registers, and you must provide a `path` argument per invocation (and may provide `index`).
- If the embedding config is incomplete, the tool is not registered (a warning appears in the logs).
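
The selection rules above can be sketched as a small pure function. This is an illustration only, not the server's actual registration code: the type and function names (`VectorSearchEnv`, `selectVectorSearchVariant`, `isSet`) are hypothetical, and treating a blank string as "not set" is an assumption. Only the environment variable names come from this README.

```typescript
// Hypothetical sketch of the variant selection rules described above.
type VectorSearchVariant = "v1" | "v2" | "disabled";

interface VectorSearchEnv {
  MDB_MCP_EMBEDDING_MODEL_ENDPOINT?: string;
  MDB_MCP_EMBEDDING_MODEL_APIKEY?: string;
  MDB_MCP_EMBEDDING_MODEL_DEPLOYMENT_NAME?: string;
  MDB_MCP_VECTOR_SEARCH_PATH?: string;
  MDB_MCP_VECTOR_SEARCH_INDEX?: string;
}

// Assumption: undefined and blank strings both count as "not set".
function isSet(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

function selectVectorSearchVariant(env: VectorSearchEnv): VectorSearchVariant {
  // Rule 3: an incomplete embedding config means the tool is not registered.
  const embeddingConfigured =
    isSet(env.MDB_MCP_EMBEDDING_MODEL_ENDPOINT) &&
    isSet(env.MDB_MCP_EMBEDDING_MODEL_APIKEY) &&
    isSet(env.MDB_MCP_EMBEDDING_MODEL_DEPLOYMENT_NAME);
  if (!embeddingConfigured) {
    return "disabled";
  }

  // Rule 1: both defaults set selects V2. Rule 2: anything else selects V1.
  const hasBothDefaults =
    isSet(env.MDB_MCP_VECTOR_SEARCH_PATH) && isSet(env.MDB_MCP_VECTOR_SEARCH_INDEX);
  return hasBothDefaults ? "v2" : "v1";
}
```

Setting only one of the two defaults falls through to V1, which is why a call can still fail with a missing `path` argument in that case.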

#### Required MongoDB Setup

1. A collection with a vector field (an array of numbers) containing stored embeddings.
2. A vector search index on that field (e.g. an Atlas Vector Search index), which `$vectorSearch` uses to find nearest neighbors.
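
For reference, an Atlas Vector Search index definition for such a field generally has the shape sketched below. The helper name `buildVectorIndexDefinition` is hypothetical, and the index name, field path, and dimension count in the usage note are placeholders; use the dimension your embedding model actually produces.

```typescript
// Shape of an Atlas Vector Search index definition for a single vector field.
interface VectorIndexDefinition {
  name: string;
  type: "vectorSearch";
  definition: {
    fields: Array<{
      type: "vector";
      path: string;
      numDimensions: number;
      similarity: "cosine" | "euclidean" | "dotProduct";
    }>;
  };
}

// Hypothetical helper: builds the definition object without talking to a cluster.
function buildVectorIndexDefinition(
  name: string,
  path: string,
  numDimensions: number,
  similarity: "cosine" | "euclidean" | "dotProduct" = "cosine"
): VectorIndexDefinition {
  if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
    throw new Error("numDimensions must be a positive integer");
  }
  return {
    name,
    type: "vectorSearch",
    definition: {
      fields: [{ type: "vector", path, numDimensions, similarity }],
    },
  };
}
```

For example, `buildVectorIndexDefinition("myVectorIndex", "embedding", 1536)` yields an object that could, assuming a recent MongoDB Node.js driver, be passed to `collection.createSearchIndex(...)`.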

#### Embedding Configuration (Required)

You must configure an embedding provider so the server can convert the `queryText` you pass in into a numeric embedding vector. Currently supported providers:

- `azure-ai-inference` (default if none specified)

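Set the following environment variables (or CLI args) for Azure AI Inference:

- `MDB_MCP_EMBEDDING_MODEL_ENDPOINT`
- `MDB_MCP_EMBEDDING_MODEL_APIKEY`
- `MDB_MCP_EMBEDDING_MODEL_DEPLOYMENT_NAME`

To avoid passing `path` (and optionally `index`) on every call, set both: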

```bash
export MDB_MCP_VECTOR_SEARCH_PATH="embedding"  # e.g. field path storing embeddings
export MDB_MCP_VECTOR_SEARCH_INDEX="myVectorIndex"  # name of the Atlas Vector Search index
```

If both are present at startup, the V2 variant is loaded and you no longer pass `path`/`index` arguments at call time. Remove one or both to revert to V1.

#### Usage Examples

##### Example 1: V1 Variant (no defaults configured)

Tool invocation arguments:

```json
{
  "name": "vector-search",
  "arguments": {
    "database": "mydb",
    "collection": "articles",
    "queryText": "vector databases for personalization",
    "path": "embedding",
    "limit": 5,
    "numCandidates": 200,
    "includeVector": false
  }
}
```

##### Example 2: V2 Variant (defaults configured)

With `MDB_MCP_VECTOR_SEARCH_PATH=embedding` and `MDB_MCP_VECTOR_SEARCH_INDEX=myVectorIndex` set at startup:

```json
{
  "name": "vector-search",
  "arguments": {
    "database": "mydb",
    "collection": "articles",
    "queryText": "vector databases for personalization",
    "limit": 5,
    "numCandidates": 200
  }
}
```

#### Returned Data

The tool returns an array of matched documents. By default the raw embedding field is excluded (set `includeVector: true` if you need it). The standard result-size safeguards (`maxDocumentsPerQuery`, `maxBytesPerQuery`) still apply.
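
To make the relationship between the tool arguments and the underlying query concrete, here is a rough sketch of an aggregation pipeline with a `$vectorSearch` stage. It is not taken from the server's source: `buildVectorSearchPipeline` and `VectorSearchArgs` are hypothetical names, `queryVector` stands for the embedding the server would compute from `queryText`, and using a `$project` stage to drop the vector field is an assumption about how `includeVector: false` could be implemented. The sketch requires `index` because the `$vectorSearch` stage itself needs one; how the server resolves an omitted `index` in V1 is not covered here.

```typescript
interface VectorSearchArgs {
  path: string;
  index: string;
  queryVector: number[];
  limit: number;
  numCandidates: number;
  includeVector?: boolean;
}

type PipelineStage = Record<string, unknown>;

// Hypothetical helper: maps tool-style arguments onto a $vectorSearch pipeline.
function buildVectorSearchPipeline(args: VectorSearchArgs): PipelineStage[] {
  // $vectorSearch does not allow limit to exceed numCandidates.
  if (args.limit > args.numCandidates) {
    throw new Error("limit must not exceed numCandidates");
  }

  const pipeline: PipelineStage[] = [
    {
      $vectorSearch: {
        index: args.index,
        path: args.path,
        queryVector: args.queryVector,
        numCandidates: args.numCandidates,
        limit: args.limit,
      },
    },
  ];

  // Assumption: the raw embedding is removed with an exclusion projection.
  if (!args.includeVector) {
    pipeline.push({ $project: { [args.path]: 0 } });
  }
  return pipeline;
}
```

A larger `numCandidates` typically improves recall at the cost of latency, which is why the troubleshooting guidance suggests lowering it when queries are slow.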

#### Adding a Custom Embedding Provider

You can extend the server to support additional embedding services (e.g. OpenAI, Hugging Face, Vertex AI) by implementing the `EmbeddingProvider` interface:

`src/embedding/embeddingProvider.ts`:

```ts
export interface EmbeddingProvider {
  name: string;
  embed(input: string[]): Promise<number[][]>;
}
```

Steps:

1. Create a new file under `src/embedding/`, e.g. `myProviderEmbeddingProvider.ts`, implementing the interface.
2. Add a new case in `EmbeddingProviderFactory.create()` and `isEmbeddingConfigValid()` matching a unique `embeddingModelProvider` string (e.g. `my-provider`).
3. Document the required env vars (e.g. `MDB_MCP_EMBEDDING_MODEL_ENDPOINT`, `MDB_MCP_EMBEDDING_MODEL_APIKEY`, or new ones) and update the README.
4. (Optional) Support provider-specific validation (dimension, model name) in `assertEmbeddingConfigValid`.
5. Provide tests (unit, plus integration if vector search depends on it) ensuring your provider always returns embeddings of the same dimensionality.
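
As a starting point for steps 1 and 5, the sketch below shows a minimal provider that satisfies the interface without calling any external service. Everything except the `EmbeddingProvider` interface is hypothetical: `FakeEmbeddingProvider` and `hashEmbed` are illustrative names, and the character-code hashing is a stand-in for a real model call, useful only for tests that need stable, fixed-length vectors.

```typescript
// Repeated here so the sketch is self-contained; in the repo you would import it.
interface EmbeddingProvider {
  name: string;
  embed(input: string[]): Promise<number[][]>;
}

// Hypothetical deterministic "embedding": folds character codes into a fixed-length vector.
function hashEmbed(text: string, dimensions: number): number[] {
  const vector: number[] = new Array<number>(dimensions).fill(0);
  for (let i = 0; i < text.length; i++) {
    const slot = i % dimensions;
    vector[slot] = (vector[slot] ?? 0) + text.charCodeAt(i) / 255;
  }
  return vector;
}

// Hypothetical provider for tests; a real one would call the provider's API in embed().
class FakeEmbeddingProvider implements EmbeddingProvider {
  readonly name = "my-provider";
  private readonly dimensions: number;

  constructor(dimensions: number = 8) {
    if (!Number.isInteger(dimensions) || dimensions <= 0) {
      throw new Error("dimensions must be a positive integer");
    }
    this.dimensions = dimensions;
  }

  async embed(input: string[]): Promise<number[][]> {
    // One vector per input string, all of the same length.
    return input.map((text) => hashEmbed(text, this.dimensions));
  }
}
```

A unit test for a real provider can follow the same pattern: embed a few strings and check that every returned vector has the expected length.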

After adding your provider, users enable it by setting:

```bash
export MDB_MCP_EMBEDDING_MODEL_PROVIDER="my-provider"
# plus any provider-specific variables you defined
```

If your provider requires different variable names, follow the existing naming convention: prefix them with `MDB_MCP_` and document them.

#### Troubleshooting

| Symptom | Likely Cause | Action |
| ------- | ------------ | ------ |
| `vector-search` tool missing | Incomplete embedding config | Set the endpoint, API key, and deployment name env vars, then restart the client. |
| Error: "Embedding provider returned empty embedding" | Provider/network issue | Check credentials and network; verify the model supports embeddings. |
| Error requiring `path` even though env vars are set | Only one of PATH/INDEX set | Set BOTH `MDB_MCP_VECTOR_SEARCH_PATH` and `MDB_MCP_VECTOR_SEARCH_INDEX`, or remove both. |
| High latency | Large `numCandidates` or a slow remote model | Lower `numCandidates`; verify the model is deployed in a nearby region. |

---

### Atlas API Access
To use the Atlas API tools, you'll need to create a service account in MongoDB Atlas:
@@ -680,6 +823,6 @@ connecting to the Atlas API, your MongoDB Cluster, or any other external calls
to third-party services like OID Providers. The behaviour is the same as what
`mongosh` does, so the same settings will work in the MCP Server.

-## 🤝Contributing
+## Contributing
Interested in contributing? Great! Please check our [Contributing Guide](CONTRIBUTING.md) for guidelines on code contributions, standards, adding new tools, and troubleshooting information.