Add MCP optimizer implementation for semantic tool discovery#3440
Add MCP optimizer implementation for semantic tool discovery#3440
Conversation
fac6152 to
d809cfa
Compare
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
a5aa704 to
9e28406
Compare
8d707ff to
5c0713a
Compare
5c0713a to
16bbbfc
Compare
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3440 +/- ##
==========================================
- Coverage 65.15% 64.54% -0.62%
==========================================
Files 398 413 +15
Lines 38821 40431 +1610
==========================================
+ Hits 25295 26097 +802
- Misses 11564 12294 +730
- Partials 1962 2040 +78 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add nil receiver checks to IngestInitialBackends, OnRegisterSession, and Close methods to prevent panics when called on nil *EmbeddingOptimizer. The tests explicitly test nil integration handling, so these methods must safely handle nil receivers. Fixes: - TestClose_NilIntegration panic - TestIngestInitialBackends_NilIntegration panic - TestOnRegisterSession_NilIntegration panic - All related optimizer unit test failures
- Fix line length violations (lll) by wrapping long lines - Remove unused processedSessions field from EmbeddingOptimizer - Remove unused sync import - Change unused receivers to _ in convertSearchResults and resolveToolTarget - Rename unused ctx parameter to _ in NewEmbeddingOptimizer - Remove unused deserializeServerMetadata, update, and delete functions - Simplify createTestDatabase to return only Database (not unused embeddingFunc) - Add nolint directive for OptimizerIntegration type alias (kept for test compatibility) Fixes all golangci-lint errors: - lll: 2 line length violations - revive: 4 unused parameter/receiver issues - unparam: 1 unused return value - unused: 4 unused functions/fields
#3471 Signed-off-by: nigel brown <nigel@stacklok.com>
|
Retested. We definitely need #3471 to stop a race condidtion. |
The serviceVersion field in telemetry config is documented as optional with a default of the ToolHive version, but the code was passing empty strings to WithServiceVersion() which requires a non-empty value. This fix applies the default value (from versions.GetVersionInfo().Version) when serviceVersion is omitted, making it truly optional as documented. Fixes error: "service version cannot be empty" when telemetry is enabled without an explicit serviceVersion in the config. Bug was introduced in commit 64eb12e (PR #3207).
|
I've split PR #3440 into three smaller PRs: Created PRs
What's in Each PRPR 1 (Non-functional): ~10,400 lines
PR 2 (Integration): ~630 lines added, ~770 removed
PR 3 (Documentation): ~385 lines
Merge Order
You will also need #3471 to handle a race condition if this has not been addressed elsewhere. Note: There's a known merge conflict when combining PR1 and PR2 - PR1 modifies |
Add MCP Optimizer Implementation for Semantic Tool Discovery
This PR adds the complete MCP optimizer implementation to vMCP, enabling semantic tool discovery and reducing token usage for LLMs working with large toolsets.
Overview
The optimizer allows vMCP to expose
optim.find_toolandoptim.call_tooloperations instead of all backend tools directly. This reduces token usage by allowing LLMs to discover relevant tools on demand via semantic search rather than receiving all tool definitions upfront.Features
Core Optimizer Package
Semantic Tool Search (
pkg/optimizer/)Token Counting (
pkg/optimizer/tokens/)Database Layer (
pkg/optimizer/db/)Ingestion Service (
pkg/optimizer/ingestion/)vMCP Integration
Optimizer Endpoints (
pkg/vmcp/optimizer/)optim.find_tool: Semantic and string-based tool discoveryoptim.call_tool: Tool invocation with automatic routingServer Integration (
pkg/vmcp/server/)Router Updates (
pkg/vmcp/router/)optim_*prefixed toolsKubernetes Operator Support
Service Resolution (
cmd/thv-operator/pkg/vmcpconfig/converter.go)embeddingService→embeddingURLconversionCRD Schema (
deploy/charts/operator-crds/)Configuration
OptimizerConfig (
pkg/vmcp/config/config.go)enabled: Enable/disable optimizerembeddingBackend: Choose embedding providerembeddingURL: Embedding service URLembeddingModel: Model name for embeddingsembeddingDimension: Vector dimensionpersistPath: Optional persistence pathftsDBPath: FTS5 database pathhybridSearchRatio: Semantic vs BM25 mix (0-100%)embeddingService: Kubernetes service name (K8s only)CLI Integration (
cmd/vmcp/app/commands.go)Build System
Build Tags (
Taskfile.yml)-tags="fts5"build flag for SQLite FTS5 supportTest Task (
Taskfile.yml)test-optimizertask for optimizer integration testsExamples & Scripts
Example Configuration (
examples/vmcp-config-optimizer.yaml)Helper Scripts (
scripts/)test-optimizer-with-sqlite-vec.sh: Integration testinginspect-optimizer-db.sh: Database inspectionquery-optimizer-db.sh: Query testingDocumentation
pkg/optimizer/README.mdpkg/optimizer/INTEGRATION.mdTesting
Dependencies
chromem-go: Vector database for embeddingssqlite-vec: SQLite extension for vector similarity searchgo.uber.org/mock: Mock generation for testsBuild Requirements
-tags="fts5"build flag for FTS5 supportRelated
Large PR Justification