Add MCP optimizer implementation for semantic tool discovery#3440

Open
therealnb wants to merge 72 commits into main from optimizer-implementation
Conversation

@therealnb therealnb commented Jan 26, 2026

Add MCP Optimizer Implementation for Semantic Tool Discovery

This PR adds the complete MCP optimizer implementation to vMCP, enabling semantic tool discovery and reducing token usage for LLMs working with large toolsets.

Overview

The optimizer allows vMCP to expose optim.find_tool and optim.call_tool operations instead of all backend tools directly. This reduces token usage by allowing LLMs to discover relevant tools on demand via semantic search rather than receiving all tool definitions upfront.

Features

Core Optimizer Package

Semantic Tool Search (pkg/optimizer/)

  • Vector embeddings (384-dim) for semantic similarity search
  • Full-text search via SQLite FTS5 for BM25 text matching
  • Hybrid search combining semantic and BM25 results (configurable ratio)
  • Multiple embedding backends:
    • Ollama (local HTTP API)
    • OpenAI-compatible (vLLM, OpenAI, etc.)
    • Placeholder (deterministic hash-based, for testing)
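
For reviewers unfamiliar with the placeholder backend, here is a minimal sketch of what a deterministic hash-based embedder could look like. The function names and the hashing scheme (SHA-256 per token, one bucket per token, L2 normalization) are assumptions for illustration and are not taken from the code in this PR.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math"
	"strings"
)

// dim matches the 384-dimensional vectors mentioned above.
const dim = 384

// placeholderEmbed is a hypothetical deterministic embedder: each
// whitespace-separated token is hashed into one bucket of the vector,
// and the result is L2-normalized.
func placeholderEmbed(text string) []float32 {
	vec := make([]float32, dim)
	for _, tok := range strings.Fields(strings.ToLower(text)) {
		sum := sha256.Sum256([]byte(tok))
		idx := binary.BigEndian.Uint32(sum[:4]) % dim
		vec[idx]++
	}
	var norm float64
	for _, v := range vec {
		norm += float64(v) * float64(v)
	}
	if norm == 0 {
		return vec // empty input: return the zero vector
	}
	scale := float32(1 / math.Sqrt(norm))
	for i := range vec {
		vec[i] *= scale
	}
	return vec
}

// cosine returns the dot product, which equals cosine similarity
// when both inputs are unit vectors.
func cosine(a, b []float32) float32 {
	var dot float32
	for i := range a {
		dot += a[i] * b[i]
	}
	return dot
}

func main() {
	a := placeholderEmbed("list github issues")
	b := placeholderEmbed("list github issues")
	fmt.Println(len(a), cosine(a, b))
}
```

Such an embedder carries no semantic meaning; it only guarantees that the same text always maps to the same vector, which is what makes it usable in tests.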

Token Counting (pkg/optimizer/tokens/)

  • LLM cost estimation based on token counts
  • Supports monitoring token usage and optimization effectiveness
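
As a rough illustration of the kind of estimate this enables, the sketch below compares the token cost of sending every tool definition upfront with sending only the optimizer's own definitions. It uses the common "about four characters per token" heuristic, which is an assumption and not the counting method in pkg/optimizer/tokens/; all names and inputs are hypothetical.

```go
package main

import "fmt"

// estimateTokens uses the rough "about four characters per token" heuristic.
// Real tokenizers differ by model, so treat the result as an approximation.
func estimateTokens(text string) int {
	return (len(text) + 3) / 4
}

// compare returns the estimated token cost of sending every tool definition
// upfront (baseline) and of sending only the optimizer's own definitions.
func compare(allToolDefs, optimizerDefs []string) (baseline, optimized int) {
	for _, d := range allToolDefs {
		baseline += estimateTokens(d)
	}
	for _, d := range optimizerDefs {
		optimized += estimateTokens(d)
	}
	return baseline, optimized
}

func main() {
	// Hypothetical tool definitions, shortened for the example.
	all := []string{
		`{"name":"create_issue","description":"Create a GitHub issue"}`,
		`{"name":"list_pods","description":"List Kubernetes pods"}`,
	}
	opt := []string{`{"name":"find_tool","description":"Search for tools"}`}
	baseline, optimized := compare(all, opt)
	fmt.Printf("baseline=%d optimized=%d\n", baseline, optimized)
}
```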

Database Layer (pkg/optimizer/db/)

  • SQLite-based storage with sqlite-vec for vector similarity search
  • FTS5 for full-text search
  • Hybrid search implementation combining both approaches
  • Persistent and in-memory storage options
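
One way to read the "configurable ratio" is as a split of result slots between the two rankers. The sketch below assumes exactly that: a percentage of the result limit goes to semantic hits and the rest to BM25 hits, with duplicates removed. The PR description does not say how the ratio is actually applied, so hybridMerge and its behavior are hypothetical.

```go
package main

import "fmt"

// hybridMerge fills up to limit slots from two ranked lists of tool names.
// ratio percent of the slots go to semantic hits and the rest to BM25 hits.
// A name appearing in both lists is kept once; leftover slots are
// backfilled from the semantic list if BM25 runs short.
func hybridMerge(semantic, bm25 []string, ratio, limit int) []string {
	if limit <= 0 {
		return nil
	}
	if ratio < 0 {
		ratio = 0
	}
	if ratio > 100 {
		ratio = 100
	}
	seen := make(map[string]bool)
	out := make([]string, 0, limit)
	// take appends at most n unseen names from src.
	take := func(src []string, n int) {
		for _, name := range src {
			if n <= 0 || len(out) >= limit {
				return
			}
			if seen[name] {
				continue
			}
			seen[name] = true
			out = append(out, name)
			n--
		}
	}
	take(semantic, limit*ratio/100)
	take(bm25, limit-len(out))
	take(semantic, limit-len(out))
	return out
}

func main() {
	semantic := []string{"create_issue", "list_issues", "close_issue"}
	bm25 := []string{"list_issues", "search_code", "get_repo"}
	fmt.Println(hybridMerge(semantic, bm25, 50, 4))
}
```

A weighted score fusion would be an equally plausible reading; slot splitting is used here only because it needs no score normalization between the two rankers.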

Ingestion Service (pkg/optimizer/ingestion/)

  • Ingests tools from all backends in the group
  • Generates embeddings for tool metadata
  • Maintains searchable index of all available tools

vMCP Integration

Optimizer Endpoints (pkg/vmcp/optimizer/)

  • optim.find_tool: Semantic and string-based tool discovery
  • optim.call_tool: Tool invocation with automatic routing
  • Integration with vMCP server lifecycle
  • Comprehensive test coverage (unit, integration, semantic search, string matching)

Server Integration (pkg/vmcp/server/)

  • Optimizer initialization and lifecycle management
  • Session-based optimizer instances
  • Integration with discovery manager and backend registry

Router Updates (pkg/vmcp/router/)

  • Special handling for optim_* prefixed tools
  • Prevents routing optimizer tools to backends (handled by vMCP itself)
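
The routing rule can be sketched as a prefix check performed before the backend lookup. The prefix value is copied from the bullet above; the function name, the backend map, and the "vmcp" marker for locally handled tools are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// optimizerPrefix is taken from the bullet above; the real constant may differ.
const optimizerPrefix = "optim_"

// handledLocally is a hypothetical marker meaning "vMCP serves this itself".
const handledLocally = "vmcp"

// resolveTarget returns where a tool call should go: optimizer tools stay
// in vMCP, everything else is looked up in the backend routing table.
func resolveTarget(tool string, backends map[string]string) (string, error) {
	if strings.HasPrefix(tool, optimizerPrefix) {
		return handledLocally, nil
	}
	if backend, ok := backends[tool]; ok {
		return backend, nil
	}
	return "", fmt.Errorf("no backend registered for tool %q", tool)
}

func main() {
	backends := map[string]string{"create_issue": "github"}
	for _, tool := range []string{"optim_find_tool", "create_issue"} {
		target, err := resolveTarget(tool, backends)
		fmt.Println(tool, target, err)
	}
}
```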

Kubernetes Operator Support

Service Resolution (cmd/thv-operator/pkg/vmcpconfig/converter.go)

  • Resolves Kubernetes Service names to URLs for embedding services
  • Handles embeddingService → embeddingURL conversion
  • Supports in-cluster deployments
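
A minimal sketch of the name-to-URL step, assuming the standard in-cluster DNS form `<service>.<namespace>.svc.cluster.local`. The function name, the http scheme, the explicit port argument, and the default cluster domain are assumptions; the converter in this PR may resolve these differently (for example by reading the Service object).

```go
package main

import (
	"errors"
	"fmt"
)

// serviceURL builds an in-cluster URL for a Kubernetes Service using the
// conventional DNS name <service>.<namespace>.svc.cluster.local.
func serviceURL(service, namespace string, port int) (string, error) {
	if service == "" || namespace == "" {
		return "", errors.New("service and namespace must be set")
	}
	if port <= 0 || port > 65535 {
		return "", fmt.Errorf("invalid port %d", port)
	}
	return fmt.Sprintf("http://%s.%s.svc.cluster.local:%d", service, namespace, port), nil
}

func main() {
	// Hypothetical service name, namespace, and port.
	url, err := serviceURL("embeddings", "toolhive", 8080)
	fmt.Println(url, err)
}
```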

CRD Schema (deploy/charts/operator-crds/)

  • Complete optimizer configuration schema
  • Supports all optimizer features (embeddings, persistence, hybrid search)
  • Documentation updates

Configuration

OptimizerConfig (pkg/vmcp/config/config.go)

  • Comprehensive configuration options:
    • enabled: Enable/disable optimizer
    • embeddingBackend: Choose embedding provider
    • embeddingURL: Embedding service URL
    • embeddingModel: Model name for embeddings
    • embeddingDimension: Vector dimension
    • persistPath: Optional persistence path
    • ftsDBPath: FTS5 database path
    • hybridSearchRatio: Semantic vs BM25 mix (0-100%)
    • embeddingService: Kubernetes service name (K8s only)
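
To make the option list concrete, here is a sketch of how these fields might map onto a Go struct with a basic validation step. The field names follow the list above; the Go types, the yaml tags, and the validation rules other than the 0-100 range are assumptions and may not match pkg/vmcp/config/config.go.

```go
package main

import (
	"errors"
	"fmt"
)

// OptimizerConfig mirrors the options listed above. Types are assumed.
type OptimizerConfig struct {
	Enabled            bool   `yaml:"enabled"`
	EmbeddingBackend   string `yaml:"embeddingBackend"`
	EmbeddingURL       string `yaml:"embeddingURL"`
	EmbeddingModel     string `yaml:"embeddingModel"`
	EmbeddingDimension int    `yaml:"embeddingDimension"`
	PersistPath        string `yaml:"persistPath"`
	FTSDBPath          string `yaml:"ftsDBPath"`
	HybridSearchRatio  int    `yaml:"hybridSearchRatio"`
	EmbeddingService   string `yaml:"embeddingService"`
}

// Validate applies hypothetical sanity checks. Only the 0-100 range for
// hybridSearchRatio comes from the description; the rest is illustrative.
func (c OptimizerConfig) Validate() error {
	if !c.Enabled {
		return nil // nothing to check when the optimizer is off
	}
	if c.HybridSearchRatio < 0 || c.HybridSearchRatio > 100 {
		return fmt.Errorf("hybridSearchRatio must be 0-100, got %d", c.HybridSearchRatio)
	}
	if c.EmbeddingDimension <= 0 {
		return errors.New("embeddingDimension must be positive")
	}
	return nil
}

func main() {
	cfg := OptimizerConfig{
		Enabled:            true,
		EmbeddingBackend:   "ollama",
		EmbeddingDimension: 384,
		HybridSearchRatio:  70,
	}
	fmt.Println(cfg.Validate())
}
```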

CLI Integration (cmd/vmcp/app/commands.go)

  • Optimizer configuration parsing from YAML
  • Runtime configuration and initialization
  • Logging and status reporting

Build System

Build Tags (Taskfile.yml)

  • Added -tags="fts5" build flag for SQLite FTS5 support
  • Required for optimizer functionality
  • Applied to all vmcp builds (build, install)

Test Task (Taskfile.yml)

  • Added test-optimizer task for optimizer integration tests
  • Uses sqlite-vec for vector search testing

Examples & Scripts

Example Configuration (examples/vmcp-config-optimizer.yaml)

  • Complete example showing optimizer configuration
  • Demonstrates all configuration options

Helper Scripts (scripts/)

  • test-optimizer-with-sqlite-vec.sh: Integration testing
  • inspect-optimizer-db.sh: Database inspection
  • query-optimizer-db.sh: Query testing
  • Various chromem inspection tools

Documentation

  • Optimizer package documentation in pkg/optimizer/README.md
  • Integration guide in pkg/optimizer/INTEGRATION.md
  • CRD API documentation updates
  • Example configurations

Testing

  • Comprehensive unit tests for all optimizer components
  • Integration tests for optimizer endpoints
  • Semantic search test suite
  • String matching test suite
  • E2E tests for Kubernetes deployments

Dependencies

  • chromem-go: Vector database for embeddings
  • sqlite-vec: SQLite extension for vector similarity search
  • go.uber.org/mock: Mock generation for tests

Build Requirements

  • Requires -tags="fts5" build flag for FTS5 support
  • SQLite with FTS5 extension
  • sqlite-vec extension for vector search

Related

Large PR Justification

  • This is the second part of a two-part PR.

@therealnb therealnb requested a review from jerm-dro January 26, 2026 12:05
@therealnb therealnb force-pushed the optimizer-implementation branch from fac6152 to d809cfa Compare January 26, 2026 12:13
@therealnb therealnb changed the base branch from main to optimizer-enablers January 26, 2026 12:13
@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 26, 2026
@github-actions github-actions bot left a comment

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@therealnb therealnb force-pushed the optimizer-implementation branch from 8d707ff to 5c0713a Compare January 26, 2026 12:17
@therealnb therealnb force-pushed the optimizer-implementation branch from 5c0713a to 16bbbfc Compare January 26, 2026 12:25
@github-actions github-actions bot dismissed their stale review January 26, 2026 15:27

Large PR justification has been provided. Thank you!

@github-actions

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

codecov bot commented Jan 26, 2026

Codecov Report

❌ Patch coverage is 45.36862% with 867 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.54%. Comparing base (e00d514) to head (7fbba1a).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/vmcp/optimizer/optimizer.go | 10.07% | 352 Missing and 5 partials ⚠️ |
| pkg/vmcp/optimizer/internal/ingestion/service.go | 0.00% | 157 Missing ⚠️ |
| pkg/vmcp/optimizer/internal/db/fts.go | 69.30% | 48 Missing and 14 partials ⚠️ |
| pkg/vmcp/optimizer/internal/embeddings/ollama.go | 26.22% | 42 Missing and 3 partials ⚠️ |
| pkg/vmcp/optimizer/internal/embeddings/manager.go | 47.56% | 32 Missing and 11 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/backend_tool.go | 63.30% | 20 Missing and 20 partials ⚠️ |
| pkg/vmcp/server/server.go | 10.00% | 23 Missing and 4 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/hybrid.go | 64.86% | 17 Missing and 9 partials ⚠️ |
| pkg/vmcp/optimizer/internal/db/db.go | 75.58% | 17 Missing and 4 partials ⚠️ |
| cmd/vmcp/app/commands.go | 0.00% | 18 Missing ⚠️ |

... and 10 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3440      +/-   ##
==========================================
- Coverage   65.15%   64.54%   -0.62%     
==========================================
  Files         398      413      +15     
  Lines       38821    40431    +1610     
==========================================
+ Hits        25295    26097     +802     
- Misses      11564    12294     +730     
- Partials     1962     2040      +78     

@therealnb therealnb requested a review from rdimitrov as a code owner January 28, 2026 13:58
Add nil receiver checks to IngestInitialBackends, OnRegisterSession,
and Close methods to prevent panics when called on nil *EmbeddingOptimizer.

The tests explicitly test nil integration handling, so these methods
must safely handle nil receivers.

Fixes:
- TestClose_NilIntegration panic
- TestIngestInitialBackends_NilIntegration panic
- TestOnRegisterSession_NilIntegration panic
- All related optimizer unit test failures
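
For context, a method with a pointer receiver can be called on a nil pointer in Go, so the guard has to live inside the method. The sketch below shows the pattern on a stand-in type; the struct fields and the choice to return nil for a nil receiver are assumptions, not the actual EmbeddingOptimizer code.

```go
package main

import "fmt"

// EmbeddingOptimizer here is a stand-in type for illustration only.
type EmbeddingOptimizer struct {
	closed bool
}

// Close is safe to call on a nil *EmbeddingOptimizer: the nil check runs
// before any field access, so no nil pointer dereference can occur.
func (o *EmbeddingOptimizer) Close() error {
	if o == nil {
		return nil // optimizer disabled or never initialized
	}
	o.closed = true
	return nil
}

func main() {
	var disabled *EmbeddingOptimizer // nil, e.g. when the optimizer is off
	fmt.Println(disabled.Close())

	enabled := &EmbeddingOptimizer{}
	fmt.Println(enabled.Close(), enabled.closed)
}
```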
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
- Fix line length violations (lll) by wrapping long lines
- Remove unused processedSessions field from EmbeddingOptimizer
- Remove unused sync import
- Change unused receivers to _ in convertSearchResults and resolveToolTarget
- Rename unused ctx parameter to _ in NewEmbeddingOptimizer
- Remove unused deserializeServerMetadata, update, and delete functions
- Simplify createTestDatabase to return only Database (not unused embeddingFunc)
- Add nolint directive for OptimizerIntegration type alias (kept for test compatibility)

Fixes all golangci-lint errors:
- lll: 2 line length violations
- revive: 4 unused parameter/receiver issues
- unparam: 1 unused return value
- unused: 4 unused functions/fields
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
#3471

Signed-off-by: nigel brown <nigel@stacklok.com>
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026
@therealnb
Contributor Author

Retested. We definitely need #3471 to stop a race condition.

The serviceVersion field in telemetry config is documented as optional
with a default of the ToolHive version, but the code was passing empty
strings to WithServiceVersion() which requires a non-empty value.

This fix applies the default value (from versions.GetVersionInfo().Version)
when serviceVersion is omitted, making it truly optional as documented.

Fixes error: "service version cannot be empty" when telemetry is enabled
without an explicit serviceVersion in the config.

Bug was introduced in commit 64eb12e (PR #3207).
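
The defaulting described in this commit message can be sketched as follows. The helper name and the defaultVersion parameter are hypothetical; in the actual fix the default is said to come from versions.GetVersionInfo().Version, which is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// effectiveServiceVersion returns the configured value when present and
// falls back to defaultVersion when the field is omitted or blank, so the
// caller never passes an empty string to the telemetry option.
func effectiveServiceVersion(configured, defaultVersion string) string {
	if strings.TrimSpace(configured) == "" {
		return defaultVersion
	}
	return configured
}

func main() {
	// "v0.0.0-dev" is a made-up default for the example.
	fmt.Println(effectiveServiceVersion("", "v0.0.0-dev"))
	fmt.Println(effectiveServiceVersion("1.2.3", "v0.0.0-dev"))
}
```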
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 29, 2026
@therealnb
Contributor Author

I've split PR #3440 into three smaller PRs:

Created PRs

| PR | Branch | Title | URL |
|---|---|---|---|
| PR 1 | optimizer-pr1-packages | Add optimizer packages and infrastructure (non-functional) | #3516 |
| PR 2 | optimizer-pr2-integration | Wire optimizer into server startup and CLI | #3517 |
| PR 3 | optimizer-pr3-docs | Add optimizer documentation and README updates | #3518 |

What's in Each PR

PR 1 (Non-functional): ~10,400 lines

  • All new optimizer internal packages (db/, embeddings/, ingestion/, models/, tokens/)
  • EmbeddingOptimizer implementation and updated Optimizer interface
  • Updated dummy_optimizer.go for backward compatibility
  • Dependencies (go.mod/go.sum)
  • Config schema and CRD updates
  • OptimizerHandlerProvider interface in adapter

PR 2 (Integration): ~630 lines added, ~770 removed

  • Server wiring (server.go, capability_adapter.go)
  • CLI integration (commands.go)
  • Operator converter changes
  • Taskfile.yml build flags
  • Deletes dummy_optimizer.go
  • Test updates for integration

PR 3 (Documentation): ~385 lines

  • README updates (cmd/vmcp/README.md)
  • New docs (pkg/vmcp/optimizer/README.md, REFACTORING.md)
  • CRD API docs and Helm chart README updates

Merge Order

  1. Merge PR 1 first (standalone, no dependencies)
  2. Merge PR 2 after PR 1 (depends on PR 1)
  3. Merge PR 3 after PR 2 (depends on PR 1 & 2)

You will also need #3471 to handle a race condition if this has not been addressed elsewhere.

Note: There's a known merge conflict when combining PR1 and PR2 - PR1 modifies dummy_optimizer.go while PR2 deletes it. The resolution is to accept the deletion.

Labels

size/XL Extra large PR: 1000+ lines changed

5 participants