Add MCP optimizer implementation for semantic tool discovery by therealnb · Pull Request #3440 · stacklok/toolhive

therealnb · 2026-01-26T12:03:48Z

Add MCP Optimizer Implementation for Semantic Tool Discovery

This PR adds the complete MCP optimizer implementation to vMCP, enabling semantic tool discovery and reducing token usage for LLMs working with large toolsets.

Overview

The optimizer allows vMCP to expose optim.find_tool and optim.call_tool operations instead of all backend tools directly. This reduces token usage by allowing LLMs to discover relevant tools on demand via semantic search rather than receiving all tool definitions upfront.

Features

Core Optimizer Package

Semantic Tool Search (pkg/optimizer/)

Vector embeddings (384-dim) for semantic similarity search
Full-text search via SQLite FTS5 for BM25 text matching
Hybrid search combining semantic and BM25 results (configurable ratio)
Multiple embedding backends:
- Ollama (local HTTP API)
- OpenAI-compatible (vLLM, OpenAI, etc.)
- Placeholder (deterministic hash-based, for testing)
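
As an illustration of what a deterministic hash-based placeholder backend could look like, here is a minimal sketch. The function name `placeholderEmbed`, the use of SHA-256, and the L2 normalisation are all assumptions made for this example; the PR description only states that the placeholder is deterministic, hash-based, and meant for testing.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math"
)

// placeholderEmbed is a hypothetical helper (not taken from the PR): it derives
// a deterministic, L2-normalised vector of length dim from the input text.
func placeholderEmbed(text string, dim int) []float32 {
	vec := make([]float32, dim)
	var sumSq float64
	for i := 0; i < dim; i++ {
		// Hash the component index together with the text so that every
		// component gets its own pseudo-random but reproducible value.
		var idx [4]byte
		binary.BigEndian.PutUint32(idx[:], uint32(i))
		h := sha256.Sum256(append(idx[:], text...))
		// Map the first four hash bytes to a value in [-1, 1].
		u := binary.BigEndian.Uint32(h[:4])
		v := float64(u)/float64(math.MaxUint32)*2 - 1
		vec[i] = float32(v)
		sumSq += v * v
	}
	norm := math.Sqrt(sumSq)
	if norm == 0 {
		return vec
	}
	for i := range vec {
		vec[i] = float32(float64(vec[i]) / norm)
	}
	return vec
}

func main() {
	a := placeholderEmbed("list pull requests", 384)
	b := placeholderEmbed("list pull requests", 384)
	fmt.Println(len(a), a[0] == b[0])
}
```

Vectors produced this way carry no semantic meaning; they are only useful for exercising the storage and search plumbing without a running embedding service.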

Token Counting (pkg/optimizer/tokens/)

LLM cost estimation based on token counts
Supports monitoring token usage and optimization effectiveness

Database Layer (pkg/optimizer/db/)

SQLite-based storage with sqlite-vec for vector similarity search
FTS5 for full-text search
Hybrid search implementation combining both approaches
Persistent and in-memory storage options
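
One plausible way to combine the two result sets is a weighted sum of per-tool scores, sketched below. This is an assumption, not the PR's algorithm: the type `scored`, the function `hybridMerge`, and the premise that both searches return scores already normalised to [0, 1] are invented for illustration. The `ratio` parameter mirrors the 0-100% semantic-versus-BM25 mix described for `hybridSearchRatio`.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a hypothetical result type: a tool name and a relevance score
// that is assumed to be normalised to [0, 1].
type scored struct {
	Tool  string
	Score float64
}

// hybridMerge blends semantic and BM25 results. ratio is the semantic share in
// percent (0 = BM25 only, 100 = semantic only). At most limit results are returned.
func hybridMerge(semantic, bm25 []scored, ratio, limit int) []scored {
	w := float64(ratio) / 100
	combined := map[string]float64{}
	for _, r := range semantic {
		combined[r.Tool] += w * r.Score
	}
	for _, r := range bm25 {
		combined[r.Tool] += (1 - w) * r.Score
	}
	out := make([]scored, 0, len(combined))
	for tool, s := range combined {
		out = append(out, scored{Tool: tool, Score: s})
	}
	// Highest combined score first; ties are broken by name for stable output.
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].Tool < out[j].Tool
	})
	if len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	semantic := []scored{{"github_search", 0.9}, {"jira_search", 0.2}}
	bm25 := []scored{{"jira_search", 1.0}, {"slack_post", 0.5}}
	for _, r := range hybridMerge(semantic, bm25, 70, 2) {
		fmt.Println(r.Tool)
	}
}
```

A tool that appears in both lists accumulates both weighted scores, so agreement between the two searches pushes it up the ranking.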

Ingestion Service (pkg/optimizer/ingestion/)

Ingests tools from all backends in the group
Generates embeddings for tool metadata
Maintains searchable index of all available tools

vMCP Integration

Optimizer Endpoints (pkg/vmcp/optimizer/)

optim.find_tool: Semantic and string-based tool discovery
optim.call_tool: Tool invocation with automatic routing
Integration with vMCP server lifecycle
Comprehensive test coverage (unit, integration, semantic search, string matching)

Server Integration (pkg/vmcp/server/)

Optimizer initialization and lifecycle management
Session-based optimizer instances
Integration with discovery manager and backend registry

Router Updates (pkg/vmcp/router/)

Special handling for optim_* prefixed tools
Prevents routing optimizer tools to backends (handled by vMCP itself)
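
The routing rule above amounts to a prefix check before backend lookup. The sketch below is illustrative only: `optimizerPrefix`, `isOptimizerTool`, and `routeTarget` are hypothetical names, and the prefix string follows the `optim_*` wording used in this section, so the exact separator is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// optimizerPrefix is an assumption based on the "optim_*" wording in the PR text.
const optimizerPrefix = "optim_"

// isOptimizerTool reports whether a tool name belongs to the optimizer itself.
func isOptimizerTool(name string) bool {
	return strings.HasPrefix(name, optimizerPrefix)
}

// routeTarget is a hypothetical router decision: optimizer tools are handled
// by vMCP, everything else is forwarded to a backend.
func routeTarget(name string) string {
	if isOptimizerTool(name) {
		return "vmcp"
	}
	return "backend"
}

func main() {
	fmt.Println(routeTarget("optim_find_tool"))
	fmt.Println(routeTarget("github_list_issues"))
}
```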

Kubernetes Operator Support

Service Resolution (cmd/thv-operator/pkg/vmcpconfig/converter.go)

Resolves Kubernetes Service names to URLs for embedding services
Handles embeddingService → embeddingURL conversion
Supports in-cluster deployments
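
For in-cluster deployments, a Service name can be turned into a URL using the standard Kubernetes DNS form `<service>.<namespace>.svc.cluster.local`. The sketch below shows that idea only; the function name, the `http` scheme, the explicit port argument, and the default `cluster.local` domain are assumptions, and the converter in this PR may resolve the URL differently.

```go
package main

import "fmt"

// embeddingURLFromService is a hypothetical helper that builds an in-cluster
// URL from a Service name, namespace, and port. It assumes plain HTTP and the
// default "cluster.local" cluster domain.
func embeddingURLFromService(service, namespace string, port int) string {
	return fmt.Sprintf("http://%s.%s.svc.cluster.local:%d", service, namespace, port)
}

func main() {
	// Example values are illustrative, not taken from the PR.
	fmt.Println(embeddingURLFromService("ollama", "toolhive-system", 11434))
}
```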

CRD Schema (deploy/charts/operator-crds/)

Complete optimizer configuration schema
Supports all optimizer features (embeddings, persistence, hybrid search)
Documentation updates

Configuration

OptimizerConfig (pkg/vmcp/config/config.go)

Comprehensive configuration options:
- enabled: Enable/disable optimizer
- embeddingBackend: Choose embedding provider
- embeddingURL: Embedding service URL
- embeddingModel: Model name for embeddings
- embeddingDimension: Vector dimension
- persistPath: Optional persistence path
- ftsDBPath: FTS5 database path
- hybridSearchRatio: Semantic vs BM25 mix (0-100%)
- embeddingService: Kubernetes service name (K8s only)
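
To make the option list concrete, the following sketch maps it onto a Go struct with a small validation step. The field names follow the options listed above; the Go types, the struct name `optimizerConfigSketch`, the validation rules, and the example values are assumptions and may not match `pkg/vmcp/config/config.go`.

```go
package main

import (
	"errors"
	"fmt"
)

// optimizerConfigSketch is an illustrative stand-in for the real OptimizerConfig.
// Field types are assumed; only the YAML key names come from the PR description.
type optimizerConfigSketch struct {
	Enabled            bool   `yaml:"enabled"`
	EmbeddingBackend   string `yaml:"embeddingBackend"`
	EmbeddingURL       string `yaml:"embeddingURL"`
	EmbeddingModel     string `yaml:"embeddingModel"`
	EmbeddingDimension int    `yaml:"embeddingDimension"`
	PersistPath        string `yaml:"persistPath"`
	FTSDBPath          string `yaml:"ftsDBPath"`
	HybridSearchRatio  int    `yaml:"hybridSearchRatio"`
	EmbeddingService   string `yaml:"embeddingService"`
}

// validate applies assumed sanity checks; a disabled optimizer is always valid.
func (c optimizerConfigSketch) validate() error {
	if !c.Enabled {
		return nil
	}
	if c.EmbeddingDimension <= 0 {
		return errors.New("embeddingDimension must be positive")
	}
	if c.HybridSearchRatio < 0 || c.HybridSearchRatio > 100 {
		return fmt.Errorf("hybridSearchRatio must be between 0 and 100, got %d", c.HybridSearchRatio)
	}
	return nil
}

func main() {
	cfg := optimizerConfigSketch{
		Enabled:            true,
		EmbeddingBackend:   "ollama", // assumed spelling of the backend value
		EmbeddingDimension: 384,
		HybridSearchRatio:  70,
	}
	fmt.Println(cfg.validate())
}
```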

CLI Integration (cmd/vmcp/app/commands.go)

Optimizer configuration parsing from YAML
Runtime configuration and initialization
Logging and status reporting

Build System

Build Tags (Taskfile.yml)

Added -tags="fts5" build flag for SQLite FTS5 support
Required for optimizer functionality
Applied to all vmcp builds (build, install)

Test Task (Taskfile.yml)

Added test-optimizer task for optimizer integration tests
Uses sqlite-vec for vector search testing

Examples & Scripts

Example Configuration (examples/vmcp-config-optimizer.yaml)

Complete example showing optimizer configuration
Demonstrates all configuration options

Helper Scripts (scripts/)

test-optimizer-with-sqlite-vec.sh: Integration testing
inspect-optimizer-db.sh: Database inspection
query-optimizer-db.sh: Query testing
Various chromem inspection tools

Documentation

Optimizer package documentation in pkg/optimizer/README.md
Integration guide in pkg/optimizer/INTEGRATION.md
CRD API documentation updates
Example configurations

Testing

Comprehensive unit tests for all optimizer components
Integration tests for optimizer endpoints
Semantic search test suite
String matching test suite
E2E tests for Kubernetes deployments

Dependencies

chromem-go: Vector database for embeddings
sqlite-vec: SQLite extension for vector similarity search
go.uber.org/mock: Mock generation for tests

Build Requirements

Requires -tags="fts5" build flag for FTS5 support
SQLite with FTS5 extension
sqlite-vec extension for vector search

Large PR Justification

This is the second part of a two-part PR.

github-actions

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.

This review will be automatically dismissed once you add the justification section.

Large PR justification has been provided. Thank you!

github-actions · 2026-01-26T15:27:34Z

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

codecov · 2026-01-26T15:35:46Z

Codecov Report

❌ Patch coverage is 45.36862% with 867 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.54%. Comparing base (e00d514) to head (7fbba1a).
⚠️ Report is 6 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/vmcp/optimizer/optimizer.go	10.07%	352 Missing and 5 partials ⚠️
pkg/vmcp/optimizer/internal/ingestion/service.go	0.00%	157 Missing ⚠️
pkg/vmcp/optimizer/internal/db/fts.go	69.30%	48 Missing and 14 partials ⚠️
pkg/vmcp/optimizer/internal/embeddings/ollama.go	26.22%	42 Missing and 3 partials ⚠️
pkg/vmcp/optimizer/internal/embeddings/manager.go	47.56%	32 Missing and 11 partials ⚠️
pkg/vmcp/optimizer/internal/db/backend_tool.go	63.30%	20 Missing and 20 partials ⚠️
pkg/vmcp/server/server.go	10.00%	23 Missing and 4 partials ⚠️
pkg/vmcp/optimizer/internal/db/hybrid.go	64.86%	17 Missing and 9 partials ⚠️
pkg/vmcp/optimizer/internal/db/db.go	75.58%	17 Missing and 4 partials ⚠️
cmd/vmcp/app/commands.go	0.00%	18 Missing ⚠️
... and 10 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3440      +/-   ##
==========================================
- Coverage   65.15%   64.54%   -0.62%     
==========================================
  Files         398      413      +15     
  Lines       38821    40431    +1610     
==========================================
+ Hits        25295    26097     +802     
- Misses      11564    12294     +730     
- Partials     1962     2040      +78


Add nil receiver checks to IngestInitialBackends, OnRegisterSession, and Close methods to prevent panics when called on nil *EmbeddingOptimizer. The tests explicitly test nil integration handling, so these methods must safely handle nil receivers.

Fixes:
- TestClose_NilIntegration panic
- TestIngestInitialBackends_NilIntegration panic
- TestOnRegisterSession_NilIntegration panic
- All related optimizer unit test failures
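
The nil-receiver guard described in this commit message can be sketched as follows. In Go, a method with a pointer receiver may be called on a nil pointer, and it only panics if the body dereferences the receiver. The type `embeddingOptimizerSketch` and its field are hypothetical stand-ins; the real EmbeddingOptimizer and its method signatures are not shown in this PR page.

```go
package main

import "fmt"

// embeddingOptimizerSketch is a hypothetical stand-in for *EmbeddingOptimizer.
type embeddingOptimizerSketch struct {
	closed bool
}

// Close is safe to call on a nil receiver: it returns early instead of
// dereferencing o, which would otherwise panic.
func (o *embeddingOptimizerSketch) Close() error {
	if o == nil {
		return nil
	}
	o.closed = true
	return nil
}

func main() {
	var o *embeddingOptimizerSketch // nil, e.g. when the optimizer is disabled
	fmt.Println(o.Close())
}
```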

- Fix line length violations (lll) by wrapping long lines
- Remove unused processedSessions field from EmbeddingOptimizer
- Remove unused sync import
- Change unused receivers to _ in convertSearchResults and resolveToolTarget
- Rename unused ctx parameter to _ in NewEmbeddingOptimizer
- Remove unused deserializeServerMetadata, update, and delete functions
- Simplify createTestDatabase to return only Database (not unused embeddingFunc)
- Add nolint directive for OptimizerIntegration type alias (kept for test compatibility)

Fixes all golangci-lint errors:
- lll: 2 line length violations
- revive: 4 unused parameter/receiver issues
- unparam: 1 unused return value
- unused: 4 unused functions/fields

#3471 Signed-off-by: nigel brown <nigel@stacklok.com>

therealnb · 2026-01-28T17:37:30Z

Retested. We definitely need #3471 to stop a race condition.

The serviceVersion field in the telemetry config is documented as optional, defaulting to the ToolHive version, but the code was passing empty strings to WithServiceVersion(), which requires a non-empty value.

This fix applies the default value (from versions.GetVersionInfo().Version) when serviceVersion is omitted, making it truly optional as documented.

Fixes the error "service version cannot be empty" when telemetry is enabled without an explicit serviceVersion in the config. The bug was introduced in commit 64eb12e (PR #3207).
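
The defaulting behaviour described in this commit message can be sketched as a small helper. `resolveServiceVersion` and its parameters are hypothetical; in the real code the fallback value is said to come from versions.GetVersionInfo().Version, which is replaced here by a plain string argument so the example stays self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveServiceVersion is a hypothetical helper: it returns the configured
// service version, or the build version when none was configured.
func resolveServiceVersion(configured, buildVersion string) string {
	if strings.TrimSpace(configured) == "" {
		return buildVersion
	}
	return configured
}

func main() {
	// "v0.0.0-example" is a placeholder build version, not a real release.
	fmt.Println(resolveServiceVersion("", "v0.0.0-example"))
	fmt.Println(resolveServiceVersion("1.2.3", "v0.0.0-example"))
}
```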

therealnb · 2026-01-30T11:43:29Z

I've split PR #3440 into three smaller PRs:

Created PRs

PR	Branch	Title	URL
PR 1	`optimizer-pr1-packages`	Add optimizer packages and infrastructure (non-functional)	#3516
PR 2	`optimizer-pr2-integration`	Wire optimizer into server startup and CLI	#3517
PR 3	`optimizer-pr3-docs`	Add optimizer documentation and README updates	#3518

What's in Each PR

PR 1 (Non-functional): ~10,400 lines

All new optimizer internal packages (db/, embeddings/, ingestion/, models/, tokens/)
EmbeddingOptimizer implementation and updated Optimizer interface
Updated dummy_optimizer.go for backward compatibility
Dependencies (go.mod/go.sum)
Config schema and CRD updates
OptimizerHandlerProvider interface in adapter

PR 2 (Integration): ~630 lines added, ~770 removed

Server wiring (server.go, capability_adapter.go)
CLI integration (commands.go)
Operator converter changes
Taskfile.yml build flags
Deletes dummy_optimizer.go
Test updates for integration

PR 3 (Documentation): ~385 lines

README updates (cmd/vmcp/README.md)
New docs (pkg/vmcp/optimizer/README.md, REFACTORING.md)
CRD API docs and Helm chart README updates

Merge Order

Merge PR 1 first (standalone, no dependencies)
Merge PR 2 after PR 1 (depends on PR 1)
Merge PR 3 after PR 2 (depends on PR 1 & 2)

You will also need #3471 to handle a race condition if this has not been addressed elsewhere.

Note: There's a known merge conflict when combining PR 1 and PR 2: PR 1 modifies dummy_optimizer.go while PR 2 deletes it. The resolution is to accept the deletion.

therealnb mentioned this pull request Jan 26, 2026

Merge jerm/2026-01-13-optimizer-in-vmcp into main #3373

Closed

therealnb requested a review from jerm-dro January 26, 2026 12:05

therealnb force-pushed the optimizer-implementation branch from fac6152 to d809cfa Compare January 26, 2026 12:13

therealnb changed the base branch from main to optimizer-enablers January 26, 2026 12:13

github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 26, 2026

github-actions bot previously requested changes Jan 26, 2026

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 26, 2026

therealnb force-pushed the optimizer-enablers branch from a5aa704 to 9e28406 Compare January 26, 2026 12:16

therealnb force-pushed the optimizer-implementation branch from 8d707ff to 5c0713a Compare January 26, 2026 12:17

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 26, 2026

therealnb force-pushed the optimizer-implementation branch from 5c0713a to 16bbbfc Compare January 26, 2026 12:25

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 26, 2026

therealnb requested a review from rdimitrov as a code owner January 28, 2026 13:58

github-actions bot removed the size/XL Extra large PR: 1000+ lines changed label Jan 28, 2026

Merge branch 'main' into optimizer-implementation

3ea3887

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026

Merge branch 'main' into optimizer-implementation

21f90c6

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026

therealnb mentioned this pull request Jan 28, 2026

Fix race condition in discovery manager causing duplicate aggregations #3471

Open

This is a separate PR

7961f89

#3471 Signed-off-by: nigel brown <nigel@stacklok.com>

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026

Merge branch 'main' into optimizer-implementation

cd16c7b

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 28, 2026

github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jan 29, 2026

This was referenced Jan 29, 2026

Add optimizer packages and infrastructure (non-functional) #3516

Open

Wire optimizer into server startup and CLI #3517

Open

Add optimizer documentation and README updates #3518

Open

therealnb wants to merge 72 commits into main from optimizer-implementation

5 participants
