Merge jerm/2026-01-13-optimizer-in-vmcp into main #3373

therealnb · 2026-01-21T15:36:34Z

Merge jerm/2026-01-13-optimizer-in-vmcp into main

This PR merges 17 commits that integrate the MCP optimizer into vMCP, adding semantic tool discovery, observability, Kubernetes support, and various bug fixes and improvements.

Core Optimizer Integration

Add Optimizer Package (#3253)

Introduced optimizer package - Go port of the mcp-optimizer Python service
Semantic tool search using vector embeddings (384-dim)
Token counting for LLM cost estimation
Full-text search via SQLite FTS5
Multiple embedding backends: Ollama, vLLM, or placeholder (testing)
Production-ready database with sqlite-vec for vector similarity search

Add Optimizer Integration Endpoints (#3318)

Added find_tool and call_tool endpoints to vMCP optimizer
Implemented semantic search and string matching for tool discovery
Updated optimizer integration documentation
Added test scripts for optimizer functionality

Resolve Tool Names in optim.find_tool (#3337)

Fixed tool name resolution to match routing table
Ensures consistent tool discovery and routing

Observability & Metrics

Add Token Metrics and Observability (#3347)

Added comprehensive token metrics to optimizer integration
Enables monitoring of token usage and optimization effectiveness

Add OpenTelemetry Tracing to Capability Aggregation

Added tracing spans to all aggregator methods for visibility in Jaeger
Includes spans for:
- AggregateCapabilities (parent span)
- QueryAllCapabilities (parallel backend queries)
- QueryCapabilities (per-backend queries)
- ResolveConflicts (conflict resolution)
- MergeCapabilities (final merge)
All spans include relevant attributes like backend counts, tool/resource/prompt counts, and error recording

Kubernetes Integration

Add Dynamic/Static Mode Support (#3235)

Added dynamic/static mode support to VirtualMCPServer operator
Enables flexible deployment configurations

Add DeepCopy and Kubernetes Service Resolution

Used DeepCopy() for automatic passthrough of config fields (Optimizer, Metadata, etc.)
Added resolveEmbeddingService() to resolve Kubernetes Service names to URLs
Ensures optimizer config is properly converted from CRD to runtime config
Resolves embeddingService references in Kubernetes deployments

Kubernetes Optimizer Integration Fixes (#3359)

Added CLI fallback for embeddingService when not resolved by operator
Normalized localhost to 127.0.0.1 in embeddings to avoid IPv6 issues
Added HTTP timeout (30s) to prevent hanging connections
Removed WithContinuousListening() to use timeout-based approach

Testing & Reliability Improvements

Run API E2E Test Server as Standalone Process (#3356)

Changed test server to run as standalone process instead of in-process
Uses full binary to ensure realistic test scenarios

Fix Flaky E2E Tests

Add HTTP client timeout to health check (10s timeout)
Add pod readiness checks before health endpoint verification
Skip completed pods in checkPodsReady to prevent flaky test failures
Improved error messages to help diagnose connection reset issues

Fix Unrecognized Dotty Names

Fixed issue with unrecognized dotty names in the codebase

Infrastructure

Bump Operator CRDs Chart Version

Updated operator-crds chart version to 0.0.97 after rebase

Documentation Updates

Updated vmcp/README with optimizer integration information

Summary

This PR consolidates the complete integration of the MCP optimizer into vMCP, enabling semantic tool discovery, reducing token usage, and providing comprehensive observability. The integration includes full Kubernetes support, robust error handling, and improved test reliability.

Related PRs

#3235: Add dynamic/static mode support to VirtualMCPServer operator
#3253: feat: Add optimizer package with semantic tool discovery and ingestion
#3318: feat: Add optimizer integration endpoints and tool discovery
#3337: fix: Resolve tool names in optim.find_tool to match routing table
#3347: Add token metrics and observability to optimizer integration
#3356: Run the API E2E test server as a standalone process
#3359: Kubernetes optimizer integration fixes (PR title: feat: Add optimizer integration for semantic tool discovery in vMCP)

Large PR Justification

Generated code that cannot be split
Large new feature that must be atomic
Multiple related changes that would break if separated

github-actions

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.

This review will be automatically dismissed once you add the justification section.

therealnb · 2026-01-21T15:48:05Z

Here are the demo scripts I mentioned: #3375

deploy/charts/operator-crds/README.md

cmd/thv-operator/Taskfile.yml

ChrisJBurns · 2026-01-21T15:59:37Z

pkg/optimizer/INTEGRATION.md

@@ -0,0 +1,134 @@
+# Integrating Optimizer with vMCP


If the optimizer is only being used by the Operator, I would put it into cmd/thv-operator/pkg instead, because this makes it easier for us to rip out the Operator into its own repo in future. It also allows us to keep the local ToolHive CLI as separate as possible.

I was initially testing in Docker mode (it does work with Docker), but k8s is the main target. I'm happy either way.

+1 to Chris's suggestion

codecov · 2026-01-21T16:18:08Z

Codecov Report

❌ Patch coverage is 27.92887% with 1378 lines in your changes missing coverage. Please review.
✅ Project coverage is 62.99%. Comparing base (8571282) to head (f4ea6c7).
⚠️ Report is 11 commits behind head on main.

Files with missing lines	Patch %	Lines
pkg/vmcp/optimizer/optimizer.go	6.41%	434 Missing and 4 partials ⚠️
pkg/optimizer/db/fts.go	10.89%	175 Missing and 5 partials ⚠️
pkg/optimizer/ingestion/service.go	0.00%	157 Missing ⚠️
pkg/optimizer/db/backend_tool.go	9.80%	137 Missing and 1 partial ⚠️
pkg/optimizer/db/backend_server.go	12.28%	99 Missing and 1 partial ⚠️
pkg/vmcp/server/server.go	31.20%	89 Missing and 8 partials ⚠️
pkg/optimizer/db/hybrid.go	0.00%	74 Missing ⚠️
pkg/optimizer/embeddings/ollama.go	26.22%	42 Missing and 3 partials ⚠️
pkg/optimizer/embeddings/manager.go	47.56%	32 Missing and 11 partials ⚠️
cmd/vmcp/app/commands.go	0.00%	28 Missing ⚠️
... and 9 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3373      +/-   ##
==========================================
- Coverage   64.85%   62.99%   -1.86%     
==========================================
  Files         375      390      +15     
  Lines       36626    38486    +1860     
==========================================
+ Hits        23753    24246     +493     
- Misses      10999    12333    +1334     
- Partials     1874     1907      +33


Large PR justification has been provided. Thank you!

github-actions · 2026-01-21T17:19:28Z

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

#3253) * feat: Add optimizer package with semantic tool discovery and ingestion

This PR introduces the optimizer package, a Go port of the mcp-optimizer Python service that provides semantic tool discovery and ingestion for MCP servers.

- **Semantic tool search** using vector embeddings (384-dim)
- **Token counting** for LLM cost estimation
- **Full-text search** via SQLite FTS5
- **Multiple embedding backends**: Ollama, vLLM, or placeholder (testing)
- **Production-ready database** with sqlite-vec for vector similarity search

* feat: Add optimizer integration endpoints and tool discovery

- Add find_tool and call_tool endpoints to vmcp optimizer
- Add semantic search and string matching for tool discovery
- Update optimizer integration documentation
- Add test scripts for optimizer functionality

) * fix: Resolve tool names in optim.find_tool to match routing table

* feat: Add token metrics and observability to optimizer integration

…failures

The checkPodsReady function was checking all pods with matching labels, including old pods that had completed (Phase: Succeeded) from previous deployments. This caused the auth discovery e2e test to fail when old pods were still present during deployment updates.

Fix: Skip pods that are not in Running phase and ensure at least one running pod exists after filtering.

The test was failing with 'connection reset by peer' errors when trying to connect to the health endpoint. This can happen if pods crash or restart between the BeforeAll setup and the actual test execution.

Fix: Add explicit pod readiness verification right before the health check and also check pod readiness inside the Eventually loop to catch pods that crash during health check retries. This makes the test more robust by ensuring pods are stable before attempting HTTP connections.

- Update all Go files to use SPDX license header format
- Fix unused receiver in convertAuditConfig method
- Fix optimizer test to properly skip when Ollama model not available
- Fix port test to use port 9999 instead of 9000 to avoid conflicts

Change HybridSearchRatio from *float64 (0.0-1.0) to *int (0-100 percentage) to avoid needing allowDangerousTypes=true in controller-gen.

- Update type definitions in config, optimizer, and server packages
- Update conversion logic in hybrid.go to convert percentage to ratio
- Update all test files with new percentage values
- Update config files, examples, and documentation
- Remove allowDangerousTypes=true from Taskfile.yml

This is a breaking change: users need to update configs from 0.7 to 70, etc.

Signed-off-by: nigel brown <[email protected]>

pkg/vmcp/aggregator/default_aggregator.go

therealnb · 2026-01-21T18:18:49Z

scripts/call-optim-find-tool/main.go

@@ -0,0 +1,140 @@
+// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc.


The changes in this and parent dir are for inspecting the local db. They could be ditched, but they are useful for debugging.

therealnb · 2026-01-21T18:19:41Z

.gitignore

+crd-helm-wrapper
+cmd/vmcp/__debug_bin*
+
+# Demo files


This is to stop me checking these in by accident. This can go.

Signed-off-by: nigel brown <[email protected]>

jerm-dro

I suspect something is off with the diff here. I'm seeing some changes that I don't expect in this PR.

jerm-dro · 2026-01-21T23:58:01Z

cmd/thv-operator/controllers/mcpremoteproxy_controller_test.go

+// SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc.
+// SPDX-License-Identifier: Apache-2.0

 package controllers


These diffs don't seem like they belong in this commit. Can you remove them?

The tests wouldn't pass without them. Is there another choice?

Also, these spdx identifiers can't really co-exist with another license at the top.

cmd/thv-operator/pkg/vmcpconfig/converter.go

jerm-dro · 2026-01-22T00:08:47Z

Taskfile.yml

      BUILD_DATE: '{{dateInZone "2006-01-02T15:04:05Z" (now) "UTC"}}'
    cmds:
-      - go install -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp
+      - go install -tags="fts5" -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp


Will the build fail if you don't include this? Or will problems only be encountered at runtime?

I ask because you had to add it to a lot of places. If you missed one, I wouldn't want that to result in runtime failures.

It is an optional SQLite module. If it isn't compiled in, there will be a runtime "no such module: fts5" error and it will crash. It is needed for BM25 searches.

I can't see a way round including this (can you?).

pkg/vmcp/schema/reflect_test.go

pkg/vmcp/config/config.go

pkg/vmcp/server/adapter/optimizer_adapter.go

Reinstated the deleted tests in pkg/vmcp/schema/reflect_test.go that were removed in commit 38cf21e. Updated the tests to work with the current optimizer implementation by:

- Creating FindToolInput and CallToolInput test types that match the current optimizer tool schemas (optim_find_tool and optim_call_tool)
- Updating tests to reflect current schema (tool_keywords as string instead of array, added limit and backend_id fields)

All tests now pass and validate that schema generation and translation functions work correctly with optimizer tool inputs.

Remove the EmbeddingService field from OptimizerConfig and all related conversion logic. Users should now provide the full service URL including port for in-cluster services (e.g., http://service-name.namespace.svc.cluster.local:port). This simplifies the codebase by removing Kubernetes-specific service resolution logic and making the configuration more explicit and platform-agnostic.
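
Since the URL must now be supplied in full, a small validation step could catch incomplete values early. The sketch below is an assumption about how such a check might be written, not code from this PR: `validateEmbeddingURL` and the service name in the example are hypothetical, and requiring an explicit port is a stricter reading of "including port" than the commit necessarily enforces.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateEmbeddingURL (hypothetical helper) checks that a configured
// embedding endpoint is a full URL with scheme, host, and explicit port,
// e.g. http://service-name.namespace.svc.cluster.local:port.
func validateEmbeddingURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid embedding URL %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("embedding URL %q must start with http:// or https://", raw)
	}
	if u.Hostname() == "" {
		return fmt.Errorf("embedding URL %q has no host", raw)
	}
	if u.Port() == "" {
		return fmt.Errorf("embedding URL %q must include a port", raw)
	}
	return nil
}

func main() {
	// Hypothetical in-cluster service name and port, for illustration only.
	fmt.Println(validateEmbeddingURL("http://embedding-svc.default.svc.cluster.local:11434"))
	fmt.Println(validateEmbeddingURL("embedding-svc"))
}
```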

- Restore pkg/vmcp/server/adapter/optimizer_adapter.go with original structure
- Use optim_ prefix for tool names (optim_find_tool, optim_call_tool)
- Remove router check for optim_ prefix (optimizer tools don't go through router)
- Eliminate schema duplication by defining schemas once in optimizer_adapter.go
- Update server to use adapter.CreateOptimizerTools() directly
- Remove obsolete EmbeddingService references from commands.go
- Fix .gitignore pattern to avoid ignoring vmcp source files

therealnb force-pushed the jerm/2026-01-13-optimizer-in-vmcp branch from 22e020c to 3f3a011 Compare January 21, 2026 15:40

github-actions bot previously requested changes Jan 21, 2026

github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Jan 21, 2026

therealnb mentioned this pull request Jan 21, 2026

Optimizer in vMCP #3278

Closed

ChrisJBurns reviewed Jan 21, 2026

ChrisJBurns reviewed Jan 21, 2026

jerm-dro and others added 8 commits January 21, 2026 18:02

Update vmcp/README

fde2423

fix: Resolve tool names in optim.find_tool to match routing table (#3337

f74e63d

) * fix: Resolve tool names in optim.find_tool to match routing table

Add token metrics and observability to optimizer integration (#3347)

00fff43

* feat: Add token metrics and observability to optimizer integration

fix: Bump operator-crds chart version to 0.0.97 after rebase

bf77813

therealnb added 6 commits January 21, 2026 18:02

Add license headers to all optimizer package files

9c8ec1b

Fix linting errors: gci formatting, gocyclo complexity, lll line length

bc1f4ca

Add license headers to remaining files missing SPDX headers

0310bda

demo scripts

91a210d

Signed-off-by: nigel brown <[email protected]>

therealnb force-pushed the jerm/2026-01-13-optimizer-in-vmcp branch from 1f6f22b to 91a210d Compare January 21, 2026 18:02

therealnb commented Jan 21, 2026

therealnb commented Jan 21, 2026

More spdx fails

47c811b

Signed-off-by: nigel brown <[email protected]>

Merge branch 'main' into jerm/2026-01-13-optimizer-in-vmcp

437a8d4

jerm-dro requested changes Jan 22, 2026
