-
Notifications
You must be signed in to change notification settings - Fork 170
Merge jerm/2026-01-13-optimizer-in-vmcp into main #3373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
22e020c to
3f3a011
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
|
Here's the demo scripts I mentioned #3375 |
| @@ -0,0 +1,134 @@ | |||
| # Integrating Optimizer with vMCP | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the optimizer is only being used by the Operator, I would put it into cmd/thv-operator/pkg instead. Because this makes it easier for us to rip out the Operator into its own repo in future. It also allows us to keep local ToolHive CLI a separate as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially testing in docker mode (it does work with docker) but k8s is the main target. I'm happy either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to chris's suggestion
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3373 +/- ##
==========================================
- Coverage 64.85% 62.99% -1.86%
==========================================
Files 375 390 +15
Lines 36626 38486 +1860
==========================================
+ Hits 23753 24246 +493
- Misses 10999 12333 +1334
- Partials 1874 1907 +33 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
#3253) * feat: Add optimizer package with semantic tool discovery and ingestion This PR introduces the optimizer package, a Go port of the mcp-optimizer Python service that provides semantic tool discovery and ingestion for MCP servers. - **Semantic tool search** using vector embeddings (384-dim) - **Token counting** for LLM cost estimation - **Full-text search** via SQLite FTS5 - **Multiple embedding backends**: Ollama, vLLM, or placeholder (testing) - **Production-ready database** with sqlite-vec for vector similarity search
* feat: Add optimizer integration endpoints and tool discovery - Add find_tool and call_tool endpoints to vmcp optimizer - Add semantic search and string matching for tool discovery - Update optimizer integration documentation - Add test scripts for optimizer functionality
* feat: Add token metrics and observability to optimizer integration
…failures The checkPodsReady function was checking all pods with matching labels, including old pods that had completed (Phase: Succeeded) from previous deployments. This caused the auth discovery e2e test to fail when old pods were still present during deployment updates. Fix: Skip pods that are not in Running phase and ensure at least one running pod exists after filtering.
The test was failing with 'connection reset by peer' errors when trying to connect to the health endpoint. This can happen if pods crash or restart between the BeforeAll setup and the actual test execution. Fix: Add explicit pod readiness verification right before the health check and also check pod readiness inside the Eventually loop to catch pods that crash during health check retries. This makes the test more robust by ensuring pods are stable before attempting HTTP connections.
- Update all Go files to use SPDX license header format - Fix unused receiver in convertAuditConfig method - Fix optimizer test to properly skip when Ollama model not available - Fix port test to use port 9999 instead of 9000 to avoid conflicts
Change HybridSearchRatio from *float64 (0.0-1.0) to *int (0-100 percentage) to avoid needing allowDangerousTypes=true in controller-gen. - Update type definitions in config, optimizer, and server packages - Update conversion logic in hybrid.go to convert percentage to ratio - Update all test files with new percentage values - Update config files, examples, and documentation - Remove allowDangerousTypes=true from Taskfile.yml This is a breaking change: users need to update configs from 0.7 to 70, etc.
Signed-off-by: nigel brown <[email protected]>
1f6f22b to
91a210d
Compare
| @@ -0,0 +1,140 @@ | |||
| // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes in this and parent dir are for inspecting the local db. They could be ditched, but they are useful for debugging.
| crd-helm-wrapper | ||
| cmd/vmcp/__debug_bin* | ||
|
|
||
| # Demo files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to stop me checking these in by accident. This can go.
Signed-off-by: nigel brown <[email protected]>
jerm-dro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect something is off with the diff here. I'm seeing some changes that I don't expect in this PR.
| // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| package controllers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These diffs don't seem like they belong in this commit. Can you remove them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests wouldn't pass without them. Is there another choice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, these spdx identifiers can't really co-exist with another license at the top.
| BUILD_DATE: '{{dateInZone "2006-01-02T15:04:05Z" (now) "UTC"}}' | ||
| cmds: | ||
| - go install -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp | ||
| - go install -tags="fts5" -ldflags "-s -w -X github.com/stacklok/toolhive/pkg/versions.Version={{.VERSION}} -X github.com/stacklok/toolhive/pkg/versions.Commit={{.COMMIT}} -X github.com/stacklok/toolhive/pkg/versions.BuildDate={{.BUILD_DATE}}" -v ./cmd/vmcp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the build fail if you don't include this? Or will problems only be encountered at runtime?
I ask because you had to add it to a lot of places. If you missed one, I wouldn't want that to result in runtime failures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an optional sqlite module. If it isn't there there will be a runtime "no such module: fts5" and it will crash. It is needed for BM25 searches.
I can't see a way round including this (can you?).
Reinstated the deleted tests in pkg/vmcp/schema/reflect_test.go that were removed in commit 38cf21e. Updated the tests to work with the current optimizer implementation by: - Creating FindToolInput and CallToolInput test types that match the current optimizer tool schemas (optim_find_tool and optim_call_tool) - Updating tests to reflect current schema (tool_keywords as string instead of array, added limit and backend_id fields) - All tests now pass and validate schema generation and translation functions work correctly with optimizer tool inputs
Remove the EmbeddingService field from OptimizerConfig and all related conversion logic. Users should now provide the full service URL including port for in-cluster services (e.g., http://service-name.namespace.svc.cluster.local:port). This simplifies the codebase by removing Kubernetes-specific service resolution logic and making the configuration more explicit and platform-agnostic.
- Restore pkg/vmcp/server/adapter/optimizer_adapter.go with original structure - Use optim_ prefix for tool names (optim_find_tool, optim_call_tool) - Remove router check for optim_ prefix (optimizer tools don't go through router) - Eliminate schema duplication by defining schemas once in optimizer_adapter.go - Update server to use adapter.CreateOptimizerTools() directly - Remove obsolete EmbeddingService references from commands.go - Fix .gitignore pattern to avoid ignoring vmcp source files
Merge jerm/2026-01-13-optimizer-in-vmcp into main
This PR merges 17 commits that integrate the MCP optimizer into vMCP, adding semantic tool discovery, observability, Kubernetes support, and various bug fixes and improvements.
Core Optimizer Integration
Add Optimizer Package (#3253)
Add Optimizer Integration Endpoints (#3318)
find_toolandcall_toolendpoints to vMCP optimizerResolve Tool Names in optim.find_tool (#3337)
Observability & Metrics
Add Token Metrics and Observability (#3347)
Add OpenTelemetry Tracing to Capability Aggregation
AggregateCapabilities(parent span)QueryAllCapabilities(parallel backend queries)QueryCapabilities(per-backend queries)ResolveConflicts(conflict resolution)MergeCapabilities(final merge)Kubernetes Integration
Add Dynamic/Static Mode Support (#3235)
Add DeepCopy and Kubernetes Service Resolution
DeepCopy()for automatic passthrough of config fields (Optimizer, Metadata, etc.)resolveEmbeddingService()to resolve Kubernetes Service names to URLsKubernetes Optimizer Integration Fixes (#3359)
WithContinuousListening()to use timeout-based approachTesting & Reliability Improvements
Run API E2E Test Server as Standalone Process (#3356)
Fix Flaky E2E Tests
checkPodsReadyto prevent flaky test failuresFix Unrecognized Dotty Names
Infrastructure
Bump Operator CRDs Chart Version
Documentation Updates
Summary
This PR consolidates the complete integration of the MCP optimizer into vMCP, enabling semantic tool discovery, reducing token usage, and providing comprehensive observability. The integration includes full Kubernetes support, robust error handling, and improved test reliability.
Related PRs
Large PR Justification