Merge jerm/2026-01-13-optimizer-in-vmcp into main #3373
|
Contributor
Author
This is a simple change that lets the operator pass the image pull policy through to the MCP server. It is mostly useful for development, when you build an image and load it into the local kind registry (although there could be other cases where images are shuffled around). Without it, the pods will try to pull the images. This could be a separate PR. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,134 @@ | ||
| # Integrating Optimizer with vMCP | ||
therealnb marked this conversation as resolved.
|
||
|
|
||
| ## Overview | ||
|
|
||
| The optimizer package ingests MCP server and tool metadata into a searchable database with semantic embeddings. This enables intelligent tool discovery and token optimization for LLM consumption. | ||
|
|
||
| ## Integration Approach | ||
|
|
||
| **Event-Driven Ingestion**: The optimizer integrates directly with vMCP's startup process. When vMCP starts and loads its configured servers, it calls the optimizer to ingest each server's metadata and tools. | ||
|
|
||
| ❌ **NOT** a separate polling service discovering backends | ||
| ✅ **IS** called directly by vMCP during server initialization | ||
|
|
||
| ## How It Is Integrated | ||
|
|
||
| The optimizer is already integrated into vMCP and works automatically when enabled via configuration. Here's how the integration works: | ||
|
|
||
| ### Initialization | ||
|
|
||
| When vMCP starts with optimizer enabled in the configuration, it: | ||
|
|
||
| 1. Initializes the optimizer database (chromem-go + SQLite FTS5) | ||
| 2. Configures the embedding backend (placeholder, Ollama, or vLLM) | ||
| 3. Sets up the ingestion service | ||
|
|
||
| ### Automatic Ingestion | ||
|
|
||
| The optimizer integrates with vMCP's `OnRegisterSession` hook, which is called whenever: | ||
|
|
||
| - vMCP starts and loads configured MCP servers | ||
| - A new MCP server is dynamically added | ||
| - A session reconnects or refreshes | ||
|
|
||
| When this hook is triggered, the optimizer: | ||
|
|
||
| 1. Retrieves the server's metadata and tools via MCP protocol | ||
| 2. Generates embeddings for searchable content | ||
| 3. Stores the data in both the vector database (chromem-go) and FTS5 database | ||
| 4. Makes the tools immediately available for semantic search | ||
|
|
||
| ### Exposed Tools | ||
|
|
||
| When the optimizer is enabled, vMCP automatically exposes these tools to LLM clients: | ||
|
|
||
| - `optim.find_tool`: Semantic search for tools across all registered servers | ||
| - `optim.call_tool`: Dynamic tool invocation after discovery | ||
|
|
||
| ### Implementation Location | ||
|
|
||
| The integration code is located in: | ||
| - `cmd/vmcp/optimizer.go`: Optimizer initialization and configuration | ||
| - `pkg/vmcp/optimizer/optimizer.go`: Session registration hook implementation | ||
| - `cmd/thv-operator/pkg/optimizer/ingestion/service.go`: Core ingestion service | ||
|
|
||
| ## Configuration | ||
|
|
||
| Add optimizer configuration to vMCP's config: | ||
|
|
||
| ```yaml | ||
| # vMCP config | ||
| optimizer: | ||
| enabled: true | ||
| db_path: /data/optimizer.db | ||
| embedding: | ||
| backend: vllm # or "ollama" for local dev, "placeholder" for testing | ||
| url: http://vllm-service:8000 | ||
| model: sentence-transformers/all-MiniLM-L6-v2 | ||
| dimension: 384 | ||
| ``` | ||
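One way to guard against misconfiguration is to validate these settings before initializing the optimizer. The sketch below is hypothetical: the struct and field names simply mirror the YAML keys above, and the rules (a required `db_path`, a URL required for the `ollama` and `vllm` backends, a positive dimension) are assumptions rather than behavior confirmed by this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// EmbeddingConfig mirrors the "embedding" block of the YAML above (hypothetical type).
type EmbeddingConfig struct {
	Backend   string
	URL       string
	Model     string
	Dimension int
}

// OptimizerConfig mirrors the "optimizer" block of the YAML above (hypothetical type).
type OptimizerConfig struct {
	Enabled   bool
	DBPath    string
	Embedding EmbeddingConfig
}

// Validate applies the assumed rules; a disabled optimizer is always valid.
func (c OptimizerConfig) Validate() error {
	if !c.Enabled {
		return nil
	}
	if c.DBPath == "" {
		return errors.New("optimizer: db_path is required")
	}
	switch c.Embedding.Backend {
	case "placeholder":
		// Assumed to need no remote endpoint.
	case "ollama", "vllm":
		if c.Embedding.URL == "" {
			return fmt.Errorf("optimizer: embedding.url is required for backend %q", c.Embedding.Backend)
		}
	default:
		return fmt.Errorf("optimizer: unknown embedding backend %q", c.Embedding.Backend)
	}
	if c.Embedding.Dimension <= 0 {
		return errors.New("optimizer: embedding.dimension must be positive")
	}
	return nil
}

func main() {
	cfg := OptimizerConfig{
		Enabled: true,
		DBPath:  "/data/optimizer.db",
		Embedding: EmbeddingConfig{
			Backend:   "vllm",
			URL:       "http://vllm-service:8000",
			Model:     "sentence-transformers/all-MiniLM-L6-v2",
			Dimension: 384,
		},
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```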
|
|
||
| ## Error Handling | ||
|
|
||
| **Important**: Optimizer failures should NOT break vMCP functionality: | ||
|
|
||
| - ✅ Log warnings if optimizer fails | ||
| - ✅ Continue server startup even if ingestion fails | ||
| - ✅ Run ingestion in goroutines to avoid blocking | ||
| - ❌ Don't fail server startup if optimizer is unavailable | ||
|
|
||
| ## Benefits | ||
|
|
||
| 1. **Automatic**: Servers are indexed as they're added to vMCP | ||
| 2. **Up-to-date**: Database reflects current vMCP state | ||
| 3. **No polling**: Event-driven, efficient | ||
| 4. **Semantic search**: Enables intelligent tool discovery | ||
| 5. **Token optimization**: Tracks token usage for LLM efficiency | ||
|
|
||
| ## Testing | ||
|
|
||
| ```go | ||
| func TestOptimizerIntegration(t *testing.T) { | ||
| // Initialize optimizer | ||
| optimizerSvc, err := ingestion.NewService(&ingestion.Config{ | ||
| DBConfig: &db.Config{Path: "/tmp/test-optimizer.db"}, | ||
| EmbeddingConfig: &embeddings.Config{ | ||
| BackendType: "ollama", | ||
| BaseURL: "http://localhost:11434", | ||
| Model: "all-minilm", | ||
| Dimension: 384, | ||
| }, | ||
| }) | ||
| require.NoError(t, err) | ||
| defer optimizerSvc.Close() | ||
|
|
||
| // Simulate vMCP starting a server | ||
| ctx := context.Background() | ||
| tools := []mcp.Tool{ | ||
| {Name: "get_weather", Description: "Get current weather"}, | ||
| {Name: "get_forecast", Description: "Get weather forecast"}, | ||
| } | ||
|
|
||
| err = optimizerSvc.IngestServer( | ||
| ctx, | ||
| "weather-001", | ||
| "weather-service", | ||
| "http://weather.local", | ||
| models.TransportSSE, | ||
| ptr("Weather information service"), | ||
| tools, | ||
| ) | ||
| require.NoError(t, err) | ||
|
|
||
| // Verify ingestion | ||
| server, err := optimizerSvc.GetServer(ctx, "weather-001") | ||
| require.NoError(t, err) | ||
| assert.Equal(t, "weather-service", server.Name) | ||
| } | ||
| ``` | ||
|
|
||
| ## See Also | ||
|
|
||
| - [Optimizer Package README](./README.md) - Package overview and API | ||
|
|
||
This is to stop me checking these in by accident. This can go.