Open
45 commits
fde2423
Update vmcp/README
jerm-dro Jan 13, 2026
51ded21
feat: Add optimizer package with semantic tool discovery and ingestio…
therealnb Jan 15, 2026
d3c2e53
feat: Add optimizer integration endpoints and tool discovery (#3318)
therealnb Jan 19, 2026
f74e63d
fix: Resolve tool names in optim.find_tool to match routing table (#3…
therealnb Jan 19, 2026
00fff43
Add token metrics and observability to optimizer integration (#3347)
therealnb Jan 20, 2026
bf77813
fix: Bump operator-crds chart version to 0.0.97 after rebase
therealnb Jan 20, 2026
88c642c
fix: Skip completed pods in checkPodsReady to prevent flaky e2e test …
therealnb Jan 20, 2026
7b30e92
fix: Add pod readiness checks before health endpoint verification
therealnb Jan 20, 2026
024c066
fix: Add HTTP client timeout to health check in flaky e2e test
therealnb Jan 20, 2026
5c72e9d
Add dynamic/static mode support to VirtualMCPServer operator (#3235)
yrobla Jan 20, 2026
7a9fba5
fix: Resolve tool names in optim.find_tool to match routing table (#3…
therealnb Jan 19, 2026
9edcc7f
feat: Add DeepCopy and Kubernetes service resolution for optimizer co…
therealnb Jan 20, 2026
5a473f1
fix: Add remaining Kubernetes optimizer integration fixes from PR #3359
therealnb Jan 20, 2026
c23131b
Add OpenTelemetry tracing to capability aggregation
therealnb Jan 21, 2026
12c8162
Fix unrecognized dotty names
therealnb Jan 21, 2026
38cf21e
Fix failed CI checks: remove broken optimizer adapter files and fix v…
therealnb Jan 21, 2026
9c8ec1b
Add license headers to all optimizer package files
therealnb Jan 21, 2026
bc1f4ca
Fix linting errors: gci formatting, gocyclo complexity, lll line length
therealnb Jan 21, 2026
0310bda
Add license headers to remaining files missing SPDX headers
therealnb Jan 21, 2026
506f053
Fix CI check failures: license headers, linting, and tests
therealnb Jan 21, 2026
77b2f64
Refactor HybridSearchRatio from float64 to int percentage
therealnb Jan 21, 2026
91a210d
demo scripts
therealnb Jan 21, 2026
47c811b
More spdx fails
therealnb Jan 21, 2026
437a8d4
Merge branch 'main' into jerm/2026-01-13-optimizer-in-vmcp
jerm-dro Jan 21, 2026
5cc4fc0
Restore schema tests for optimizer tool inputs
therealnb Jan 22, 2026
67d20fd
Remove EmbeddingService field, simplify to use only EmbeddingURL
therealnb Jan 22, 2026
f4ea6c7
Restore optimizer adapter pattern and remove router check
therealnb Jan 22, 2026
6d72750
Move optimizer package to cmd/thv-operator/pkg/optimizer
therealnb Jan 22, 2026
c49f1c4
Revert excessive spdx license changes
therealnb Jan 22, 2026
487e412
Fix linting issues and add optimizer adapter tests
therealnb Jan 22, 2026
f20ad52
Merge branch 'main' into jerm/2026-01-13-optimizer-in-vmcp
therealnb Jan 22, 2026
e65f8cc
Allow % for hybridSearchRatio
therealnb Jan 22, 2026
67516bf
Fix linting issues: Go imports, staticcheck, and Helm chart docs
therealnb Jan 22, 2026
7e76451
Regenerate CRDs and CRD docs after recent changes
therealnb Jan 22, 2026
48652d5
Removing collateral changes after review.
therealnb Jan 23, 2026
465bb69
Merge remote-tracking branch 'origin/main' into jerm/2026-01-13-optim…
therealnb Jan 23, 2026
350f9f6
This is required
therealnb Jan 23, 2026
55f03db
Merge branch 'main' into jerm/2026-01-13-optimizer-in-vmcp
therealnb Jan 23, 2026
0aa6751
Fix optimizer mode to expose only find_tool and call_tool
therealnb Jan 23, 2026
ee64b7d
Refactor optimizer integration to be more modular
therealnb Jan 23, 2026
e955e40
Add back OnRegisterSession method for test compatibility
therealnb Jan 23, 2026
6d4af12
Fix code formatting
therealnb Jan 23, 2026
1238056
Refactor: Move OptimizerIntegration interface to optimizer package
therealnb Jan 23, 2026
c5a7c0e
Update mock file: remove OptimizerIntegration mock
therealnb Jan 23, 2026
0b1633f
Refactor: Remove server.OptimizerConfig, move conversion to optimizer…
therealnb Jan 23, 2026
9 changes: 8 additions & 1 deletion .gitignore
@@ -42,4 +42,11 @@ cmd/thv-operator/.task/checksum/crdref-gen
# Test coverage
coverage*

crd-helm-wrapper
crd-helm-wrapper
cmd/vmcp/__debug_bin*

# Demo files
Contributor Author

This is to stop me checking these in by accident. This can go.

examples/operator/virtual-mcps/vmcp_optimizer.yaml
scripts/k8s_vmcp_optimizer_demo.sh
examples/ingress/mcp-servers-ingress.yaml
/vmcp
2 changes: 2 additions & 0 deletions .golangci.yml
@@ -139,6 +139,7 @@ linters:
- third_party$
- builtin$
- examples$
- scripts$
formatters:
enable:
- gci
@@ -155,3 +156,4 @@ formatters:
- third_party$
- builtin$
- examples$
- scripts$
26 changes: 20 additions & 6 deletions cmd/thv-operator/controllers/mcpserver_controller.go
Contributor Author

This is a simple change that sets the image pull policy on the MCP server's toolhive runner container from the operator. It is mostly useful for development, when you build an image and load it into the local kind registry (although there could be other cases where images are shuffled around).

Without this, the pods will try to pull the images.

This could be a separate PR.

@@ -1250,12 +1250,13 @@ func (r *MCPServerReconciler) deploymentForMCPServer(
Spec: corev1.PodSpec{
ServiceAccountName: ctrlutil.ProxyRunnerServiceAccountName(m.Name),
Containers: []corev1.Container{{
Image: getToolhiveRunnerImage(),
Name: "toolhive",
Args: args,
Env: env,
VolumeMounts: volumeMounts,
Resources: resources,
Image: getToolhiveRunnerImage(),
Name: "toolhive",
ImagePullPolicy: getImagePullPolicyForToolhiveRunner(),
Args: args,
Env: env,
VolumeMounts: volumeMounts,
Resources: resources,
Ports: []corev1.ContainerPort{{
ContainerPort: m.GetProxyPort(),
Name: "http",
@@ -1813,6 +1814,19 @@ func getToolhiveRunnerImage() string {
return image
}

// getImagePullPolicyForToolhiveRunner returns the appropriate imagePullPolicy for the toolhive runner container.
// If the image is a local image (starts with "kind.local/" or "localhost/"), use Never.
// Otherwise, use IfNotPresent to allow pulling when needed but avoid unnecessary pulls.
func getImagePullPolicyForToolhiveRunner() corev1.PullPolicy {
image := getToolhiveRunnerImage()
// Check if it's a local image that should use Never
if strings.HasPrefix(image, "kind.local/") || strings.HasPrefix(image, "localhost/") {
return corev1.PullNever
}
// For other images, use IfNotPresent to allow pulling when needed
return corev1.PullIfNotPresent
}

// handleExternalAuthConfig validates and tracks the hash of the referenced MCPExternalAuthConfig.
// It updates the MCPServer status when the external auth configuration changes.
func (r *MCPServerReconciler) handleExternalAuthConfig(ctx context.Context, m *mcpv1alpha1.MCPServer) error {
134 changes: 134 additions & 0 deletions cmd/thv-operator/pkg/optimizer/INTEGRATION.md
@@ -0,0 +1,134 @@
# Integrating Optimizer with vMCP

## Overview

The optimizer package ingests MCP server and tool metadata into a searchable database with semantic embeddings. This enables intelligent tool discovery and token optimization for LLM consumption.

## Integration Approach

**Event-Driven Ingestion**: The optimizer integrates directly with vMCP's startup process. When vMCP starts and loads its configured servers, it calls the optimizer to ingest each server's metadata and tools.

❌ **NOT** a separate polling service discovering backends
✅ **IS** called directly by vMCP during server initialization

## How It Is Integrated

The optimizer is already integrated into vMCP and works automatically when enabled via configuration. Here's how the integration works:

### Initialization

When vMCP starts with the optimizer enabled in the configuration, it:

1. Initializes the optimizer database (chromem-go + SQLite FTS5)
2. Configures the embedding backend (placeholder, Ollama, or vLLM)
3. Sets up the ingestion service

### Automatic Ingestion

The optimizer integrates with vMCP's `OnRegisterSession` hook, which is called whenever:

- vMCP starts and loads configured MCP servers
- A new MCP server is dynamically added
- A session reconnects or refreshes

When this hook is triggered, the optimizer:

1. Retrieves the server's metadata and tools via the MCP protocol
2. Generates embeddings for searchable content
3. Stores the data in both the vector database (chromem-go) and FTS5 database
4. Makes the tools immediately available for semantic search

### Exposed Tools

When the optimizer is enabled, vMCP automatically exposes these tools to LLM clients:

- `optim.find_tool`: Semantic search for tools across all registered servers
- `optim.call_tool`: Dynamic tool invocation after discovery
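
As a rough illustration of the discover-then-invoke flow behind these two tools, here is a self-contained sketch. Everything in it is hypothetical (`toolRegistry`, `findTool`, `callTool`, `demoRegistry`), and it ranks tools by plain keyword overlap rather than the semantic embedding search the optimizer uses, purely so it runs without an embedding backend.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toolHandler and toolEntry are hypothetical types for this sketch only.
type toolHandler func(args map[string]string) (string, error)

type toolEntry struct {
	Name        string
	Description string
	Call        toolHandler
}

type toolRegistry struct {
	tools map[string]toolEntry
}

// keywordScore counts how many query words occur in text (case-insensitive).
// It is a stand-in for semantic similarity, not the optimizer's ranking.
func keywordScore(query, text string) int {
	text = strings.ToLower(text)
	score := 0
	for _, word := range strings.Fields(strings.ToLower(query)) {
		if strings.Contains(text, word) {
			score++
		}
	}
	return score
}

// findTool returns up to limit tool names, best match first.
func (r *toolRegistry) findTool(query string, limit int) []string {
	type hit struct {
		name  string
		score int
	}
	var hits []hit
	for _, t := range r.tools {
		if s := keywordScore(query, t.Name+" "+t.Description); s > 0 {
			hits = append(hits, hit{name: t.Name, score: s})
		}
	}
	// Sort by score, then by name so results are deterministic.
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].score != hits[j].score {
			return hits[i].score > hits[j].score
		}
		return hits[i].name < hits[j].name
	})
	if len(hits) > limit {
		hits = hits[:limit]
	}
	names := make([]string, 0, len(hits))
	for _, h := range hits {
		names = append(names, h.name)
	}
	return names
}

// callTool dispatches to a previously discovered tool by name.
func (r *toolRegistry) callTool(name string, args map[string]string) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Call(args)
}

// demoRegistry builds a registry with two made-up weather tools.
func demoRegistry() *toolRegistry {
	return &toolRegistry{tools: map[string]toolEntry{
		"get_weather": {
			Name: "get_weather", Description: "Get current weather",
			Call: func(args map[string]string) (string, error) {
				return "sunny in " + args["city"], nil
			},
		},
		"get_forecast": {
			Name: "get_forecast", Description: "Get weather forecast",
			Call: func(args map[string]string) (string, error) {
				return "rain tomorrow in " + args["city"], nil
			},
		},
	}}
}

func main() {
	r := demoRegistry()
	names := r.findTool("current weather", 1)
	fmt.Println("found:", names)
	if len(names) > 0 {
		out, err := r.callTool(names[0], map[string]string{"city": "Oslo"})
		fmt.Println(out, err)
	}
}
```

The point of the two-step design is that the client only needs the two meta-tools in its context; the full tool catalogue stays server-side until a search surfaces the relevant entries.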

### Implementation Location

The integration code is located in:
- `cmd/vmcp/optimizer.go`: Optimizer initialization and configuration
- `pkg/vmcp/optimizer/optimizer.go`: Session registration hook implementation
- `cmd/thv-operator/pkg/optimizer/ingestion/service.go`: Core ingestion service

## Configuration

Add optimizer configuration to vMCP's config:

```yaml
# vMCP config
optimizer:
  enabled: true
  db_path: /data/optimizer.db
  embedding:
    backend: vllm # or "ollama" for local dev, "placeholder" for testing
    url: http://vllm-service:8000
    model: sentence-transformers/all-MiniLM-L6-v2
    dimension: 384
```
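
For readers who want to see how such a block might map onto Go types, here is a hedged sketch. The struct names (`exampleOptimizerConfig`, `exampleEmbeddingConfig`) and every validation rule are assumptions made for illustration; they are not the types or checks used by vMCP. The `yaml` struct tags only document the mapping, since the snippet deliberately avoids a YAML library to stay standard-library only.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// exampleEmbeddingConfig mirrors the `embedding` block of the YAML above.
// Hypothetical type; field names are assumptions.
type exampleEmbeddingConfig struct {
	Backend   string `yaml:"backend"`
	URL       string `yaml:"url"`
	Model     string `yaml:"model"`
	Dimension int    `yaml:"dimension"`
}

// exampleOptimizerConfig mirrors the `optimizer` block. Hypothetical type.
type exampleOptimizerConfig struct {
	Enabled   bool                   `yaml:"enabled"`
	DBPath    string                 `yaml:"db_path"`
	Embedding exampleEmbeddingConfig `yaml:"embedding"`
}

// Validate applies illustrative sanity checks (all of them assumptions).
func (c exampleOptimizerConfig) Validate() error {
	if !c.Enabled {
		return nil // nothing to check when the optimizer is off
	}
	if c.DBPath == "" {
		return errors.New("db_path is required when the optimizer is enabled")
	}
	switch c.Embedding.Backend {
	case "placeholder":
		// Assumption: the placeholder backend needs no remote URL.
	case "ollama", "vllm":
		u, err := url.Parse(c.Embedding.URL)
		if err != nil || u.Scheme == "" || u.Host == "" {
			return fmt.Errorf("embedding.url %q is not an absolute URL", c.Embedding.URL)
		}
	default:
		return fmt.Errorf("unknown embedding backend %q", c.Embedding.Backend)
	}
	if c.Embedding.Dimension <= 0 {
		return errors.New("embedding.dimension must be positive")
	}
	return nil
}

func main() {
	cfg := exampleOptimizerConfig{
		Enabled: true,
		DBPath:  "/data/optimizer.db",
		Embedding: exampleEmbeddingConfig{
			Backend:   "vllm",
			URL:       "http://vllm-service:8000",
			Model:     "sentence-transformers/all-MiniLM-L6-v2",
			Dimension: 384,
		},
	}
	fmt.Println("validation error:", cfg.Validate())
}
```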

## Error Handling

**Important**: Optimizer failures should NOT break vMCP functionality:

- ✅ Log warnings if optimizer fails
- ✅ Continue server startup even if ingestion fails
- ✅ Run ingestion in goroutines to avoid blocking
- ❌ Don't fail server startup if optimizer is unavailable

## Benefits

1. **Automatic**: Servers are indexed as they're added to vMCP
2. **Up-to-date**: Database reflects current vMCP state
3. **No polling**: Event-driven, efficient
4. **Semantic search**: Enables intelligent tool discovery
5. **Token optimization**: Tracks token usage for LLM efficiency

## Testing

```go
func TestOptimizerIntegration(t *testing.T) {
	// Initialize the optimizer with an Ollama embedding backend.
	optimizerSvc, err := ingestion.NewService(&ingestion.Config{
		DBConfig: &db.Config{Path: "/tmp/test-optimizer.db"},
		EmbeddingConfig: &embeddings.Config{
			BackendType: "ollama",
			BaseURL:     "http://localhost:11434",
			Model:       "all-minilm",
			Dimension:   384,
		},
	})
	require.NoError(t, err)
	defer optimizerSvc.Close()

	// Simulate vMCP starting a server
	ctx := context.Background()
	tools := []mcp.Tool{
		{Name: "get_weather", Description: "Get current weather"},
		{Name: "get_forecast", Description: "Get weather forecast"},
	}

	// ptr is assumed to be a small test helper that returns a pointer to its argument.
	err = optimizerSvc.IngestServer(
		ctx,
		"weather-001",
		"weather-service",
		"http://weather.local",
		models.TransportSSE,
		ptr("Weather information service"),
		tools,
	)
	require.NoError(t, err)

	// Verify ingestion
	server, err := optimizerSvc.GetServer(ctx, "weather-001")
	require.NoError(t, err)
	assert.Equal(t, "weather-service", server.Name)
}
```

## See Also

- [Optimizer Package README](./README.md) - Package overview and API
