[BUG] vmcp: Enable Admission Webhooks for Kubernetes Operator #3360

@jerm-dro

Summary

The ToolHive Kubernetes operator has admission webhook code for validating VirtualMCPServer, VirtualMCPCompositeToolDefinition, and MCPExternalAuthConfig resources, but these webhooks have never been functional. The controller-runtime v0.23.0 upgrade exposed this issue.

Background

What Happened

During the upgrade to controller-runtime v0.23.0, the operator began failing at startup with:

"error":"open /tmp/k8s-webhook-server/serving-certs/tls.crt: no such file or directory"

Root Cause Analysis

Investigation revealed that the webhooks were never actually working:

  1. In controller-runtime v0.22.x: The old webhook API (ctrl.NewWebhookManagedBy(mgr).For(r).Complete()) silently failed to register webhooks. The webhook server never started because no webhooks were registered with it.

  2. In controller-runtime v0.23.0: The new generic webhook API (builder.WebhookManagedBy[T](mgr, r).WithValidator(r).Complete()) registers the webhooks correctly. Registration causes the webhook server to start, and startup then fails because no TLS certificates are available.
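
The startup failure in step 2 can be reproduced with the Go standard library alone. The sketch below is not the operator's code: loadServingCert and certsMissing are hypothetical helpers, and an empty temporary directory stands in for /tmp/k8s-webhook-server/serving-certs. It assumes the webhook server reads tls.crt and tls.key from its certificate directory at startup, as the error message above suggests.

```go
package main

import (
	"crypto/tls"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// loadServingCert (hypothetical helper) reads tls.crt and tls.key from
// certDir, the way a TLS server must before it can accept connections.
func loadServingCert(certDir string) (tls.Certificate, error) {
	return tls.LoadX509KeyPair(
		filepath.Join(certDir, "tls.crt"),
		filepath.Join(certDir, "tls.key"),
	)
}

// certsMissing reports whether err was caused by a missing file.
func certsMissing(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}

func main() {
	// An empty temporary directory stands in for the serving-certs path.
	dir, err := os.MkdirTemp("", "serving-certs")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(dir)

	_, err = loadServingCert(dir)
	fmt.Println("certificates missing:", certsMissing(err))
	fmt.Println("error:", err)
}
```

The point of the sketch is that nothing fails until something actually tries to serve TLS, which is consistent with the v0.22.x behavior described in step 1: with no webhooks registered, the certificates were never read.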

Missing Infrastructure

Even if the webhook server started, the webhooks would not function because:

| Required component | Status |
|---|---|
| ValidatingWebhookConfiguration | Not deployed by Helm chart |
| Webhook Service | Not deployed by Helm chart |
| Port 9443 exposed | Not in deployment spec |
| TLS certificates | No cert-manager integration |

The config/webhook/manifests.yaml file exists (kubebuilder-generated) but is never deployed.

Impact of Missing Webhooks

The webhooks perform validation only (no mutation). Without them:

| Resource | Webhook Validation | Controller Validation | Risk |
|---|---|---|---|
| VirtualMCPServer | Disabled | Partial (during reconcile) | Low - caught at reconcile |
| MCPExternalAuthConfig | Disabled | None | High - invalid configs silently accepted |
| VirtualMCPCompositeToolDefinition | Disabled | None | High - invalid configs silently accepted |

Example Validations Not Enforced

MCPExternalAuthConfig:

  • Can create tokenExchange type without required tokenExchange config
  • Can set conflicting configs (both tokenExchange and headerInjection)
  • Unsupported auth types are accepted

VirtualMCPServer:

  • Missing required spec.config.groupRef (caught at reconcile, but not at admission)
  • Invalid auth configurations
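
The checks listed above can be sketched as plain Go functions, independent of whether they end up running in a webhook or in a controller. This is an illustration only: the struct and field names below are simplified, hypothetical stand-ins and do not mirror the real v1alpha1 types, and the set of supported auth types is assumed to be tokenExchange and headerInjection based on the bullets above.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the real API types.
type TokenExchangeConfig struct{ TokenURL string }
type HeaderInjectionConfig struct{ HeaderName string }

type MCPExternalAuthConfigSpec struct {
	Type            string
	TokenExchange   *TokenExchangeConfig
	HeaderInjection *HeaderInjectionConfig
}

// validateExternalAuthConfig collects every rule violation instead of
// stopping at the first one, so a user sees all problems at once.
func validateExternalAuthConfig(spec MCPExternalAuthConfigSpec) error {
	var errs []error

	switch spec.Type {
	case "tokenExchange":
		if spec.TokenExchange == nil {
			errs = append(errs, errors.New("type tokenExchange requires tokenExchange config"))
		}
	case "headerInjection":
		if spec.HeaderInjection == nil {
			errs = append(errs, errors.New("type headerInjection requires headerInjection config"))
		}
	default:
		errs = append(errs, fmt.Errorf("unsupported auth type %q", spec.Type))
	}

	if spec.TokenExchange != nil && spec.HeaderInjection != nil {
		errs = append(errs, errors.New("tokenExchange and headerInjection are mutually exclusive"))
	}

	// errors.Join returns nil when errs is empty.
	return errors.Join(errs...)
}

// validateGroupRef mirrors the VirtualMCPServer rule that
// spec.config.groupRef must be set.
func validateGroupRef(groupRef string) error {
	if groupRef == "" {
		return errors.New("spec.config.groupRef is required")
	}
	return nil
}

func main() {
	bad := MCPExternalAuthConfigSpec{
		Type:            "tokenExchange",
		HeaderInjection: &HeaderInjectionConfig{HeaderName: "X-Example"},
	}
	fmt.Println("invalid spec:", validateExternalAuthConfig(bad))

	good := MCPExternalAuthConfigSpec{
		Type:          "tokenExchange",
		TokenExchange: &TokenExchangeConfig{TokenURL: "https://sts.example.com/token"},
	}
	fmt.Println("valid spec:", validateExternalAuthConfig(good))
	fmt.Println("empty groupRef:", validateGroupRef(""))
}
```

Keeping the rules in functions like these would let the same logic be called from an admission webhook and from the reconcile loop, so the two validation paths cannot drift apart.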

Proposed Solution

Option 1: Full Webhook Support (Recommended for Production)

  1. Add cert-manager as a dependency or optional integration
  2. Deploy ValidatingWebhookConfiguration via the Helm chart
  3. Create the webhook Service in the Helm chart
  4. Expose port 9443 in deployment
  5. Configure cert-manager Certificate resource

Option 2: Self-Signed Certificates (Development/Simple Deployments)

  1. Generate self-signed certificates at operator startup
  2. Mount emptyDir volume for certificate storage
  3. Deploy ValidatingWebhookConfiguration with caBundle injection
  4. Create webhook Service
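
Step 1 of this option could look roughly like the following, using only the Go standard library. This is a sketch under stated assumptions, not a proposed implementation: writeSelfSignedCert and run are hypothetical helpers, the DNS name is a placeholder for whatever the webhook Service ends up being called, and a temporary directory stands in for the mounted emptyDir volume. The returned PEM is what would be injected as the caBundle in step 3.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"os"
	"path/filepath"
	"time"
)

// writeSelfSignedCert generates a self-signed serving certificate for
// dnsName, writes tls.crt and tls.key into certDir, and returns the
// certificate PEM so it can be used as the webhook caBundle.
func writeSelfSignedCert(certDir, dnsName string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, fmt.Errorf("generate key: %w", err)
	}
	serial, err := rand.Int(rand.Reader, new(big.Int).Lsh(big.NewInt(1), 127))
	if err != nil {
		return nil, fmt.Errorf("generate serial: %w", err)
	}
	serial.Add(serial, big.NewInt(1)) // keep the serial strictly positive

	tmpl := &x509.Certificate{
		SerialNumber:          serial,
		Subject:               pkix.Name{CommonName: dnsName},
		DNSNames:              []string{dnsName},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(365 * 24 * time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		BasicConstraintsValid: true,
		IsCA:                  true, // self-signed: the cert is its own CA
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, fmt.Errorf("create certificate: %w", err)
	}
	keyDER, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, fmt.Errorf("marshal key: %w", err)
	}

	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER})

	if err := os.WriteFile(filepath.Join(certDir, "tls.crt"), certPEM, 0o644); err != nil {
		return nil, err
	}
	if err := os.WriteFile(filepath.Join(certDir, "tls.key"), keyPEM, 0o600); err != nil {
		return nil, err
	}
	return certPEM, nil
}

// run generates a certificate into a temporary directory (standing in for
// the emptyDir mount) and confirms the pair loads as a TLS key pair.
func run() error {
	dir, err := os.MkdirTemp("", "webhook-certs")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	// Placeholder Service DNS name; the real name depends on the chart.
	caBundle, err := writeSelfSignedCert(dir, "webhook-service.example-namespace.svc")
	if err != nil {
		return err
	}
	if _, err := tls.LoadX509KeyPair(filepath.Join(dir, "tls.crt"), filepath.Join(dir, "tls.key")); err != nil {
		return fmt.Errorf("reload key pair: %w", err)
	}
	fmt.Printf("wrote serving cert; caBundle is %d bytes of PEM\n", len(caBundle))
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

One trade-off worth noting: because the certificate is regenerated on every start, the caBundle in the ValidatingWebhookConfiguration has to be updated each time the operator pod restarts, which is part of why this option is framed as suitable for development and simple deployments.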

Option 3: Keep Webhooks Disabled (Current State)

  1. Document that webhooks are not functional
  2. Add controller-level validation for MCPExternalAuthConfig and VirtualMCPCompositeToolDefinition
  3. Accept that invalid resources can be created (will fail at runtime)

Current Workaround

Webhook registration has been disabled in cmd/thv-operator/main.go to allow the operator to start. The webhook server is not created.

References

  • controller-runtime v0.23.0 breaking change: Generic Validator and Defaulter
  • Webhook manifest location: config/webhook/manifests.yaml
  • Affected files:
    • cmd/thv-operator/main.go
    • cmd/thv-operator/api/v1alpha1/*_webhook.go

Labels

bug (Something isn't working), operator, vmcp (Virtual MCP Server related issues)
