Open
Labels
bug (Something isn't working), kubernetes (Items related to Kubernetes), operator
Bug Description
When the replica count of an operator-managed StatefulSet is scaled manually (e.g., increased from 1 to 3), the ToolHive operator automatically reverts it back to 1. This behavior prevents horizontal scaling of MCP servers.
Steps to Reproduce
- Deploy an MCP server via the ToolHive operator (creates a StatefulSet with 1 replica)
- Manually scale the StatefulSet:
kubectl scale statefulset <mcpserver-name> --replicas=3
- Observe that the operator reverts the replicas back to 1
Expected Behavior
The operator should NOT automatically revert manual scaling changes. The manually set replica count should persist.
Actual Behavior
The operator overrides the manual scaling and resets replicas to 1.
Root Cause
The MCPServer CRD lacks a replicas field to persist the desired replica state. Without this field, the operator has no way to know whether the replica count was intentionally changed, causing it to revert to its default state.
Proposed Solution
- Add a replicas field to the MCPServer CRD spec to allow users to declare the desired replica count
- Update the operator to respect this field and not override manual scaling changes
- The field should be optional with a default value of 1 for backward compatibility
Example:
apiVersion: toolhive.stacklok.dev/v1alpha1
kind: MCPServer
metadata:
  name: my-mcp-server
spec:
  replicas: 3 # New field
  image: my-image:latest
  # ... other fields
Additional Context
- For scaling purposes, typically only one proxy/runner is needed since it routes to a headless service and can load balance between pods in the StatefulSet
- This is especially relevant for stateless MCP servers or Streamable HTTP MCP servers where load balancing works well
- For stateful MCP servers, scaling considerations may be more complex
Environment
- ToolHive Operator version: v0.6.12
- Kubernetes version: v1.33.3+k3s1
Related discussion: The community has confirmed this is a bug and the operator should NOT revert manual scaling changes.