Description
Describe the Bug:
The VPC resource controller advertises extended resource capacity for branch ENIs (vpc.amazonaws.com/pod-eni) by patching node.Status.Capacity. However, node.Status.Allocatable remains unset, preventing the Kubernetes scheduler from scheduling pods that request these extended resources.
Observed Behavior:
When the controller advertises branch ENI capacity as an extended resource:
- node.Status.Capacity["vpc.amazonaws.com/pod-eni"] is set to the correct value by the controller
- node.Status.Allocatable["vpc.amazonaws.com/pod-eni"] remains unset
- Pods requesting vpc.amazonaws.com/pod-eni cannot be scheduled because the scheduler only considers allocatable resources
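To make the last point concrete, here is a minimal sketch of a resource-fit check. It is a simplified model for illustration, not the scheduler's actual code: the function name `fits` and the CPU figures are made up, and quantities are plain integers. The value 9 is the capacity used in the workaround further down.

```python
from typing import Dict

POD_ENI = "vpc.amazonaws.com/pod-eni"


def fits(allocatable: Dict[str, int],
         already_requested: Dict[str, int],
         pod_requests: Dict[str, int]) -> bool:
    """Simplified model of a resource-fit check (hypothetical helper).

    A resource missing from `allocatable` counts as zero, so any
    positive request for it is reported as insufficient.
    """
    for name, wanted in pod_requests.items():
        free = allocatable.get(name, 0) - already_requested.get(name, 0)
        if wanted > free:
            return False
    return True


# Node as described in this issue: capacity is patched, allocatable is not.
node_status = {
    "capacity": {"cpu": 4, POD_ENI: 9},
    "allocatable": {"cpu": 4},
}
pod_requests = {"cpu": 1, POD_ENI: 1}

print(fits(node_status["allocatable"], {}, pod_requests))  # False
```

Note that the capacity map is never consulted, which is why a correct Capacity value alone does not make the pod schedulable.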
Root Cause Analysis:
The VPC Resource Controller only patches Capacity, not Allocatable. This can be verified in the code:
- pkg/k8s/wrapper.go:185 - AdvertiseCapacityIfNotSet only sets node.Status.Capacity
- pkg/provider/branch/provider.go:267 - Branch provider calls this function
According to Kubernetes design, the kubelet automatically calculates Allocatable from Capacity during periodic status sync (every ~10 seconds by default). For extended resources, allocatable = capacity since no system reservations apply.
However, there appears to be a timing issue where:
- The controller patches Capacity after node initialization
- The kubelet's periodic sync may not properly populate Allocatable from the externally-patched Capacity
- This could be due to strategic merge patch behavior or the order of operations in kubelet's node status update logic
- Our specific kubelet configuration may be causing this issue
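For reference, the behavior we expect from the kubelet can be modeled roughly as below. This is a simplified sketch rather than kubelet code: `expected_allocatable` is a hypothetical name, the CPU numbers are invented, and quantities are plain integers.

```python
from typing import Dict

POD_ENI = "vpc.amazonaws.com/pod-eni"


def expected_allocatable(capacity: Dict[str, int],
                         reserved: Dict[str, int]) -> Dict[str, int]:
    """Allocatable = capacity minus any reservation, floored at zero."""
    return {
        name: max(total - reserved.get(name, 0), 0)
        for name, total in capacity.items()
    }


# Hypothetical node: CPU in millicores, plus the branch ENI count.
capacity = {"cpu": 4000, POD_ENI: 9}
reserved = {"cpu": 100}  # no reservation exists for extended resources

print(expected_allocatable(capacity, reserved))
# {'cpu': 3900, 'vpc.amazonaws.com/pod-eni': 9}
```

Under this model every key present in Capacity should also appear in Allocatable after a sync, which is the step that does not seem to happen on our nodes.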
Expected Behavior:
The scheduler should be able to schedule pods requesting vpc.amazonaws.com/pod-eni resources. For this to work, node.Status.Allocatable["vpc.amazonaws.com/pod-eni"] must be set to a non-zero value matching the capacity.
How to reproduce it (as minimally and precisely as possible):
- Enable Security Groups for Pods (ENABLE_POD_ENI=true)
- Wait for the controller to advertise branch ENI capacity
- Check node status: kubectl get node <node-name> -o json | jq '.status.capacity, .status.allocatable'
- Observe that capacity is set but allocatable is not
- Create a pod requesting vpc.amazonaws.com/pod-eni
- Pod remains in Pending state with event: "0/N nodes are available: N Insufficient vpc.amazonaws.com/pod-eni"
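The node status check above can be applied to a whole cluster with a small script. The sketch below assumes input shaped like the output of `kubectl get nodes -o json` (a list object with `items`); the helper name `affected_nodes` and the sample node names are hypothetical.

```python
import json
from typing import List

POD_ENI = "vpc.amazonaws.com/pod-eni"


def affected_nodes(node_list: dict) -> List[str]:
    """Return names of nodes where pod-eni is in capacity but not allocatable."""
    names = []
    for node in node_list.get("items", []):
        status = node.get("status", {})
        capacity = status.get("capacity", {}).get(POD_ENI)
        allocatable = status.get("allocatable", {}).get(POD_ENI)
        if capacity is not None and allocatable is None:
            names.append(node["metadata"]["name"])
    return names


# Made-up sample in the shape of `kubectl get nodes -o json`.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "node-a"},
   "status": {"capacity": {"cpu": "4", "vpc.amazonaws.com/pod-eni": "9"},
              "allocatable": {"cpu": "4"}}},
  {"metadata": {"name": "node-b"},
   "status": {"capacity": {"cpu": "4", "vpc.amazonaws.com/pod-eni": "9"},
              "allocatable": {"cpu": "4", "vpc.amazonaws.com/pod-eni": "9"}}}
]}
""")

print(affected_nodes(sample))  # ['node-a']
```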
Workaround:
Manually patch allocatable:
kubectl patch node <node-name> --subresource=status --type=merge -p '{"status":{"allocatable":{"vpc.amazonaws.com/pod-eni":"9"}}}'
Additional Context:
While the kubelet is designed to automatically calculate allocatable from capacity, this is not happening reliably in our stack.
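Until this is resolved, the manual workaround can be scripted so that the allocatable value is copied from the node's own capacity instead of being hardcoded. This is only a sketch: `allocatable_patch` and `patch_command` are hypothetical helpers, and the script only builds the kubectl arguments without running them.

```python
import json
from typing import List, Optional

POD_ENI = "vpc.amazonaws.com/pod-eni"


def allocatable_patch(node: dict) -> Optional[str]:
    """Build a merge patch that copies pod-eni capacity into allocatable.

    Returns None when capacity is not advertised yet or when
    allocatable already contains the resource.
    """
    status = node.get("status", {})
    capacity = status.get("capacity", {}).get(POD_ENI)
    if capacity is None or POD_ENI in status.get("allocatable", {}):
        return None
    return json.dumps({"status": {"allocatable": {POD_ENI: capacity}}})


def patch_command(node_name: str, patch: str) -> List[str]:
    """Argument list matching the manual kubectl workaround."""
    return ["kubectl", "patch", "node", node_name,
            "--subresource=status", "--type=merge", "-p", patch]


# Made-up node object in the shape of `kubectl get node <node-name> -o json`.
node = {
    "metadata": {"name": "node-a"},
    "status": {"capacity": {POD_ENI: "9"}, "allocatable": {}},
}

patch = allocatable_patch(node)
if patch is not None:
    print(" ".join(patch_command(node["metadata"]["name"], patch)))
```

The argument list could be passed to `subprocess.run` once the printed command looks right.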
Environment:
- Kubernetes version: 1.30+
- VPC Resource Controller Version: 1.7.x
- OS: Custom Ubuntu AMI on EKS