Pods requesting vpc.amazonaws.com/pod-eni cannot be scheduled - allocatable not set #637

@jfernandez

Describe the Bug:
The VPC resource controller advertises extended resource capacity for branch ENIs (vpc.amazonaws.com/pod-eni) by patching node.Status.Capacity. However, node.Status.Allocatable remains unset, preventing the Kubernetes scheduler from scheduling pods that request these extended resources.

Observed Behavior:
When the controller advertises branch ENI capacity as an extended resource:

  • node.Status.Capacity["vpc.amazonaws.com/pod-eni"] is set to the correct value by the controller
  • node.Status.Allocatable["vpc.amazonaws.com/pod-eni"] remains unset
  • Pods requesting vpc.amazonaws.com/pod-eni cannot be scheduled because the scheduler only considers allocatable resources
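To make the last point concrete, here is a deliberately simplified sketch of a resource fit check. It is not the scheduler's actual code: the function name fits and the plain integer maps are assumptions for illustration, and the real scheduler also subtracts what already-running pods have requested. The point it shows is that a resource absent from allocatable counts as zero, regardless of capacity.

```go
package main

import "fmt"

// fits is a hypothetical, simplified stand-in for the scheduler's resource
// fit check. It compares each requested quantity against the node's
// allocatable map only; capacity is never consulted. A resource that is
// missing from allocatable reads as zero.
func fits(requests, allocatable map[string]int64) (bool, []string) {
	var insufficient []string
	for name, want := range requests {
		if want > allocatable[name] {
			insufficient = append(insufficient, name)
		}
	}
	return len(insufficient) == 0, insufficient
}

func main() {
	const podENI = "vpc.amazonaws.com/pod-eni"

	requests := map[string]int64{podENI: 1}

	// The controller has patched capacity, but allocatable has no pod-eni entry.
	allocatable := map[string]int64{"cpu": 4}

	ok, insufficient := fits(requests, allocatable)
	fmt.Println(ok, insufficient)
}
```

With allocatable lacking the pod-eni key, the check fails for that resource, which matches the "Insufficient vpc.amazonaws.com/pod-eni" event seen on the pending pod.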

Root Cause Analysis:
The VPC Resource Controller only patches Capacity, not Allocatable. This can be verified in the code:

  • pkg/k8s/wrapper.go:185 - AdvertiseCapacityIfNotSet only sets node.Status.Capacity
  • pkg/provider/branch/provider.go:267 - Branch provider calls this function

According to Kubernetes design, the kubelet automatically calculates Allocatable from Capacity during periodic status sync (every ~10 seconds by default). For extended resources, allocatable = capacity since no system reservations apply.
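The relationship described above can be sketched as follows. This is an illustration of the arithmetic only, not the kubelet's implementation; deriveAllocatable and the integer maps are hypothetical names and types chosen for the example.

```go
package main

import "fmt"

// deriveAllocatable is a hypothetical sketch of "allocatable = capacity minus
// reservations". Resources with no reservation (such as extended resources)
// come out equal to their capacity. Results are floored at zero.
func deriveAllocatable(capacity, reserved map[string]int64) map[string]int64 {
	out := make(map[string]int64, len(capacity))
	for name, total := range capacity {
		value := total - reserved[name] // a missing reservation reads as 0
		if value < 0 {
			value = 0
		}
		out[name] = value
	}
	return out
}

func main() {
	const podENI = "vpc.amazonaws.com/pod-eni"

	capacity := map[string]int64{"cpu": 4, podENI: 9}
	reserved := map[string]int64{"cpu": 1} // nothing reserved for pod-eni

	allocatable := deriveAllocatable(capacity, reserved)
	fmt.Println("cpu:", allocatable["cpu"])
	fmt.Println(podENI+":", allocatable[podENI])
}
```

Under this model, a capacity of 9 for vpc.amazonaws.com/pod-eni should yield an allocatable of 9, which is the value that is missing on the affected nodes.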

However, this does not happen reliably in our environment. The sequence we observe, and the possible causes we have considered (none confirmed), are:

  1. The controller patches Capacity after node initialization
  2. The kubelet's periodic sync then fails to populate Allocatable from the externally patched Capacity
  3. Possible cause: strategic merge patch behavior, or the order of operations in the kubelet's node status update logic
  4. Possible cause: something specific to our kubelet configuration

Expected Behavior:
The scheduler should be able to schedule pods requesting vpc.amazonaws.com/pod-eni resources. For this to work, node.Status.Allocatable["vpc.amazonaws.com/pod-eni"] must be set to a non-zero value matching the capacity.

How to reproduce it (as minimally and precisely as possible):

  1. Enable Security Groups for Pods (ENABLE_POD_ENI=true)
  2. Wait for the controller to advertise branch ENI capacity
  3. Check node status: kubectl get node <node-name> -o json | jq '.status.capacity, .status.allocatable'
  4. Observe that capacity is set but allocatable is not
  5. Create a pod requesting vpc.amazonaws.com/pod-eni
  6. Pod remains in Pending state with event: "0/N nodes are available: N Insufficient vpc.amazonaws.com/pod-eni"
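Steps 3 and 4 can also be checked programmatically. The sketch below parses node JSON and lists every resource that appears in status.capacity but not in status.allocatable. The function name missingAllocatable and the embedded sample document are made up for illustration; the sample is hand-written, not output captured from a real node.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// nodeStatus models only the two fields of a Node object that matter here.
// Quantities are serialized as strings in the node JSON.
type nodeStatus struct {
	Status struct {
		Capacity    map[string]string `json:"capacity"`
		Allocatable map[string]string `json:"allocatable"`
	} `json:"status"`
}

// missingAllocatable returns the sorted names of resources present in
// status.capacity but absent from status.allocatable.
func missingAllocatable(nodeJSON []byte) ([]string, error) {
	var n nodeStatus
	if err := json.Unmarshal(nodeJSON, &n); err != nil {
		return nil, err
	}
	var missing []string
	for name := range n.Status.Capacity {
		if _, ok := n.Status.Allocatable[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Hand-written sample shaped like `kubectl get node <node-name> -o json`.
	sample := []byte(`{
	  "status": {
	    "capacity":    {"cpu": "4", "vpc.amazonaws.com/pod-eni": "9"},
	    "allocatable": {"cpu": "3920m"}
	  }
	}`)

	missing, err := missingAllocatable(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("in capacity but not allocatable:", missing)
}
```

To run this against a real node, the sample could be replaced with the output of the kubectl command from step 3, for example by reading standard input.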

Workaround:
Manually patch allocatable:

kubectl patch node <node-name> --subresource=status --type=merge -p '{"status":{"allocatable":{"vpc.amazonaws.com/pod-eni":"9"}}}'
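If the workaround needs to be applied across many nodes, the patch body can be generated instead of typed by hand. The sketch below only builds the JSON merge patch document passed to -p above; the helper name allocatablePatch is hypothetical, and sending the patch to the API server is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// allocatablePatch builds a JSON merge patch body that sets
// status.allocatable[resource] to count, serialized as a string quantity.
func allocatablePatch(resource string, count int64) (string, error) {
	body := map[string]map[string]map[string]string{
		"status": {
			"allocatable": {
				resource: strconv.FormatInt(count, 10),
			},
		},
	}
	raw, err := json.Marshal(body)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	patch, err := allocatablePatch("vpc.amazonaws.com/pod-eni", 9)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(patch)
}
```

The count should be taken from the node's own capacity value rather than hard-coded, since branch ENI capacity differs by instance type.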

Additional Context:
While the kubelet is designed to automatically calculate allocatable from capacity, this is not happening reliably in our stack.

Environment:

  • Kubernetes version: 1.30+
  • VPC Resource Controller Version: 1.7.x
  • OS: Custom Ubuntu AMI on EKS


Labels: bug (Something isn't working)
