feat: support network interface configuration#9027
feat: support network interface configuration#9027
Conversation
cfc53f3 to
ecb1e4f
Compare
nathangeology
left a comment
There was a problem hiding this comment.
Hope these comments help. Let me know if you have any questions.
| LabelInstanceAcceleratorCount = apis.Group + "/instance-accelerator-count" | ||
| LabelNodeClass = apis.Group + "/ec2nodeclass" | ||
| LabelInstanceTenancy = apis.Group + "/instance-tenancy" | ||
| LabelEFACount = apis.Group + "/instance-efa-count" |
There was a problem hiding this comment.
The new label karpenter.k8s.aws/instance-efa-count is added to WellKnownLabels and excluded from the nodepool CEL validation tests (users can't set it as a requirement directly). But in computeRequirements(), it's initially set to DoesNotExist and then conditionally overwritten with the actual count when networkInterfaces != nil. This means when a pod requests EFA resources without NodeClass network interface config, the label won't be set via requirements — it only gets set on the node via instanceToNodeClaim in cloudprovider.go. That asymmetry between scheduling requirements and actual node labels could cause issues if someone tries to use the label as a scheduling constraint.
There was a problem hiding this comment.
Its a fair point. What I ran into was that there is no way for Karpenter to know about EFA configurations for instances provisioned without NodeClass network interface configurations (EFAs based on pod resource requests are done during scheduling time).
The way I approached this was to support the scheduling label for static EFA configurations (i.e. preconfigured network interface configuration). Although, now that I'm thinking about it a bit more I think it makes sense to drop support for this as a label supporting scheduling in Karpenter and just keep it around as a informative label. Thoughts @jmdeal?
For some context, the reason we are adding this label is to provide a path forward for fixing this issue - aws/eks-charts#1239
There was a problem hiding this comment.
What if we made the presence of the label trigger the dynamic EFA provisioning path? We would handle syncing the label to the NodeClaim in a similar way to how we handle syncing the capacity reservation labels - we'd only apply the label if it was explicitly requested via the NodeClaim requirements. This works around the existing limitation where we can't represent that an instance type may have a label with a given set of values, or it may not have that label.
There was a problem hiding this comment.
That makes a lot of sense, I think thats probably the best way to do it. Changed
| LabelInstanceAcceleratorCount = apis.Group + "/instance-accelerator-count" | ||
| LabelNodeClass = apis.Group + "/ec2nodeclass" | ||
| LabelInstanceTenancy = apis.Group + "/instance-tenancy" | ||
| LabelEFACount = apis.Group + "/instance-efa-count" |
There was a problem hiding this comment.
What if we made the presence of the label trigger the dynamic EFA provisioning path? We would handle syncing the label to the NodeClaim in a similar way to how we handle syncing the capacity reservation labels - we'd only apply the label if it was explicitly requested via the NodeClaim requirements. This works around the existing limitation where we can't represent that an instance type may have a label with a given set of values, or it may not have that label.
|
Preview deployment ready! Preview URL: https://pr-9027.d18coufmbnnaag.amplifyapp.com Built from commit |
de1c210 to
8a4ebc3
Compare
ryan-mist
left a comment
There was a problem hiding this comment.
/karpenter snapshot
|
Snapshot successfully published to |
ryan-mist
left a comment
There was a problem hiding this comment.
/karpenter snapshot
|
Snapshot successfully published to |
| InstanceStorePolicy *InstanceStorePolicy `json:"instanceStorePolicy,omitempty"` | ||
| // NetworkInterfaces specifies the network interface configurations to be attached to provisioned instances. | ||
| // +kubebuilder:validation:XValidation:message="networkInterfaces must not have duplicate networkCardIndex and deviceIndex pairs",rule="self.all(x, self.filter(y, x.networkCardIndex == y.networkCardIndex && x.deviceIndex == y.deviceIndex).size() == 1)" | ||
| // +kubebuilder:validation:XValidation:message="networkInterfaces must include a primary interface with interfaceType='interface'",rule="self.size() == 0 || self.exists(x, x.deviceIndex == 0 && x.networkCardIndex == 0 && x.interfaceType == 'interface')" |
There was a problem hiding this comment.
Does it always need to be device index 0 or does it just need to be on NC 0?
There was a problem hiding this comment.
Yeah, it has to be primary network interface (NC=0, DI=0)
From docs, theres a table with Can be used as primary network interface for instance and EFA-only is no - https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa.html, and then in docs:
A primary network interface has a device index of 0.
bbcdb60 to
27fffb3
Compare
ryan-mist
left a comment
There was a problem hiding this comment.
/karpenter snapshot
|
Snapshot successfully published to |
ryan-mist
left a comment
There was a problem hiding this comment.
/karpenter snapshot
|
Snapshot successfully published to |
| return false | ||
| } | ||
| // (3) the configured number of device indices for a network card is greater than what the instance offers | ||
| if lo.FromPtr(info.NetworkInfo.NetworkCards[nci].MaximumNetworkInterfaces) <= networkInterface.DeviceIndex { |
There was a problem hiding this comment.
can we look this up by field instead of by position, maybe by lo.Find()? I think the AWS API returns an index field, and I assume that's intentional. this is the kind of thing that would be a nightmare to debug on an exotic instance type.
| Expect(cloudProviderNodeClaim.Labels).To(HaveKey(v1.LabelEFACount)) | ||
| }) | ||
| It("shouldn't include vpc.amazonaws.com/efa on a nodeclaim if it doesn't request it", func() { | ||
| It("shouldn't include vpc.amazonaws.com/efa on a nodeclaim if it doesn't request it", func() { // FAILS |
There was a problem hiding this comment.
do we just need a separate label for "has the capability at all" and "the provisioned count of EFA adapters?" I'm guessing most pods fill up the nodes entirely so it won't come up much, but it would be good to make the design super clear about this
Fixes #N/A
Description
karpenter.k8s.aws/instance-efa-countwhen instances are launched with EFA devicesHow was this change tested?
Manual Tests
Drift
Static Provisioning
Dynamic Provisioning
Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.