Add scheduling options and strategies in CA #9205
walidghallab wants to merge 2 commits into kubernetes:master from
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: walidghallab. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing

Hi @walidghallab. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 6155035 to 73eefa2.
/cc x13n

/uncc elmiko

/uncc aleksandra-malinowska
Force-pushed from 73eefa2 to 62531e3.
/ok-to-test
Force-pushed from 62531e3 to ad16378.
Force-pushed from 27ab375 to 526fd85.
Outdated review threads:
- cluster-autoscaler/simulator/clustersnapshot/scheduling_opts.go
- cluster-autoscaler/simulator/clustersnapshot/scheduling_opts.go
- cluster-autoscaler/simulator/clustersnapshot/predicate/plugin_runner.go
Force-pushed from 3901633 to 8dd10e5.
Force-pushed from a831585 to a4606c0.
To be more exact:
- An iterator to control the order of nodes to go through.
- Prefer the smallest number of iterations in case of multiple matches.
- Add a way of exiting early if no more nodes should be checked.
Force-pushed from a4606c0 to 6427281.
```go
// NodeOrderMapping defines the order in which nodes are iterated during scheduling simulation.
type NodeOrderMapping interface {
	// Init initializes the mapping with the list of nodes and the index of the last successful match.
	Init(collection []*framework.NodeInfo, lastMatch int)
```
I wonder if we could avoid passing lastMatch as an argument here, to avoid leaking abstractions. I understand the current implementation starts from last match + 1, but if we changed the semantics to last accessed + 1 instead, you could encapsulate the last-index logic entirely in lastIndexOrderMapping rather than splitting the responsibility between it and RunFiltersUntilPassingNode. The point of last match + 1 is to avoid starting from the same node every time, and that property is maintained if you change the semantics to last accessed + 1.
The thing is, I want to allow two strategies:
- One that starts from lastMatch + 1 (the current behavior).
- One that starts from lastMatch (this can be useful, for example, in the ASN bug we found). This is highly efficient while limiting spreading.

So the current approach is more flexible and allows both.
Makes sense. I wonder, though, whether in this case there should be a separate MarkLastMatch(int) method in the interface. Then you could drop the lastIndex field from the plugin runner; there is no need for it to be stored there.
I see what you mean. But in that case lastIndex won't be communicated between two different NodeOrderings, and some NodeOrderings would need to be global (e.g. one that starts from the last node and one that starts from last node + 1).
Maybe that is okay, but I wanted to keep the flexibility. Callers who don't need it (e.g. priorityNodeOrderMapping) can simply ignore it in Init calls.
Right, with the current code structure it is possible to sometimes pass one ordering and sometimes another. And you want to pass the information about the last match between these orderings? Actually, the whole last-index idea is quite fragile, because you have to ensure the list of nodes returned from the snapshot is identical between calls. Do you have a use case for running the predicates twice for the same pod, on the same snapshot, but with a different ordering?
> you have to ensure the list of nodes returned from snapshot is identical between the calls

True. The detailed answer is:
- In the basic snapshot, it is always changing (because it is based on a map without a cache, AFAICS).
- In the delta snapshot (which I assume cloud providers are using), it depends:
  - The part that comes from before Fork will always be at the beginning of nodeInfoList, in the same order (due to the cache in the parent snapshot).
  - The whole list will be the same unless add/remove/modify operations are done (due to the cache in the current snapshot).

> Do you have a use case for running the predicates twice for the same pod, on the same snapshot, but with different ordering?

No. Because once a pod is matched, I assume it will be cached for the rest of the CA loop.

> Actually the whole last index idea is quite fragile

I agree.
Okay, I have made the changes:
- Removed lastIndex from Init.
- Added a MarkMatch method to the interface.
- Added a global DefaultNodeOrdering.
- Renamed Init to Reset, because Init suggests it is only done once, while in reality it is called every time a new pod is being scheduled.

I made them in a separate commit so you can compare the diff.
Note: I made defaultNodeOrdering a field in the pluginRunner (not a global one):
- To avoid race conditions (e.g. when two snapshots are created and running in parallel).
- This way we can modify the default one in the future for a specific cloud provider (or even for just a specific instance of clustersnapshot) without affecting the others (e.g. by adding SetDefaultNodeOrdering, passing it in the constructor, and/or having it in AutoscalingOptions).
Force-pushed from 85b4d8b to 59f6b78.
To be more detailed, the changes are:
- Removed lastIndex from Init.
- Added a MarkMatch method to the interface.
- Stored defaultNodeOrdering (didn't make it global, to avoid race conditions between different clustersnapshots).
- Renamed Init to Reset, because Init suggests it is only done once, while in reality it is called every time a new pod is being scheduled.
Force-pushed from 59f6b78 to 6d86d3f.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Add scheduling options and strategies in CA.
To be more exact:
- An iterator to control the order of nodes to go through.
- Prefer the smallest number of iterations in case of multiple matches.
- Add a way of exiting early if no more nodes should be checked.
Why do we need it? Because currently RunFiltersUntilPassingNode has only one scheduling strategy: always start from the last node it tried. While this works in many cases, there are cases where we don't want this behavior.
Adding this option is a way to:
The only alternative that exists right now is to run CheckPredicates N times, but this is very slow because:
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: