feat: add opt-in scheduling gates for queue capacity control by devzizu · Pull Request #5033 · volcano-sh/volcano

devzizu · 2026-02-08T14:29:16Z

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds opt-in scheduling gates for queue capacity control to prevent cluster autoscalers from triggering unnecessary scale-ups when pods are waiting for queue admission rather than cluster capacity.

When a pod opts in via the scheduling.volcano.sh/queue-allocation-gate: "true" annotation, Volcano adds a scheduling gate that delays the Unschedulable condition until the queue has capacity.
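As a rough illustration of the opt-in step described above, here is a minimal sketch in plain Go. It is not the PR's webhook code: the function names `hasQueueAllocationGateAnnotation` and `withQueueAllocationGate` are hypothetical, gates are modelled as plain strings rather than Kubernetes API types, and treating only the exact value "true" as opting in is an assumption.

```go
package main

import "fmt"

// QueueAllocationGateKey is the annotation key (and gate name) quoted in this PR.
const QueueAllocationGateKey = "scheduling.volcano.sh/queue-allocation-gate"

// hasQueueAllocationGateAnnotation reports whether a pod opted in.
// Assumption: only the exact value "true" counts as opting in.
func hasQueueAllocationGateAnnotation(annotations map[string]string) bool {
	return annotations[QueueAllocationGateKey] == "true"
}

// withQueueAllocationGate returns the gate list a webhook-like step might
// produce: unchanged for pods that did not opt in, otherwise with the Volcano
// gate appended exactly once. Gates that already exist are preserved.
func withQueueAllocationGate(annotations map[string]string, gates []string) []string {
	if !hasQueueAllocationGateAnnotation(annotations) {
		return gates
	}
	for _, g := range gates {
		if g == QueueAllocationGateKey {
			return gates // already gated, keep the operation idempotent
		}
	}
	out := make([]string, 0, len(gates)+1)
	out = append(out, gates...)
	return append(out, QueueAllocationGateKey)
}

func main() {
	optIn := map[string]string{QueueAllocationGateKey: "true"}
	// "example.com/other-gate" is a made-up third-party gate name.
	fmt.Println(withQueueAllocationGate(optIn, []string{"example.com/other-gate"}))
	fmt.Println(withQueueAllocationGate(nil, nil))
}
```

Pods without the annotation pass through untouched, which is what keeps the feature backward compatible.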

The design is further described in #4727.

Which issue(s) this PR fixes:

Fixes #4710

Special notes for your reviewer:

This feature is opt-in via pod annotation to ensure backward compatibility.
A simple E2E test was added to validate capacity reservation behavior, but further test cases may be required.

Does this PR introduce a user-facing change?

Add opt-in scheduling gates for queue capacity control. Pods can now use the scheduling.volcano.sh/queue-allocation-gate: "true" annotation to prevent false autoscaler scale-ups when waiting for queue admission.

volcano-sh-bot · 2026-02-08T14:29:22Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign shinytang6 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2026-02-08T14:29:42Z

Summary of Changes

Hello @devzizu, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where Volcano's scheduler could cause unnecessary cluster autoscaler scale-ups by marking pods as 'Unschedulable' even when they were only waiting for queue capacity. By leveraging Kubernetes scheduling gates, it introduces an opt-in mechanism to delay the 'Unschedulable' condition until a pod is truly blocked by a lack of cluster resources, while also ensuring proper queue capacity accounting for pods that have passed initial admission but are awaiting node allocation.

Highlights

Opt-in Scheduling Gates for Queue Capacity Control: Introduces an opt-in mechanism using the scheduling.volcano.sh/queue-allocation-gate: "true" annotation. Pods with this annotation will have a Volcano-specific scheduling gate added by the MutatingAdmissionWebhook at creation time.
Asynchronous Gate Removal: The scheduler now asynchronously removes the Volcano scheduling gate from pods that fail predicate checks or prioritization (i.e., cannot find a suitable node). This signals to cluster autoscalers that a scale-up is needed, preventing false scale-ups when pods are merely waiting for queue admission.
Queue Capacity Reservation for Ungated Pods: Implements a new mechanism in the capacity plugin to reserve queue capacity for pods that have had their Volcano scheduling gate removed but are still pending. This prevents other pods from consuming the capacity intended for the ungated pod, addressing a potential race condition.
Gate Removal on Successful Bind: For pods that successfully pass all allocation checks and are bound to a node, the Volcano scheduling gate is removed during the binding operation.
Enhanced Scheduler Logic: Updates the scheduler's allocate action to manage the lifecycle of these scheduling gates, including skipping externally gated pods, queuing gate removals, and integrating with the new capacity reservation logic.


Changelog

docs/design/scheduling-gates-queue-admission.md
- Added a new design document detailing the motivation, proposal, and high-level implementation of gate-controlled scheduling for cluster autoscaler compatibility.
pkg/scheduler/actions/allocate/allocate.go
- Introduced asynchronous workers for scheduling gate removal to avoid blocking the main scheduler loop.
- Modified allocateResourcesForTasks to enqueue gate removal for pods that fail pre-predicate, predicate, or prioritization steps.
- Updated organizeJobWorksheet to only skip tasks with external scheduling gates, allowing Volcano-managed gates to be processed.
- Added Initialize and UnInitialize methods to manage the lifecycle of the async gate removal workers.
pkg/scheduler/api/helpers.go
- Added HasOnlyVolcanoSchedulingGate helper function to identify pods with only the Volcano-specific scheduling gate.
- Added HasQueueAllocationGateAnnotation helper function to check for the opt-in annotation.
pkg/scheduler/api/job_info.go
- Added RemoveGateDuringBind boolean field to TaskInfo to signal if a gate should be removed during the bind operation.
- Modified GetSchGatedPodResources to exclude pods with only the Volcano scheduling gate from resource deduction, ensuring they are counted in in-queue resources.
pkg/scheduler/api/types.go
- Defined CleanupReservationsFn type for functions that clean up reserved resources before a statement commit.
pkg/scheduler/cache/cache.go
- Updated the Bind method to remove the Volcano scheduling gate from a pod if task.RemoveGateDuringBind is true before performing the standard bind operation.
- Adjusted RecordJobStatusEvent to skip updating the PodScheduled condition if taskInfo.SchGated is true, respecting the scheduling gate's role.
pkg/scheduler/cache/util.go
- Added RemoveVolcanoSchGate function to programmatically remove the Volcano scheduling gate from a pod using a JSON patch.
- Added AddVolcanoSchGate function to programmatically add the Volcano scheduling gate to a pod using a JSON patch.
pkg/scheduler/framework/session.go
- Added cleanupReservationsFns map to store registered cleanup functions.
pkg/scheduler/framework/session_plugins.go
- Added AddCleanupReservationsFn to allow plugins to register cleanup logic.
- Implemented CleanupReservations method to execute all registered cleanup functions before a statement commit.
pkg/scheduler/framework/statement.go
- Added Name() and Task() methods to the operation struct for easier access to operation details.
pkg/scheduler/plugins/capacity/capacity.go
- Introduced queueGateReservedTasks cache to track pods that have passed capacity checks but are not yet allocated, reserving their queue capacity.
- Added addTaskToReservedCache, removeTaskFromReservedCache, and buildQueueReservedTasksCache methods to manage this reservation cache.
- Implemented queueAllocatableWithReserved to include reserved resources in capacity checks.
- Integrated the reservation cleanup mechanism by registering a CleanupReservationsFn in OnSessionOpen.
pkg/webhooks/admission/pods/mutate/mutate_pod.go
- Added patchSchedulingGates function to inject the scheduling.volcano.sh/queue-allocation-gate into a pod's spec.schedulingGates if the opt-in annotation is present.
staging/src/volcano.sh/apis/pkg/apis/scheduling/v1beta1/labels.go
- Defined QueueAllocationGateKey constant for the new scheduling gate annotation and name.
test/e2e/schedulingaction/allocate.go
- Added a new E2E test case to validate the queue capacity reservation behavior, demonstrating that an unschedulable pod with a removed gate reserves capacity and blocks other pods.
test/e2e/util/job.go
- Added Annotations field to TaskSpec to support setting annotations on pods created in E2E tests.
test/e2e/util/pod.go
- Added PodHasOnlyVolcanoSchedulingGate and HasSchedulingGatedCondition helper functions for E2E tests to check pod scheduling gate status.
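To make the capacity reservation idea from the capacity.go entries more concrete, here is a simplified sketch. It does not reproduce the plugin: `queueState`, `reserve`, `release`, and `allocatableWithReserved` are hypothetical names, resources are plain integer maps instead of Volcano's resource types, and two behaviours are assumptions of this sketch (a task's own reservation is not counted against it, and dimensions missing from the capability are treated as unbounded).

```go
package main

import "fmt"

// resources maps a resource name to a quantity (e.g. CPU in millicores).
type resources map[string]int64

// queueState is a toy model of one queue's accounting.
type queueState struct {
	capability resources            // upper bound for the queue
	allocated  resources            // resources already bound to nodes
	reserved   map[string]resources // task key -> request of ungated, pending tasks
}

// reserve records a pending, ungated task's request against the queue.
func (q *queueState) reserve(taskKey string, req resources) { q.reserved[taskKey] = req }

// release drops a reservation, e.g. once the task is bound or deleted.
func (q *queueState) release(taskKey string) { delete(q.reserved, taskKey) }

// allocatableWithReserved reports whether req fits once allocated resources
// and other tasks' reservations are taken into account.
func (q *queueState) allocatableWithReserved(taskKey string, req resources) bool {
	for name, limit := range q.capability {
		used := q.allocated[name] + req[name]
		for key, r := range q.reserved {
			if key != taskKey { // do not double-count the task's own reservation
				used += r[name]
			}
		}
		if used > limit {
			return false
		}
	}
	return true
}

func main() {
	q := &queueState{
		capability: resources{"cpu": 4000},
		allocated:  resources{"cpu": 1000},
		reserved:   map[string]resources{},
	}
	q.reserve("default/pod-a", resources{"cpu": 2000})
	fmt.Println(q.allocatableWithReserved("default/pod-b", resources{"cpu": 2000}))
	q.release("default/pod-a")
	fmt.Println(q.allocatableWithReserved("default/pod-b", resources{"cpu": 2000}))
}
```

In this model an ungated pod that is still pending keeps its share of the queue, so a second pod cannot slip in and push the queue over its capability while the first one waits for a node.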

Activity

The pull request introduces a new feature to address a known issue (Cluster Autoscaler node scale-up for Pods that exceed Queue's capability #4710) regarding cluster autoscaler compatibility.
A detailed design document (docs/design/scheduling-gates-queue-admission.md) was added to explain the motivation and implementation of this feature.
A simple E2E test was included to validate the new capacity reservation behavior.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a significant and well-designed feature to add opt-in scheduling gates for queue capacity control, improving compatibility with cluster autoscalers. While the overall design is comprehensive and includes a valuable E2E test, two significant security-related issues were identified: a data race in the asynchronous gate removal workers that could lead to a scheduler panic (Denial of Service), and a race condition in the JSON Patch logic for removing scheduling gates that could result in the unauthorized removal of security-related gates (Broken Access Control). Both of these critical security issues must be addressed before merging. Additionally, there are minor suggestions for a grammatical correction in the design document and replacing context.TODO() with context.Background() in background operations for improved context management.
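One common way to guard an index-based JSON Patch against the race described in this review is to put a `test` operation (RFC 6902) in front of the `remove`, so the patch is rejected if the gate at that index is no longer the expected one. The sketch below only builds such a patch document; it is not the fix adopted in this PR, and `buildGateRemovalPatch` and `patchOp` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const volcanoGate = "scheduling.volcano.sh/queue-allocation-gate"

// patchOp is a single RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value,omitempty"`
}

// buildGateRemovalPatch returns a JSON Patch that removes gateName from
// spec.schedulingGates, or nil if the gate is not present. The leading "test"
// op makes the whole patch fail if the entry at that index changed in the
// meantime, instead of silently removing a different gate.
func buildGateRemovalPatch(gates []string, gateName string) ([]byte, error) {
	for i, g := range gates {
		if g != gateName {
			continue
		}
		base := fmt.Sprintf("/spec/schedulingGates/%d", i)
		return json.Marshal([]patchOp{
			{Op: "test", Path: base + "/name", Value: gateName},
			{Op: "remove", Path: base},
		})
	}
	return nil, nil
}

func main() {
	// "example.com/other-gate" is a made-up third-party gate name.
	patch, err := buildGateRemovalPatch([]string{"example.com/other-gate", volcanoGate}, volcanoGate)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

A caller would typically re-read the pod and rebuild the patch when the server rejects it because the `test` op failed.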

pkg/scheduler/cache/util.go

pkg/scheduler/actions/allocate/allocate.go

docs/design/scheduling-gates-queue-admission.md

pkg/scheduler/actions/allocate/allocate.go

pkg/scheduler/cache/util.go

JesseStutler · 2026-02-10T09:36:45Z

I still have some opinions on the proposal; please take a look at #4727, @devzizu. Thanks. We may need to discuss it further at the community meeting.

Copilot

Pull request overview

This PR implements opt-in scheduling gates for queue capacity control to prevent cluster autoscalers from triggering unnecessary scale-ups when pods are waiting for queue admission rather than cluster capacity. The solution addresses issue #4710 by using Kubernetes scheduling gates to delay the Unschedulable condition until pods pass queue capacity checks.

Changes:

Added opt-in annotation scheduling.volcano.sh/queue-allocation-gate: "true" for pods to participate in gate-based queue admission control
Implemented asynchronous gate removal workers in the allocate action to remove gates when pods pass capacity checks
Enhanced capacity plugin with reserved task tracking to prevent race conditions where ungated-but-unscheduled pods could allow queue over-allocation
Added comprehensive E2E tests and design documentation

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 9 comments.

Show a summary per file

File	Description
staging/src/volcano.sh/apis/pkg/apis/scheduling/v1beta1/labels.go	Added QueueAllocationGateKey constant for the opt-in annotation
pkg/scheduler/api/helpers.go	Added helper functions to check for Volcano scheduling gates and annotations
pkg/webhooks/admission/pods/mutate/mutate_pod.go	Added webhook logic to inject scheduling gates for opted-in pods
pkg/scheduler/actions/allocate/allocate.go	Implemented async worker infrastructure and gate removal logic in allocate action
pkg/scheduler/plugins/capacity/capacity.go	Added reserved task cache to account for ungated pods in capacity checks
pkg/scheduler/framework/statement.go	Added Name() and Task() methods to operation struct for cleanup callbacks
pkg/scheduler/framework/session.go	Added cleanupReservationsFns map to session
pkg/scheduler/framework/session_plugins.go	Added AddCleanupReservationsFn and CleanupReservations methods
pkg/scheduler/api/types.go	Added CleanupReservationsFn function type
pkg/scheduler/api/job_info.go	Modified GetSchGatedPodResources to exclude Volcano-only gated pods
pkg/scheduler/cache/util.go	Added RemoveVolcanoSchGate utility function for gate removal
pkg/scheduler/cache/cache.go	Added synchronous gate removal before bind as safety guarantee
test/e2e/schedulingaction/allocate.go	Added comprehensive E2E test for capacity reservation behavior
test/e2e/util/pod.go	Added helper functions for checking gate status in tests
test/e2e/util/job.go	Added Annotations field to TaskSpec for test configuration
pkg/scheduler/util/test_utils.go	Updated test utilities to handle SubGroupPolicy without NetworkTopology
pkg/scheduler/actions/allocate/allocate_test.go	Added unit test for SubGroupPolicy without NetworkTopology
docs/design/scheduling-gates-queue-admission.md	Added comprehensive design document explaining the feature
docs/user-guide/version-compatibility-archive.md	Added version compatibility archive document
README.md	Updated Kubernetes compatibility table


pkg/scheduler/actions/allocate/allocate.go

pkg/scheduler/cache/util.go

pkg/webhooks/admission/pods/mutate/mutate_pod.go

pkg/scheduler/plugins/capacity/capacity.go

pkg/scheduler/actions/allocate/allocate.go

pkg/scheduler/cache/cache.go

test/e2e/util/pod.go

hajnalmt

This implementation is more than solid, thank you @devzizu! When Jesse comes back we can schedule a meeting.

Please squash the commits and let's keep the design doc in the separate PR!

pkg/scheduler/actions/allocate/allocate.go

hajnalmt · 2026-02-18T09:51:32Z

test/e2e/schedulingaction/allocate.go

 		gomega.Expect(err).NotTo(gomega.HaveOccurred())
 	})
+
+	ginkgo.It("Unschedulable pod with removed gate reserves queue capacity and blocks other pods", func() {


This is a really good test! It's so cool that it covers the webhook injection, the gate removal, the capacity reservation, and the reservation cleanup all in one go.

We should maybe also add a test case checking that we don't accidentally remove other scheduling gates: a pod with multiple scheduling gates, one of which is the Volcano gate (with the annotation set), shouldn't be scheduled.

Added a new test case, can you please see if it fits the purpose? One part I'm not entirely happy with is the check that both gates remain and the pod stays Pending. Right now I'm using gomega.Consistently over 10 seconds (polling every second) to assert that the pod still has both gates and is Pending. It works, but it feels a bit arbitrary (fixed window and interval) and doesn't give a clear, deterministic guarantee that the scheduler has run and decided to skip the task.
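The multi-gate case discussed in this thread comes down to telling a pod that carries only the Volcano gate apart from one that also carries a third-party gate. A minimal sketch of that distinction follows, assuming gates are modelled as plain strings; the real `HasOnlyVolcanoSchedulingGate` helper works on pod objects, and `hasExternalSchedulingGate` is a hypothetical name introduced here.

```go
package main

import "fmt"

const volcanoGate = "scheduling.volcano.sh/queue-allocation-gate"

// hasOnlyVolcanoSchedulingGate is true when the Volcano gate is the sole gate.
func hasOnlyVolcanoSchedulingGate(gates []string) bool {
	return len(gates) == 1 && gates[0] == volcanoGate
}

// hasExternalSchedulingGate is true when any gate other than the Volcano gate
// is present; such a pod should be skipped and left untouched.
func hasExternalSchedulingGate(gates []string) bool {
	for _, g := range gates {
		if g != volcanoGate {
			return true
		}
	}
	return false
}

func main() {
	cases := [][]string{
		nil,
		{volcanoGate},
		{volcanoGate, "example.com/other-gate"}, // made-up third-party gate
	}
	for _, gates := range cases {
		fmt.Printf("%v only-volcano=%t external=%t\n",
			gates, hasOnlyVolcanoSchedulingGate(gates), hasExternalSchedulingGate(gates))
	}
}
```

Unit-testing a predicate like this is deterministic, which may complement the time-window check in the E2E test.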

pkg/scheduler/actions/allocate/allocate.go

hajnalmt

/ok-to-test

This is quite well written. I don't see a reason not to test it. Let's see.


commit f4db07d
Merge: 1000067 01823d1
Author: Volcano Bot <[email protected]>
Date: Fri Feb 13 11:26:57 2026 +0800

    Merge pull request volcano-sh#5039 from volcano-sh/copilot/update-kubernetes-compatibility-map

    Reorganize Kubernetes compatibility matrix: show 5 latest versions with newest first

commit 01823d1
Author: Jesse Stutler <[email protected]>
Date: Fri Feb 13 10:20:07 2026 +0800

    Add version compatibility archive and update README with reference link

commit 1000067
Merge: 6ba0e29 56c6901
Author: Volcano Bot <[email protected]>
Date: Thu Feb 12 16:49:57 2026 +0800

    Merge pull request volcano-sh#5038 from JesseStutler/fix_4871

    Fixed issue where jobs with subgroups but not hard networkTopology.mode could not be scheduled.

commit f22bea4
Author: copilot-swe-agent[bot] <[email protected]>
Date: Thu Feb 12 08:42:26 2026 +0000

    Reorganize Kubernetes compatibility table: show only 5 latest versions with newest on top and left

    Co-authored-by: JesseStutler <[email protected]>

commit 56c6901
Author: JesseStutler <[email protected]>
Date: Thu Feb 12 16:05:35 2026 +0800

    Fixed issue where jobs with subgroups but not hard networkTopology.mode could not be scheduled.

    Signed-off-by: zhengchenyu <[email protected]>
    Signed-off-by: JesseStutler <[email protected]>

commit 9125f3e
Author: copilot-swe-agent[bot] <[email protected]>
Date: Thu Feb 12 06:15:55 2026 +0000

    Add Kubernetes compatibility for Volcano v1.13 and v1.14

    Co-authored-by: JesseStutler <[email protected]>

commit f4e0f66
Author: copilot-swe-agent[bot] <[email protected]>
Date: Thu Feb 12 06:14:39 2026 +0000

    Initial plan

Signed-off-by: devzizu <[email protected]>


volcano-sh-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 8, 2026

volcano-sh-bot requested a review from hajnalmt February 8, 2026 14:29

volcano-sh-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 8, 2026

volcano-sh-bot requested review from JesseStutler, huone1 and kevin-wangzefeng February 8, 2026 14:29

volcano-sh-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 8, 2026

devzizu force-pushed the ft-proposal-4710 branch from 1f0980d to 7d15238 Compare February 8, 2026 14:30

devzizu marked this pull request as ready for review February 8, 2026 14:31

volcano-sh-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 8, 2026

volcano-sh-bot requested review from alcorj-mizar, merryzhou and yuanchen8911 February 8, 2026 14:31

gemini-code-assist bot reviewed Feb 8, 2026


devzizu force-pushed the ft-proposal-4710 branch 2 times, most recently from 4077068 to 6e3165d Compare February 8, 2026 15:14

devzizu mentioned this pull request Feb 8, 2026

[Proposal] Gate-Controlled Scheduling for Cluster Autoscalers Compatibility #4727

Open

Copilot AI review requested due to automatic review settings February 15, 2026 12:24

Copilot started reviewing on behalf of devzizu February 15, 2026 12:24 View session

devzizu force-pushed the ft-proposal-4710 branch from 0f96f9a to 75d95dc Compare February 15, 2026 12:25

volcano-sh-bot added the do-not-merge/contains-merge-commits label Feb 15, 2026

Copilot AI reviewed Feb 15, 2026


devzizu marked this pull request as draft February 15, 2026 12:33

volcano-sh-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 15, 2026

devzizu force-pushed the ft-proposal-4710 branch from d0bbf95 to e50491c Compare February 15, 2026 14:35

volcano-sh-bot removed the do-not-merge/contains-merge-commits label Feb 15, 2026

hajnalmt reviewed Feb 18, 2026


volcano-sh-bot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Feb 18, 2026

devzizu added 20 commits February 18, 2026 14:40

feat: add scheduling gates for queue admission control

ca0549c

Signed-off-by: devzizu <[email protected]>

fix: revert log changes to master branch

81e8a4b

Signed-off-by: devzizu <[email protected]>

test: add utils and use pods instead of vcjobs

6e1f3cd

Signed-off-by: devzizu <[email protected]>

refactor: enhance E2E test descriptions for clarity and context

9db0c05

Signed-off-by: devzizu <[email protected]>

refactor: enhance gate removal configuration

552201f

Signed-off-by: devzizu <[email protected]>

refactor: update scheduling gate references and logging levels for consistency

08b2203

Signed-off-by: devzizu <[email protected]>

refactor: rename scheduling gate utility function for clarity

b9526b0

Signed-off-by: devzizu <[email protected]>

fix: rebase design with v1.14 release

df00646

Signed-off-by: devzizu <[email protected]>

refactor(docs): ensure context lines match the new v1.14 release

1295e67

Signed-off-by: devzizu <[email protected]>

fix: use context.Background() and remove dead code

3cadd88

Signed-off-by: devzizu <[email protected]>

fix: ensure gate is removed after allocation check

7cff403

Signed-off-by: devzizu <[email protected]>

refactor: replace boolean flag with sync.Once for worker initialization

01650c7

Signed-off-by: devzizu <[email protected]>

fix: add error checks and improve logging

d5bad0f

Signed-off-by: devzizu <[email protected]>

tests: remove unused HasSchedulingGatedCondition helper

cde7513

Signed-off-by: devzizu <[email protected]>

fix: remove candidate.SchGated check when adding to reserved cache

cf4e73c

Signed-off-by: devzizu <[email protected]>

fix: add shutdownOnce to ensure UnInitialize body runs only once

3c6e5ec

Signed-off-by: devzizu <[email protected]>

fix: update patch operation for scheduling gates to preserve existing values

9b14a8d

Signed-off-by: devzizu <[email protected]>

tests: add e2e test for third-party gates

82a619c

Signed-off-by: devzizu <[email protected]>

fix: add warning log when users do not set annotation

c23023f

Signed-off-by: devzizu <[email protected]>

devzizu force-pushed the ft-proposal-4710 branch from b2c7960 to c23023f Compare February 18, 2026 14:40

volcano-sh-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 18, 2026

docs: exclude design from feature branch

0e40cee

Signed-off-by: devzizu <[email protected]>

devzizu force-pushed the ft-proposal-4710 branch from 4c3064c to 0e40cee Compare February 18, 2026 14:44

fix: ensure idempotent initialization of async gate removal workers

1d6ae98

Signed-off-by: devzizu <[email protected]>
