
util: add GroupLock for two-group mutual exclusion #6183

Open
gadididi wants to merge 1 commit into ceph:devel from gadididi:util/add_group_locking

Conversation


@gadididi gadididi commented Mar 16, 2026

Add a GroupLock type that allows operations within the same group to run concurrently while blocking operations from the other group. This will be used to coordinate NVMe-oF Stage and Unstage operations: multiple Stage calls can run in parallel,
multiple Unstage calls can run in parallel, but Stage and Unstage block each other.

This can be used in any place that needs mutual exclusion between two groups of operations.

The implementation is based on Keane and Moir's "group mutual exclusion algorithm", but simplified for exactly two groups using Golang condition variables.
Operations in group A wait if any group B operations are active, and vice versa.
When the last operation in a group completes, all waiting operations in the other group are woken up together.

Also added tests that simulate parallel workers:

  • Multiple same-group operations run concurrently
  • Different groups block each other
  • No deadlock under heavy contention
  • Mutual exclusion is never violated
  • All waiting operations start together when unblocked
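The two-group scheme described above can be sketched in Go with a sync.Cond roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the Group/GroupA/GroupB names, the per-group counter array, and the contention check in main are all made up here for demonstration.

```go
package main

import (
	"fmt"
	"sync"
)

// Group identifies one of the two operation groups (names assumed).
type Group int

const (
	GroupA Group = 0
	GroupB Group = 1
)

// GroupLock lets any number of operations from one group run
// concurrently while operations from the other group wait.
type GroupLock struct {
	mu     sync.Mutex
	cond   *sync.Cond
	active [2]int // active operation count per group
}

func NewGroupLock() *GroupLock {
	gl := &GroupLock{}
	gl.cond = sync.NewCond(&gl.mu)
	return gl
}

// Lock blocks while any operation of the other group is active,
// then marks the caller active in group g.
func (gl *GroupLock) Lock(g Group) {
	gl.mu.Lock()
	defer gl.mu.Unlock()
	for gl.active[1-g] > 0 {
		gl.cond.Wait()
	}
	gl.active[g]++
}

// Unlock marks the caller done; when the group's count drains to
// zero, every waiter from the other group is woken at once.
func (gl *GroupLock) Unlock(g Group) {
	gl.mu.Lock()
	defer gl.mu.Unlock()
	gl.active[g]--
	if gl.active[g] == 0 {
		gl.cond.Broadcast()
	}
}

func main() {
	gl := NewGroupLock()
	var wg sync.WaitGroup
	var checkMu sync.Mutex
	violations := 0

	// Many workers from both groups under contention; mutual
	// exclusion between the groups must never be violated.
	for i := 0; i < 50; i++ {
		g := Group(i % 2)
		wg.Add(1)
		go func(g Group) {
			defer wg.Done()
			gl.Lock(g)
			gl.mu.Lock() // peek at the counters for the check only
			bad := gl.active[1-g] > 0
			gl.mu.Unlock()
			if bad {
				checkMu.Lock()
				violations++
				checkMu.Unlock()
			}
			gl.Unlock(g)
		}(g)
	}
	wg.Wait()
	fmt.Println("violations:", violations) // prints "violations: 0"
}
```

Note the Broadcast (rather than Signal) when a group drains: that is what makes all waiting operations of the other group start together, matching the last bullet above.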

Related PR

#6163 - once this phase-1 PR is merged, the next phase adds a group lock for the connect/disconnect operations.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.


These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@gadididi gadididi requested a review from nixpanic March 16, 2026 13:36
@gadididi gadididi self-assigned this Mar 16, 2026
@gadididi gadididi added the component/util Utility functions shared between CephFS and RBD label Mar 16, 2026
@gadididi gadididi force-pushed the util/add_group_locking branch from d07791d to 667953c Compare March 16, 2026 14:58
@gadididi gadididi requested review from Madhu-1 and Rakshith-R March 16, 2026 14:58
Member

@nixpanic nixpanic left a comment


Sounds like what we can use.

I wonder if this is overkill for the rare cases where this would be an issue? Is there an alternative that would be simpler?

In theory this could cause blocking for a long time. If one group (say NodeStageVolume) gets called often, in quick succession, a NodeUnstageVolume call could be blocked until all NodeStageVolume calls are handled. I don't think that will be problematic for our use now, but it may be something to document so other users are warned?


gadididi commented Mar 17, 2026

@nixpanic, Hi!

Is there an alternative that would be simpler?

I searched for a group mutual exclusion utility in Go, but I did not find one.

In theory this could cause blocking for a long time

Right. I can make it more fair with a kind of queue (like in the original algorithm here: https://dl.acm.org/doi/10.1145/301308.301319).
That way we would definitely not have any starvation. The big question is whether it is really common for hundreds of pods to be created together.
I can document it.
I will figure out how difficult/over-kill it would be to add the queue.

UPDATE ABOUT THE QUEUE:

Adding fairness (like a FIFO queue) would make the code significantly more complex: we would need to manage a wait queue, allocate queue entries, and handle wakeup ordering.
Also, I think Go's own sync.RWMutex doesn't guarantee fairness; writers can starve readers and vice versa.

I'll add a comment documenting this limitation so future users are aware.
something like:

// GroupLock does not guarantee fairness. Under heavy load from one group,
// the other group may experience temporary delays.

what do you think?


In our case, when a few pods are created/deleted at the same time, I think there is a big chance for:

time | description
T1 | NodeStageVolume() for podA is called; it tries to run nvme connect but sees that it is already connected - ✅
T2 | NodeUnstageVolume() for podB is called; it sees no more mounted paths exist, so it runs nvme disconnect - ✅
T3 | NodeStageVolume() for podA then continues and tries to mount the staging path, but it fails - the NVMe device does not exist anymore - ❌
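The T1 → T3 interleaving above can be forced deterministically in a small sketch. This is a toy simulation under assumptions: simulate, the connected flag, and the channel-based ordering are stand-ins invented here, not the real CSI calls or nvme state.

```go
package main

import "fmt"

// simulate forces the T1 -> T2 -> T3 ordering with channels and
// returns what podA's mount attempt would report.
func simulate() string {
	connected := true // podA's volume is already nvme-connected
	t1 := make(chan struct{})
	t2 := make(chan struct{})
	result := make(chan string)

	go func() {
		// T1: NodeStageVolume for podA sees an existing connection,
		// so it skips running nvme connect.
		_ = connected
		close(t1)
		<-t2 // T3 only runs after the unstage finished
		// T3: podA's stage continues and tries to mount the staging
		// path, but the device was torn down in the meantime.
		if !connected {
			result <- "mount failed: nvme device is gone"
			return
		}
		result <- "mounted"
	}()

	<-t1
	// T2: NodeUnstageVolume for podB sees no mounted paths left and
	// runs nvme disconnect, dropping the shared device.
	connected = false
	close(t2)

	return <-result
}

func main() {
	fmt.Println(simulate()) // prints "mount failed: nvme device is gone"
}
```

With a group lock, T2 would be held back until the whole Stage group (T1 + T3) finished, so the disconnect could not land between the connection check and the mount.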

@gadididi gadididi requested a review from nixpanic March 17, 2026 08:08
@nixpanic
Member

There is no need to implement fairness/queuing. I do not expect significant issues for the intended use-case (NVMe-oF connect/disconnect). The number of operations that will get blocked should be minimal as (un)staging volumes is not done in a continuous stream anyway.

A note like the proposed

// GroupLock does not guarantee fairness. Under heavy load from one group,
// the other group may experience temporary delays.

would be very good to have as a reminder though.

Add GroupLock type that allows operations within
the same group to run concurrently while blocking operations
from the other group. This will be used to coordinate
NVMe-oF Stage and Unstage operations - multiple
Stage calls can run in parallel,
multiple Unstage calls can run in parallel,
but Stage and Unstage block each other.

This can be used in any place that needs locking
between two groups of operations.

The implementation is based on Keane and Moir's
"group mutual exclusion algorithm",
but simplified for exactly two groups using Golang condition
variables.
Operations in group A wait if any group B operations are
active, and vice versa.
When the last operation in a group completes, all
waiting operations in the other group are woken up together.

Also added tests that simulate parallel workers:
- Multiple same-group operations run concurrently
- Different groups block each other
- No deadlock under heavy contention
- Mutual exclusion is never violated
- All waiting operations start together when unblocked

Signed-off-by: gadi-didi <[email protected]>
@gadididi gadididi force-pushed the util/add_group_locking branch from 667953c to 5574e4f Compare March 17, 2026 13:57