core: add dynamic managed storage dispatcher#1081

Open
sp-viktori wants to merge 7 commits into oVirt:master from sp-viktori:managed-storage-dispatcher

Conversation


@sp-viktori sp-viktori commented Oct 16, 2025

Add an optional adapter field in the managed storage driver definition which will redirect all managed storage requests to a third party executable. This allows third party vendors to implement a direct storage interface instead of going through OpenStack/Cinder.

A similar dispatch mechanism is done for Vdsm in a separate PR.

Changes introduced with this PR

  • Add an optional adapter field in the managed storage driver_info map.

  • If a managed storage has the new "adapter" field, the CinderlibExecutor will direct all requests to an {adapter}-adapter helper instead of running the cinderlib-client.py helper. This helper is intended to be a vendor-provided executable implementing the same protocol as the default Cinderlib client.
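The dispatch rule above can be sketched in a few lines (Python here for brevity; the actual change lives in the Java CinderlibExecutor, and the helper names mirror the ones used in this PR):

```python
import json

# Default helper used when no vendor adapter is configured (per this PR).
DEFAULT_ADAPTER = "cinderlib-client.py"

def resolve_adapter(driver_info_json):
    """Pick the helper executable for a managed storage request.

    A driver_info map with an "adapter" field dispatches to the
    vendor-provided "<adapter>-adapter" executable; otherwise the
    stock cinderlib client is used.
    """
    info = json.loads(driver_info_json)
    if "adapter" in info:
        return str(info["adapter"]) + "-adapter"
    return DEFAULT_ADAPTER
```

For example, a driver_info of `{"adapter": "storpool"}` dispatches to `storpool-adapter`, while a map without the field keeps the default client.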

Are you the owner of the code you are sending in, or do you have permission of the owner?

Yes.

Author

I think I misread the code ownership question, so I've changed the answer to "yes" (originally I thought it was asking about the code I'm changing, not the code I'm sending in).

try {
    Map<String, Object> info = JsonHelper.jsonToMap(driverInfo);
    if (info.containsKey("adapter")) {
        adapterFile = info.get("adapter") + "-adapter";
Member

The adapter will have no extension?

Author

I was considering keeping .py, but then someone might write their plugin in some other language? It should work with anything that follows the protocol.

Our integration (although using Python as well) has no extensions since we install the executables as scripts (project.scripts in pyproject.toml).
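For reference, an extension-less console script of that sort comes from a PEP 621 entry point in pyproject.toml; the entry-point name and module path here are purely illustrative, not taken from the actual integration:

```toml
[project.scripts]
# Installs an extension-less executable "storpool-adapter" on PATH;
# the module path is a hypothetical example.
storpool-adapter = "storpool_ovirt.adapter:main"
```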

Member

Correct. Fine for me!

EngineLocalConfig config = EngineLocalConfig.getInstance();
private static final Logger log = LoggerFactory.getLogger(CinderlibExecutor.class);
private static final String CINDERLIB_PREFIX = "./cinderlib-client.py";
private static final String DEFAULT_ADAPTER = "cinderlib-client.py";
Member

Name it CINDERLIB_ADAPTER

Author

Done (also renamed the file as part of cinderlib -> managedblock refactoring)

Member

And to make everything cleaner we should in fact refactor all of this so it's not cinderlib/cinderlibexecutor anymore.

Author

I started refactoring cinderlib -> managedblock.

There are some more (heavier) changes that I need to do as well, but first I wanted to test this "midpoint" where I've only renamed Java objects. Next push will include the accompanying scripts as well.

I plan to keep the cinderlib-client.py unchanged since it is indeed a CinderLib client. But probably will rename the directory that contains it (in the packaging) from cinderlib to managedblock just like I did for the Java package.

Another thing is the CINDERLIB_DB: this is obviously required when adding a CinderLib-backed Managed Block Storage, but our StorPool integration also depends on this DB to keep a mapping between oVirt UUIDs and our StorPool global IDs (we prefix our tables so they don't clash, and we might even use a PostgreSQL schema to further isolate them).

In light of this I'm considering renaming that to a more generic MANAGEDBLOCK_DB, does this sound good?

Member

cinderlib-client.py can indeed stay as name, as it has only cinderlib code.
And I agree on renaming the directory.

MANAGEDBLOCK_DB sounds good!

Author

This ended up being a bit of a rabbit hole, but it's not as bad as it looks 🤞

There are still a couple of things to figure out and test properly:

  1. Config (Setup) migration from all the *CINDERLIB* vars to the new *MANAGEDBLOCK* vars. I think if I add a setup step somewhere in one of the early stages I can detect the old cinderlib confs and migrate things (there is likely no need to be too smart about it and try to rename the cinderlib db itself).
  2. engine backup and restore - I think if I fix point 1 above this will work out of the box, but I need to test it. If I understand how things work, this is absolutely required for a hosted engine upgrade, since it always restores from a backup?
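The config migration in point 1 could look roughly like this (a sketch under the assumption that the rename is a plain key substitution; the variable names are illustrative):

```python
def migrate_cinderlib_vars(conf):
    """Rewrite legacy *CINDERLIB* setup variables to *MANAGEDBLOCK*.

    Keys already written under the new name win over migrated ones,
    so re-running the step is harmless.
    """
    migrated = dict(conf)
    for key in list(migrated):
        if "CINDERLIB" in key:
            value = migrated.pop(key)
            migrated.setdefault(key.replace("CINDERLIB", "MANAGEDBLOCK"), value)
    return migrated
```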

If you have any hints about best way to proceed with either point, it will be much appreciated 🙏

Member

  1. I think it should be ok indeed to just parse the current values and write the new ones. The cinderlib db itself is used for mapping between the cinderlib disk uuid and the ovirt disk uuid? If that is the case, it would be cleaner to rename it, yes. But perhaps those things can be done in separate commits:
  • Commit that converts everything to managedblock in the code
  • Commit to rename the db
  • Commit that adds the dispatcher
  • etc.
    This to keep commits small :)
  2. Backup/Restore should in fact not take care of the rename, as after a restore you NEED to run engine-setup, and there it should then rename the stuff :)

Author

The cinderlib db itself is used for mapping between the cinderlib disk uuid and the ovirt disk uuid?

For CinderLib the uuid mapping is a small part, I think it has the full Cinder state there in many more tables. But it is all CinderLib-specific. Our integration just uses it for the UUID mapping (and a schema version tracking table).

Regarding the commits, I've tried to keep them focused on a single thing but will review again if I can split them further. I'm trying to at the very least keep the build working without errors on each commit.

Regarding engine-backup, if the restore happens before engine-setup runs, does that mean that I need to make engine-backup aware of this refactoring as well?

If I understand correctly, it happens like this:

  1. We have oVirt 4.5.7 with cinderlib, using ovirt config settings such as CINDERLIB_DB_HOST.
  2. The admin decides to upgrade. The following is how I think the process happens (but likely wrong):
  3. (The admin?) makes a backup using engine-backup. This includes cinderlib because it sees OVESETUP_CL_DB/enable and the resulting file is CINDERLIBDB_BACKUP_FILE_NAME -> cinderlib_backup.db.
  4. The VM is powered off and replaced with a fresh one using 4.5.8
  5. First engine-backup is run with --restore. At this point if the engine-backup script is not aware of the 4.5.7 setup & config options it will not restore the managed block DB (since it looks for MANAGEDBLOCK_BACKUP_FILE_NAME which is now managedblock_backup.db).
  6. engine-setup (which is aware of the old cinderlib naming) will correctly update the config options (once I add this step, not developed yet) but there will be no pre-upgrade DB being restored, so things won't work without manual intervention.

I think I need to make engine-backup aware of the older config options as well?
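The fallback in engine-backup (a shell script in reality; Python here only to illustrate the lookup order, with file names taken from the steps above) could be:

```python
import os

# Dump file names from the discussion above: the new name first,
# then the pre-rename one produced by a 4.5.7 backup.
DB_BACKUP_NAMES = ("managedblock_backup.db", "cinderlib_backup.db")

def find_db_backup(backup_dir):
    """Return the managed block DB dump inside an extracted backup,
    falling back to the legacy cinderlib file name, or None."""
    for name in DB_BACKUP_NAMES:
        path = os.path.join(backup_dir, name)
        if os.path.exists(path):
            return path
    return None
```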

Member

That sounds about right.

Author

Thank you! I'll check aaaupgrade.py and push again once I have the restore/migration stuff ready for another review. Might take a while to test things 🤞

Author

Pushed new changes with the additional migration steps:

  1. I've brought back most of the engine-backup code that was dealing with the legacy cinderlib DB, but it now only runs when restoring from backup.
  2. I've added a cinderlib_rename plugin in engine-setup that will trigger as a fallback if the (new) managed block settings were not loaded. In this case it will look for the legacy Cinderlib settings and try to migrate them. It will also do the usual DB test to verify that the DB is accessible before accepting these settings.

I've tested the migrate flow on my dev setup:

  1. Installed 4.5.7, set it up using the legacy cinderlib settings.
  2. Made a backup.
  3. On a 4.5.8 running from my branch, made a clean install.
  4. Did engine-backup, which restored the legacy cinderlib DB + config options.
  5. Did engine-setup, which loaded the legacy cinderlib options and renamed them to managedblock; this resulted in the new config files being created as well.
  6. Started oVirt engine to verify the managed storage is there, VMs can be started, etc.

Next I'll do some more test flows (like brand new deployment directly with the new managed block settings, backup and restore with the new settings).

Author

Pushed a small update to the refactoring with the last suggested edits together with these two additional changes:

  1. Found a small bug in the libvirt XML DOM building: all "BLOCK" type managed block devices were treated as if they had a SCSI WWN. Now, if the relevant "scsi_wwn" field is not in the attachment info (which comes from the VDS), the XML builder won't try to populate it in the DOM. This didn't cause any operational issues, but there was an unhandled exception in the engine log. This fix is a separate commit in the PR.
  2. Found a legacy class that was not renamed (CinderVolumeDriver); the bug above actually pointed me to this class...
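The guard in point 1 amounts to checking the attachment info before emitting anything into the DOM. A minimal sketch, using Python's ElementTree; the element and field names are only illustrative of the real engine code:

```python
import xml.etree.ElementTree as ET

def add_scsi_wwn(disk_elem, attachment_info):
    """Emit a serial element only when the VDS-provided attachment
    info actually carries a "scsi_wwn" field (BLOCK devices may not)."""
    wwn = attachment_info.get("scsi_wwn")
    if wwn:
        ET.SubElement(disk_elem, "serial").text = wwn
```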

@sp-viktori sp-viktori force-pushed the managed-storage-dispatcher branch from a1b1746 to f7ffbcd Compare January 28, 2026 12:55
Member

@dupondje dupondje left a comment

Nice work!

Looks good. Except, can you split the CinderlibExecutor part and the dynamic managed storage part into 2 commits (can be on the same PR)?
Just so we can clearly see what is refactoring and what is additional code.

Add missing /repository to the default .gitignore (present in
.gitignore.in).

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
@sp-viktori sp-viktori force-pushed the managed-storage-dispatcher branch 2 times, most recently from dc6213e to 6dc277f Compare February 4, 2026 08:30
Member

I would also drop the old functions.

Author

Good catch, added drops for the old functions (by name and arguments) in the latest update.

Member

That sounds about right.

@sp-viktori sp-viktori force-pushed the managed-storage-dispatcher branch from 6dc277f to 2dfcfdd Compare February 19, 2026 15:01
Member

One remark. I think that the "10-setup-cinderlib-database.conf" isn't deleted anywhere? So you will end up with duplicate config (albeit with different property names).

Author

Yes, along with the other one, "10-setup-cinderlib.conf". I'll see if I can add a custom transaction to delete them on commit (or rather, rename them to their backup form so this is reversible). Ultimately this would be something FileTransaction from otopi provides out of the box, but this change set is already a monster, so I'm not sure about adding another PR...

Author

I've added a "commit" stage plugin during transaction end to check for these legacy files and rename them as if they were backed up and deleted. Only happens if the setup stage plugin detected the legacy option and managed to rename them.
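A sketch of that commit-stage cleanup (the file names come from this thread, but the ".bak" suffix and plugin wiring are assumptions; the real code goes through otopi's transaction machinery):

```python
import os

# Legacy files named in this thread; they are renamed, not deleted,
# so the migration stays reversible.
LEGACY_CONF_FILES = (
    "10-setup-cinderlib.conf",
    "10-setup-cinderlib-database.conf",
)

def retire_legacy_confs(conf_dir, migrated):
    """At transaction commit, move legacy cinderlib conf files aside
    with a backup-style suffix; no-op unless the setup stage actually
    migrated the legacy options."""
    renamed = []
    if not migrated:
        return renamed
    for name in LEGACY_CONF_FILES:
        path = os.path.join(conf_dir, name)
        if os.path.exists(path):
            os.rename(path, path + ".bak")  # ".bak" suffix is an assumption
            renamed.append(path + ".bak")
    return renamed
```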

Member

Sorry, lost track of this 😓
That last change looks good, I've got no more remarks.

Author

No worries! Also, I found some Python style violations, and I think I managed to break some of the unit tests, so I need to double-check everything and run the tests again.

On the topic of testing, apart from my manual tests and the unit tests, do you have something more exhaustive/end-to-end in mind?

Author

Just pushed an update fixing the python style violations.

On the topic of the failing unit tests, it turned out to be interference from my VS Code environment remote-ssh-ing into the same dev machine where I ran the tests manually. VS Code apparently broke the dev env: the underlying Eclipse(?) JDT failed to build some of the classes, causing unit tests to fail with "missing class" errors.

I ran the test suite again (without VS Code being connected), then did another manual test (restore from 4.5.7 backup and test the migration and then do some VM actions). No issues.

Refactor "CinderlibExecutor" classes to a more generic
"ManagedBlockExecutor".

Rename the log directory previously only meant to hold the CinderLib log
file to a more generic "managedblock" log directory.

Rename the data directory previously only meant to hold the CinderLib
client to a more generic "managedblock" directory. The existing
CinderLib client remains `cinderlib-client.py.in` as well as its log
file name and etc subdirectory containing strictly CinderLib-related
files.

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
Refactor CinderStorageDao and related classes to ManagedBlockStorageDao.

Rename the cinder_storage table in the engine database to mb_storage,
including related stored procedures and SQL migration scripts.

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
@sp-viktori sp-viktori force-pushed the managed-storage-dispatcher branch from 2dfcfdd to f5edf72 Compare March 10, 2026 15:04
Rename CinderLib DB setup to a more generic "ManagedBlock".

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
Rename cinderlib DB to managedblock DB in the engine-backup script.

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
Do not populate GUID metadata for BLOCK type attachments if they don't
have the `scsi_wwn` field populated.

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
Add an optional "adapter" field in the managed storage driver definition
which will redirect all managed storage requests to a third party
executable.

Signed-off-by: Viktor Ivanov <viktor.ivanov@storpool.com>
@sp-viktori sp-viktori force-pushed the managed-storage-dispatcher branch from f5edf72 to 08b6074 Compare March 18, 2026 08:20