Prevent OLM upgrades from overwriting operator-config ConfigMap #2441

Draft
ELENAGER wants to merge 2 commits into RamenDR:main from ELENAGER:DFBUGS-440-new


ELENAGER (Member) commented Mar 2, 2026

Problem: During OLM upgrades, the operator-config ConfigMap gets re-applied from the bundle/CSV content. As a result, user-managed fields in RamenConfig (inside the S3 configuration) disappear after upgrades because they are replaced by the default config shipped with the bundle.

Fix: Split defaults from user configuration. OLM now generates a defaults ConfigMap with a -default suffix, while the operator manages a separate user ConfigMap at runtime. On startup (and after upgrades), the operator reads the *-default ConfigMap as the current defaults, then:

  • if a user ConfigMap exists, it preserves the user’s S3 settings by merging S3StoreProfiles into the defaults and updates the user ConfigMap if needed;

  • if no user ConfigMap exists, it creates one from the defaults.

This ensures OLM upgrades only update the defaults ConfigMap, while user S3 configuration is preserved across upgrades.
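
The two branches above can be sketched in Go roughly as follows. This is only an illustration, not the code in this PR: `RamenConfig` and `S3StoreProfile` are trimmed-down stand-ins for the real types, their fields other than `S3StoreProfiles` are assumptions, and `resolveConfig` is a hypothetical helper name.

```go
package main

import "fmt"

// S3StoreProfile and RamenConfig are simplified stand-ins for the real
// Ramen types; only the fields needed for the illustration are kept.
type S3StoreProfile struct {
	ProfileName string
	Bucket      string
}

type RamenConfig struct {
	MaxConcurrentReconciles int
	S3StoreProfiles         []S3StoreProfile
}

// resolveConfig is a hypothetical helper. user is nil when no user
// ConfigMap exists. It returns the effective config and whether the user
// ConfigMap still has to be created.
func resolveConfig(defaults RamenConfig, user *RamenConfig) (RamenConfig, bool) {
	if user == nil {
		// No user ConfigMap yet: start from the defaults as they are.
		return defaults, true
	}

	// User ConfigMap exists: take the current defaults and carry over the
	// user's S3 profiles. The slice is copied so that the two configs do
	// not share a backing array.
	merged := defaults
	merged.S3StoreProfiles = append([]S3StoreProfile(nil), user.S3StoreProfiles...)

	return merged, false
}

func main() {
	defaults := RamenConfig{MaxConcurrentReconciles: 50}
	user := &RamenConfig{
		MaxConcurrentReconciles: 10,
		S3StoreProfiles:         []S3StoreProfile{{ProfileName: "s3-east", Bucket: "ramen"}},
	}

	merged, create := resolveConfig(defaults, user)
	fmt.Println(merged.MaxConcurrentReconciles, len(merged.S3StoreProfiles), create)

	fresh, create := resolveConfig(defaults, nil)
	fmt.Println(fresh.MaxConcurrentReconciles, len(fresh.S3StoreProfiles), create)
}
```

Note that in this sketch every field other than the S3 profiles comes from the defaults, which matches the description above but means other user edits would not survive the merge.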

Fixes: https://issues.redhat.com/browse/DFBUGS-440

nirs (Member) commented Mar 3, 2026

Fix: Stop generating/owning the ConfigMap via OLM manifests and instead have the Ramen operator create/update it at runtime. When updating, the operator merges the existing ConfigMap’s user-defined S3 section into the current default RamenConfig and writes the merged config back, preserving user settings across upgrades.

Why does the ramen operator need to create or update the config?

Please explain these flows:

  1. New install via OLM
  2. New install using ramendev (or manually running make deploy*)
  3. Upgrade with user installed configmap

Ideally the flows should be documented in docs/configmap.md.

How can ramen tell if a user wants to keep a default option? For example, if a boolean option is unset, how can ramen tell whether the unset option in the user configmap should be false or should use the value from the ramen default config?
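
The ambiguity in this question can be shown with a small Go sketch. It uses `encoding/json` from the standard library only to stay self-contained (the real config may be decoded differently), and the `plainOpts`, `ptrOpts`, `decodePtr`, and `effective` names are hypothetical. A plain `bool` field decodes to `false` both when the key is absent and when it is explicitly `false`; a `*bool` field keeps the two cases apart.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plainOpts uses a plain bool: an absent key and an explicit false both
// decode to false, so "unset" cannot be detected afterwards.
type plainOpts struct {
	Enabled bool `json:"enabled"`
}

// ptrOpts uses *bool: an absent key leaves the pointer nil, an explicit
// false yields a pointer to false.
type ptrOpts struct {
	Enabled *bool `json:"enabled,omitempty"`
}

// decodePtr decodes a JSON document into ptrOpts.
func decodePtr(doc string) (ptrOpts, error) {
	var o ptrOpts
	err := json.Unmarshal([]byte(doc), &o)

	return o, err
}

// effective returns the user's value when one was set, else the default.
func effective(user *bool, def bool) bool {
	if user != nil {
		return *user
	}

	return def
}

func main() {
	for _, doc := range []string{`{}`, `{"enabled": false}`} {
		var p plainOpts
		if err := json.Unmarshal([]byte(doc), &p); err != nil {
			panic(err)
		}

		q, err := decodePtr(doc)
		if err != nil {
			panic(err)
		}

		fmt.Printf("doc=%s plain=%v set=%v effective(default true)=%v\n",
			doc, p.Enabled, q.Enabled != nil, effective(q.Enabled, true))
	}
}
```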

ELENAGER (Member, Author) commented Mar 3, 2026

Flows

1) New install via OLM

  • OLM installs the operator and creates the defaults ConfigMap (with the -default suffix).
  • On startup, the operator:
    1. loads built-in/file defaults,
    2. reads the OLM *-default ConfigMap (becomes the current defaults),
    3. looks for the user ConfigMap (non-default name).
  • If the user ConfigMap does not exist yet, the operator creates it from the current defaults.
  • Result: two ConfigMaps:
    • *-default (owned/updated by OLM)
    • user ConfigMap (created/updated by the operator; safe for user edits)

2) New install via ramendev / make deploy*

  • No OLM, so the *-default ConfigMap does not exist.
  • On startup, the operator:
    1. loads built-in/file defaults,
    2. does not find *-default → keeps built-in defaults,
    3. looks for the user ConfigMap.
  • If the user ConfigMap does not exist yet, the operator creates it from built-in defaults.
  • Result: usually only the user ConfigMap exists, initialized from built-in defaults.
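
The fallback in steps 1 and 2 of this flow could look roughly like the sketch below. The `getFunc` type, the `errNotFound` sentinel, and `currentDefaults` are hypothetical stand-ins for a Kubernetes client lookup; a real operator would more likely check for a not-found API error instead of a local sentinel.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for the API server's
// "not found" error.
var errNotFound = errors.New("configmap not found")

// getFunc abstracts the ConfigMap lookup: it returns the config payload
// stored under the given ConfigMap name.
type getFunc func(name string) (string, error)

// currentDefaults returns the payload of the "<name>-default" ConfigMap when
// it exists, and falls back to the file defaults when it does not. Any other
// lookup error is returned to the caller.
func currentDefaults(get getFunc, name, fileDefaults string) (string, error) {
	data, err := get(name + "-default")

	switch {
	case err == nil:
		return data, nil
	case errors.Is(err, errNotFound):
		return fileDefaults, nil
	default:
		return "", fmt.Errorf("reading defaults configmap: %w", err)
	}
}

func main() {
	withOLM := func(name string) (string, error) { return "defaults from " + name, nil }
	withoutOLM := func(name string) (string, error) { return "", errNotFound }

	for _, get := range []getFunc{withOLM, withoutOLM} {
		got, err := currentDefaults(get, "operator-config", "defaults from mounted file")
		if err != nil {
			panic(err)
		}

		fmt.Println(got)
	}
}
```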

3) Upgrade with a user-installed ConfigMap

  • During upgrade, OLM re-applies/updates only the *-default ConfigMap.
  • The user ConfigMap remains (not owned by OLM anymore).
  • On restart, the operator:
    1. reads *-default as the new defaults,
    2. reads the existing user ConfigMap,
    3. merges by preserving S3StoreProfiles from the user config into the new defaults.
  • If needed, it updates the user ConfigMap with the merged result.
  • Result: new defaults are picked up while user S3 settings are preserved across upgrades.
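
One way to decide whether the update in this flow is "needed" is to compare the stored payload with the merged result after normalizing both. The sketch below is an assumption about how that could be done, not the PR's implementation: the types are simplified, JSON stands in for the real serialization, and `needsUpdate` is a hypothetical name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Simplified stand-ins for the real config types.
type S3StoreProfile struct {
	ProfileName string `json:"profileName"`
	Bucket      string `json:"bucket"`
}

type RamenConfig struct {
	MaxConcurrentReconciles int              `json:"maxConcurrentReconciles,omitempty"`
	S3StoreProfiles         []S3StoreProfile `json:"s3StoreProfiles,omitempty"`
}

// needsUpdate reports whether the stored payload differs from the desired
// config. The stored payload is decoded and re-encoded first, so differences
// in whitespace or key order do not trigger a write.
func needsUpdate(stored []byte, desired RamenConfig) (bool, error) {
	var current RamenConfig
	if err := json.Unmarshal(stored, &current); err != nil {
		return false, fmt.Errorf("decoding stored config: %w", err)
	}

	currentJSON, err := json.Marshal(current)
	if err != nil {
		return false, fmt.Errorf("encoding stored config: %w", err)
	}

	desiredJSON, err := json.Marshal(desired)
	if err != nil {
		return false, fmt.Errorf("encoding desired config: %w", err)
	}

	return !bytes.Equal(currentJSON, desiredJSON), nil
}

func main() {
	desired := RamenConfig{
		MaxConcurrentReconciles: 50,
		S3StoreProfiles:         []S3StoreProfile{{ProfileName: "s3-east", Bucket: "ramen"}},
	}

	// Same content as desired, but with an older default value.
	stored := []byte(`{"maxConcurrentReconciles": 40,
		"s3StoreProfiles": [{"profileName": "s3-east", "bucket": "ramen"}]}`)

	update, err := needsUpdate(stored, desired)
	if err != nil {
		panic(err)
	}

	fmt.Println("update needed:", update)
}
```

Skipping the write when nothing changed avoids bumping the ConfigMap's resource version on every restart.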

nirs (Member) commented Mar 3, 2026

Flows

1) New install via OLM

  • OLM installs the operator and creates the defaults ConfigMap (with the -default suffix).

  • On startup, the operator:

    1. loads built-in/file defaults,

So we store all defaults in the executable as constants?
What about controller type (dr-hub, dr-cluster)?

    2. reads the OLM *-default ConfigMap (becomes the current defaults),

Why do we need to read the defaults from this new ConfigMap if we store the defaults in the executable (based on step 1)?

    3. looks for the user ConfigMap (non-default name).
  • If the user ConfigMap does not exist yet, the operator creates it from the current defaults.

Why do we need to create the ConfigMap? We already know the defaults, and there is no user config, so we use the defaults.

  • Result: two ConfigMaps:

    • *-default (owned/updated by OLM)
    • user ConfigMap (created/updated by the operator; safe for user edits)

A user-owned ConfigMap makes sense, since the user owns the configuration. Ramen should not create or update this ConfigMap since it does not own it.

2) New install via ramendev / make deploy*

  • No OLM, so the *-default ConfigMap does not exist.

ramendev can create the -default ConfigMap, but since we know the defaults, there is no need to create it.

  • On startup, the operator:

    1. loads built-in/file defaults,
    2. does not find *-default → keeps built-in defaults,
    3. looks for the user ConfigMap.
  • If the user ConfigMap does not exist yet, the operator creates it from built-in defaults.

  • Result: usually only the user ConfigMap exists, initialized from built-in defaults.

This proves that the -default ConfigMap is not needed: we can start successfully without it.

3) Upgrade with a user-installed ConfigMap

  • During upgrade, OLM re-applies/updates only the *-default ConfigMap.
  • The user ConfigMap remains (not owned by OLM anymore).

Makes sense. This is the actual fix: OLM never touches the user config.

return nil, err
}

defaultsCfg = cfg
Member

s/defaultsCfg/mergedCfg

Member Author

Done


log.Info("created configmap", "namespace", newCM.Namespace, "name", newCM.Name)

return defaultRamenConfig, nil
Member

return defaultsCfg instead

Member Author

Done

// Desired Ramen config: defaults + preserve S3 from existing CM
desiredRamenConfig := defaultsCfg
desiredRamenConfig.S3StoreProfiles = existingCfg.S3StoreProfiles

Member

We need to do the following for the other fields:

  1. if existingCfg doesn't have the field populated, then use the defaultsCfg value
  2. if existingCfg has any value (even the zero value for that type), then use the value from existingCfg
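
One possible way to get these two rules without per-field code is to decode the user's document on top of a copy of the defaults: fields absent from the document keep the default, while fields that are present, even with a zero value, replace it. The sketch below shows the idea with `encoding/json` from the standard library; the `config` type, its fields other than `S3StoreProfiles`, and the `overlay` helper are hypothetical, and the real config may use a different decoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified, hypothetical stand-ins for the real config types.
type S3StoreProfile struct {
	ProfileName string `json:"profileName"`
	Bucket      string `json:"bucket"`
}

type config struct {
	FeatureEnabled          bool             `json:"featureEnabled"`
	MaxConcurrentReconciles int              `json:"maxConcurrentReconciles"`
	S3StoreProfiles         []S3StoreProfile `json:"s3StoreProfiles"`
}

// overlay decodes userDoc on top of a copy of defaults. Keys missing from
// userDoc leave the default untouched; keys that are present win, even when
// their value is false, 0, or an empty list.
func overlay(defaults config, userDoc []byte) (config, error) {
	merged := defaults

	// Copy the slice first: json.Unmarshal reuses the existing backing
	// array, which would otherwise overwrite the caller's defaults.
	merged.S3StoreProfiles = append([]S3StoreProfile(nil), defaults.S3StoreProfiles...)

	if err := json.Unmarshal(userDoc, &merged); err != nil {
		return config{}, fmt.Errorf("decoding user config: %w", err)
	}

	return merged, nil
}

func main() {
	defaults := config{FeatureEnabled: true, MaxConcurrentReconciles: 50}

	// The user explicitly disables the feature and sets nothing else.
	merged, err := overlay(defaults, []byte(`{"featureEnabled": false}`))
	if err != nil {
		panic(err)
	}

	fmt.Printf("%+v\n", merged)
}
```

This relies on the user ConfigMap containing only what the user wrote. If the operator writes the full merged config back into the user ConfigMap, every field looks "populated" on the next upgrade and new defaults are no longer picked up.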

ELENAGER (Member, Author) commented Mar 3, 2026

So we store all defaults in the executable as constants? What about controller type (dr-hub, dr-cluster)?

  • Defaults are not compiled in; they come from the mounted config file (that’s what we pass in as defaultRamenConfig). The controller type (hub vs dr-cluster) is also derived from that startup config file.

Why do we need to read the defaults from this new ConfigMap if we store the defaults in the executable (based on step 1)?

  • Because we don’t store defaults in the executable; the file is just the baseline. We read the *-default ConfigMap so that the defaults can be owned and updated by OLM (i.e., the defaults that match the installed bundle/CSV). That way, upgrades can change defaults without changing the image or the mounted file, and the operator still has the file as a fallback if the defaults ConfigMap isn’t present.

Why do we need to create the ConfigMap? We already know the defaults, and there is no user config, so we use the defaults.

  • I don't have a good answer; maybe @raghavendra-talur can help me. My only guess is that it makes the effective config visible and persistent in-cluster (easy to inspect, back up, and debug).

A user-owned ConfigMap makes sense, since the user owns the configuration. Ramen should not create or update this ConfigMap since it does not own it.

  • Ramen only creates or changes the user ConfigMap.

ramendev can create the -default ConfigMap, but since we know the defaults, there is no need to create it.

  • The ramendev flow is tested and works in its current form. No alterations are needed.

This proves that the -default ConfigMap is not needed: we can start successfully without it.

  • I also keep the *-default ConfigMap to minimize changes in the OLM bundle/CSV.
    Instead of removing ConfigMap handling from the bundle entirely (a bigger refactor with higher risk), I only:
    • rename/split the ConfigMap so OLM owns a *-default defaults ConfigMap, and
    • let the operator merge the user ConfigMap.

That keeps the bundle structure largely the same, reduces upgrade risk, and still fixes the “OLM overwrites user config” problem. It also provides a cluster-visible object showing which defaults the bundle ships, which support/QE can inspect.

ELENAGER force-pushed the DFBUGS-440-new branch 2 times, most recently from 04dc32a to 6fcfe57 on March 4, 2026 09:09
3 participants