Simplified rollout triggers and CRD design doc#34959
Conversation
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
doc/developer/design/20260209_simplified_rollout_triggers_and_crd.md
Outdated
Show resolved
Hide resolved
| ###### v1alpha1 to v1alpha2: | ||
| - Spec fields: | ||
| - `forcePromote: Uuid` becomes `forcePromote: Option<String>` (nil UUID becomes None) | ||
| - `requestRollout` is removed. | ||
| - Status fields: | ||
| - `lastCompletedRolloutRequest` and `resourcesHash` are removed. | ||
| - If `lastCompletedRolloutRequest` and `spec.requestRollout` match: | ||
| - This means we weren't in the middle of a rollout. | ||
| - `lastCompletedRolloutHash` and `requestedRolloutHash` should both be set to the calculated hash (after conversion). This should avoid triggering a rollout during the migration. | ||
| - Else: | ||
| - `lastCompletedRolloutHash` should be set to `None` and `requestedRolloutHash` should be set to the calculated hash (after conversion). In this case, we likely have an in-progress rollout, which we will destroy and replace. |
There was a problem hiding this comment.
If I'm reading this right, when lastCompletedRolloutRequest != spec.requestRollout, the conversion sets lastCompletedRolloutHash to None, which would restart the rollout from scratch. What we might want to consider is what happens if the existing rollout is already partway through promoting when the orchestratord upgrade happens?
There was a problem hiding this comment.
If we're in promoting status, we unconditionally call the promotion for the current rollout. After updating the status after completing promotion, if anything is changed, it should trigger another rollout.
I'm pretty sure this does the right thing as-is, but this is probably something we want to write a test for.
I'll update the doc to call this out.
There was a problem hiding this comment.
@doy-materialize I've just pushed an update that addresses Bobby's feedback. As I was working my way through the promoting status scenario, I realized that maybe we need to have status.requestedRolloutHash be an Option. In cases where we're converting and the v1alpha1 is mid-rollout and already promoting, we don't want to set status.requestedRolloutHash to the current calculated hash. We have no way of knowing the hash of the spec that reached promoting status, so we should set it to None.
bobbyiliev
left a comment
There was a problem hiding this comment.
This looks good to me! Will leave the final check to @doy-materialize, though.
| Introduce a new `v1alpha2` version of the Materialize CRD with the following changes: | ||
|
|
||
| **Spec changes:** | ||
| - Remove `requestRollout` (`Uuid`) - Rollouts are now triggered automatically when the spec hash changes. |
There was a problem hiding this comment.
Nice! this has always bugged me
|
|
||
| The `forcePromote` field is excluded, since it is used to promote the existing generation, and we don't want to tear that down every time it changes. | ||
|
|
||
| Fields **included** in the hash (changes trigger rollout): |
There was a problem hiding this comment.
We know this, so just commenting for completeness sake: whenever we build this, we should add this info into the docs
Simplified rollout triggers and CRD design doc ### Motivation * This PR adds a known-desirable feature. Part of https://linear.app/materializeinc/issue/DEP-7/design-simplifying-upgrade-rollouts-node-rolls-converted-to-project ### Tips for reviewer <!-- Leave some tips for your reviewer, like: * The diff is much smaller if viewed with whitespace hidden. * [Some function/module/file] deserves extra attention. * [Some function/module/file] is pure code movement and only needs a skim. Delete this section if no tips. --> ### Checklist - [x] This PR has adequate test coverage / QA involvement has been duly considered. ([trigger-ci for additional test/nightly runs](https://trigger-ci.dev.materialize.com/)) - [x] This PR has an associated up-to-date [design doc](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/README.md), is a design doc ([template](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/design/00000000_template.md)), or is sufficiently small to not require a design. <!-- Reference the design in the description. --> - [x] If this PR evolves [an existing `$T ⇔ Proto$T` mapping](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/command-and-response-binary-encoding.md) (possibly in a backwards-incompatible way), then it is tagged with a `T-proto` label. - [x] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label ([example](MaterializeInc/cloud#5021)). <!-- Ask in #team-cloud on Slack if you need help preparing the cloud PR. --> - [x] If this PR includes major [user-facing behavior changes](https://github.com/MaterializeInc/materialize/blob/main/doc/developer/guide-changes.md#what-changes-require-a-release-note), I have pinged the relevant PM to schedule a changelog post.
Simplified rollout triggers and CRD design doc
Motivation
Part of https://linear.app/materializeinc/issue/DEP-7/design-simplifying-upgrade-rollouts-node-rolls-converted-to-project
Tips for reviewer
Checklist
$T ⇔ Proto$Tmapping (possibly in a backwards-incompatible way), then it is tagged with aT-protolabel.