DRPC API Extension: Add TestFailover action#2416

Draft
am-agrawa wants to merge 7 commits intoRamenDR:mainfrom
am-agrawa:7937-add-drpc-new-action

@am-agrawa
Member

No description provided.

@am-agrawa am-agrawa marked this pull request as draft February 11, 2026 13:13
@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch from dbe6a08 to 0b2ae9a Compare February 11, 2026 13:30
@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch 3 times, most recently from 577b2df to dcaf8c5 Compare February 12, 2026 10:21
}
}

func (d *DRPCInstance) convertStateForTestIfNeeded(nextState rmn.DRState) rmn.DRState {
Member

Change the function name to something like adjustPhaseIfTestFailover.

Member Author

done

return nextState
}

func getTestFailoverPhase(nextState rmn.DRState) rmn.DRState {
Member

And how about this one? I would change it to mapPhaseForTestFailover.
That way you get:
adjustPhaseIfTestFailover --> mapPhaseForTestFailover
The first is conditional; the second simply performs the mapping.

Member Author

done
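
To illustrate the split suggested in this thread, here is a minimal Go sketch of a conditional wrapper that delegates to an unconditional mapper. It is not the PR's code: `DRState` is a local stand-in for `rmn.DRState`, the string values and the exact mapping are assumptions, and the wrapper takes an explicit `isTest` flag where the real one is a method on `DRPCInstance`.

```go
package main

import "fmt"

// DRState is a local stand-in for rmn.DRState (assumption).
type DRState string

// Constant names follow the PR discussion; the string values are assumed.
const (
	FailingOver    DRState = "FailingOver"
	FailedOver     DRState = "FailedOver"
	TestFailover   DRState = "TestFailover"
	TestFailedOver DRState = "TestFailedOver"
	Relocating     DRState = "Relocating"
)

// mapPhaseForTestFailover always applies the mapping. States without a
// test counterpart are returned unchanged (assumed behavior).
func mapPhaseForTestFailover(next DRState) DRState {
	switch next {
	case FailingOver:
		return TestFailover
	case FailedOver:
		return TestFailedOver
	default:
		return next
	}
}

// adjustPhaseIfTestFailover maps the state only when a test failover is
// in progress; otherwise it passes the state through untouched.
func adjustPhaseIfTestFailover(isTest bool, next DRState) DRState {
	if !isTest {
		return next
	}

	return mapPhaseForTestFailover(next)
}

func main() {
	fmt.Println(adjustPhaseIfTestFailover(true, FailingOver))  // TestFailover
	fmt.Println(adjustPhaseIfTestFailover(false, FailingOver)) // FailingOver
}
```

The names carry the contract: "adjust ... If" signals a guard, while "map ... For" signals that the caller has already decided the mapping applies.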

@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch 2 times, most recently from 24c4987 to 660c4b6 Compare February 16, 2026 11:26
ProgressionDeleting = ProgressionStatus("Deleting")
ProgressionDeleted = ProgressionStatus("Deleted")
ProgressionActionPaused = ProgressionStatus("Paused")
ProgressionTestFailover = ProgressionStatus("TestingFailover")
Member

Keep the same naming pattern. Progression constants are named Progression + progression name,
so use ProgressionTestingFailover instead of ProgressionTestFailover.

Member Author

done

@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch 2 times, most recently from 7df83e6 to 4e1c0d8 Compare February 17, 2026 19:11
@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch from 4e1c0d8 to 74dd6ab Compare February 18, 2026 07:59
@BenamarMk
Member

BenamarMk commented Feb 22, 2026

@am-agrawa I pushed two commits. One fixes a bug I ran into, and the other adds support for the TestFailover action, including abort handling for Initial Deploy, Failover, and Relocate.

I tested the following order:

  1. Initial Deployment
  2. TestFailover
  3. Abort
  4. Failover
  5. TestFailover
  6. Abort
  7. Relocate
  8. TestFailover
  9. Abort
  10. Failover
  11. Failover
  12. TestFailover
  13. Abort

All of them return to the previous action cleanly after an Abort of the test, with no issues observed.
I turned off a few linter errors. We'll fix them later.

Introduce non-destructive TestFailover action to verify secondary cluster
readiness without committing to failover.

- Add VRGActionTestFailover and update CRD enums/YAML
- Implement placement logic, cleanup, and action execution refactor
- Exclude test primaries from multi-primary checks
- Restore original placement decisions after test failover
- Treat TestFailover like Failover for resync and VolSync restore
- Skip LastAppDeploymentCluster updates during test failover
- Improve comments, readability, and lint compliance

Signed-off-by: Benamar Mekhissi <[email protected]>
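
The "restore original placement decisions" item in the commit message above can be pictured with a small Go sketch. Everything in it is hypothetical: the `placement` type, its fields, and the method names are illustrative stand-ins and do not mirror Ramen's actual placement handling.

```go
package main

import "fmt"

// placement is a hypothetical stand-in for a placement decision holder.
type placement struct {
	decision string // cluster currently selected
	saved    string // decision recorded when the test started
	inTest   bool   // true while a test failover is active
}

// startTestFailover records the current decision once, then points the
// placement at the test target.
func (p *placement) startTestFailover(target string) {
	if !p.inTest {
		p.saved = p.decision
		p.inTest = true
	}

	p.decision = target
}

// abortTestFailover restores the decision recorded at test start.
// It is a no-op when no test failover is active.
func (p *placement) abortTestFailover() {
	if !p.inTest {
		return
	}

	p.decision = p.saved
	p.saved = ""
	p.inTest = false
}

func main() {
	p := &placement{decision: "cluster-a"}
	p.startTestFailover("cluster-b")
	fmt.Println(p.decision) // cluster-b
	p.abortTestFailover()
	fmt.Println(p.decision) // cluster-a
}
```

Saving the decision only on the first call keeps a repeated reconcile during the test from overwriting the original with the test target.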
@BenamarMk BenamarMk force-pushed the 7937-add-drpc-new-action branch from 0b09a0a to bb49337 Compare February 23, 2026 21:59
@am-agrawa am-agrawa force-pushed the 7937-add-drpc-new-action branch from bb49337 to 0b09a0a Compare February 24, 2026 07:54
@BenamarMk BenamarMk force-pushed the 7937-add-drpc-new-action branch from 3e3f99c to bb49337 Compare February 24, 2026 13:35
Replace the separate ActionTestFailover action type with a simpler attribute-based
approach using a DryRun boolean field. This cleaner design separates concerns:
- Action Failover indicates the operation to perform
- DryRun boolean indicates if the operation should be non-destructive/test mode
- Progression status (TestingFailover) continues to indicate test mode

Changes:
- Remove ActionTestFailover from DRAction and VRGAction enums
- Remove TestFailover and TestFailedOver DRState constants
- Add DryRun field to DRPlacementControlSpec and VolumeReplicationGroupSpec
- Update all references to ActionTestFailover to check DryRun flag instead

The progression status ProgressionTestingFailover is retained as it provides
a unified indicator of test mode across both DRPC and VRG resources.
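
As a rough illustration of the attribute-based design described in this commit message, the following Go sketch replaces a check against a dedicated action with a check on `Action` plus `DryRun`. The `drpcSpec` struct is a trimmed, hypothetical stand-in for `DRPlacementControlSpec`, `isTestFailover` is an illustrative helper name, and treating `DryRun` as meaningful only together with `Failover` is an assumption.

```go
package main

import "fmt"

// DRAction is a local stand-in for the DRPC action type (assumption).
type DRAction string

const (
	ActionFailover DRAction = "Failover"
	ActionRelocate DRAction = "Relocate"
)

// drpcSpec is a trimmed, hypothetical stand-in for DRPlacementControlSpec.
type drpcSpec struct {
	Action DRAction // the operation to perform
	DryRun bool     // true requests non-destructive test mode
}

// isTestFailover reports whether the spec asks for a test failover.
// Assumption: DryRun only has meaning when Action is Failover.
func isTestFailover(s drpcSpec) bool {
	return s.Action == ActionFailover && s.DryRun
}

func main() {
	fmt.Println(isTestFailover(drpcSpec{Action: ActionFailover, DryRun: true}))  // true
	fmt.Println(isTestFailover(drpcSpec{Action: ActionFailover, DryRun: false})) // false
	fmt.Println(isTestFailover(drpcSpec{Action: ActionRelocate, DryRun: true}))  // false
}
```

With this shape the action enum stays unchanged and every former comparison against the removed action becomes a single predicate call.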
@BenamarMk BenamarMk force-pushed the 7937-add-drpc-new-action branch from bb49337 to db88f3c Compare February 24, 2026 13:38
- Upload continues from the primary managed cluster to the S3 stores on both sides
- Upload from the failoverCluster to both S3 stores stops when the action is Failover and dryRun is set to true

Signed-off-by: Aman Agrawal <[email protected]>
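
The upload rule in the commit message above could be expressed as a single predicate. This Go sketch is illustrative only: the function name, its parameters, and the plain-string action are assumptions, and it presumes the caller is a cluster that would otherwise upload.

```go
package main

import "fmt"

// shouldUploadToS3 reports whether the given cluster keeps uploading to
// the S3 stores. Hypothetical helper: only the failover cluster of a
// dry-run Failover stops uploading; every other caller continues.
func shouldUploadToS3(cluster, failoverCluster, action string, dryRun bool) bool {
	if action == "Failover" && dryRun && cluster == failoverCluster {
		return false
	}

	return true
}

func main() {
	// The primary keeps uploading during a dry-run failover.
	fmt.Println(shouldUploadToS3("cluster-a", "cluster-b", "Failover", true)) // true
	// The failover cluster stops uploading during a dry-run failover.
	fmt.Println(shouldUploadToS3("cluster-b", "cluster-b", "Failover", true)) // false
	// A real failover (dryRun false) is not restricted by this rule.
	fmt.Println(shouldUploadToS3("cluster-b", "cluster-b", "Failover", false)) // true
}
```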

2 participants