fix(aws/rds): harden DBCluster reconcile + add lifecycle convergence tests #224
Draft
sam-goodwin wants to merge 1 commit into main
…tests

Replace the unconditional modify pattern with observed-vs-desired diffing. Wait for stable cluster status before mutating, retry InvalidDBClusterStateFault only in scoped polling contexts, and wait for deletion to converge so replaces don't race against `deleting`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Install the packages built from this commit:

- alchemy: `bun add alchemy@https://pkg.ing/alchemy/b147f49`
- @alchemy.run/better-auth: `bun add @alchemy.run/better-auth@https://pkg.ing/@alchemy.run/better-auth/b147f49`
- @alchemy.run/pr-package: `bun add @alchemy.run/pr-package@https://pkg.ing/@alchemy.run/pr-package/b147f49`

Website Preview Deployed

URL: https://alchemyeffectwebsite-worker-pr-224-zdaotegsh7bknl4j.testing-2b2.workers.dev

This comment updates automatically with each push.
Hardens the AWS RDS `DBCluster` reconciler in line with the per-resource hardening sweep started by #184 and tracked by sibling PR #201 for `DBInstance`. Cluster control-plane calls fail with `InvalidDBClusterStateFault` mid-transition; the previous reconciler issued `modifyDBCluster` blindly, didn't wait for stable status, and surfaced transient state-machine races as terminal errors.

## Reconciler changes
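As a rough illustration of replacing a blind `modifyDBCluster` call with observed-vs-desired diffing, here is a minimal sketch. The `ClusterSettings` shape and the `changed` and `diffClusterSettings` helpers are hypothetical names invented for this example; this is not the PR's `computeModifyPayload`, and it only covers scalar fields (list-valued fields such as security groups would need a set comparison).

```typescript
// Hypothetical subset of mutable cluster settings, for illustration only.
interface ClusterSettings {
  engineVersion?: string;
  port?: number;
  backupRetentionPeriod?: number;
  deletionProtection?: boolean;
}

// True when a desired value is specified and differs from the observed one.
function changed<T>(observed: T | undefined, desired: T | undefined): desired is T {
  return desired !== undefined && desired !== observed;
}

// Builds a payload containing only the fields that actually differ.
// Returns undefined on a clean no-op so the caller can skip the modify call.
function diffClusterSettings(
  observed: ClusterSettings,
  desired: ClusterSettings,
): ClusterSettings | undefined {
  const payload: ClusterSettings = {};
  if (changed(observed.engineVersion, desired.engineVersion)) {
    payload.engineVersion = desired.engineVersion;
  }
  if (changed(observed.port, desired.port)) {
    payload.port = desired.port;
  }
  if (changed(observed.backupRetentionPeriod, desired.backupRetentionPeriod)) {
    payload.backupRetentionPeriod = desired.backupRetentionPeriod;
  }
  if (changed(observed.deletionProtection, desired.deletionProtection)) {
    payload.deletionProtection = desired.deletionProtection;
  }
  return Object.keys(payload).length > 0 ? payload : undefined;
}
```

In this sketch, fields left unset in `desired` are treated as "don't care" rather than "reset to default"; whether the real reconciler makes the same choice is not stated in this PR.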
```diff
 delete: Effect.fn(function* ({ output }) {
   yield* rds.deleteDBCluster({ ... }).pipe(
+    Effect.retry({
+      while: (e) => e._tag === "InvalidDBClusterStateFault",
+      schedule: controlPlaneRetryPolicy,
+    }),
     Effect.catchTag("DBClusterNotFoundFault", () => Effect.void),
   );
+  yield* waitForClusterDeleted(output.dbClusterIdentifier);
 }),
```

`InvalidDBClusterStateFault` is treated as retryable only in scoped contexts where we know we're polling a transitioning resource. It is never globally tagged retryable, since it's context-dependent (writer being modified vs. genuine `ConflictError`).

- `backupRetentionPeriod`, `preferredBackupWindow`, `preferredMaintenanceWindow`. Existing `deletionProtection` and `copyTagsToSnapshot` are now reflected in attrs.
- `computeModifyPayload` diffs each mutable field (engine version, parameter group, security groups, port, IAM/HTTP endpoint, serverless v2 scaling, backup window/retention, deletion protection, copy-tags-to-snapshot) and returns `undefined` on a clean no-op so the modify call is skipped entirely.
- … `output.tags`) so adoption converges.
- `waitForClusterDeleted` polls until RDS drops the cluster, so a subsequent reconcile or replace doesn't race against `deleting`.

## New lifecycle tests
`packages/alchemy/test/AWS/RDS/DBCluster.test.ts` runs `destroy → deploy → ... → destroy` on `ScratchStack` and asserts convergence at every step. All tests are `test.provider.skip` because Aurora cluster create is 5–15 minutes per test; they're intended to be unskipped by hand against an isolated test account.

- `backupRetentionPeriod` / `preferredBackupWindow` / `copyTagsToSnapshot` / internal alchemy tags mutated out-of-band
- `dbClusterIdentifier` triggers replace; old cluster is deleted
- `engineVersion` (minor)
- `adopt(true)` re-tags a foreign cluster

## Distilled patch
No patch needed: the existing distilled error tagging is sufficient. `InvalidDBClusterStateFault` is intentionally not marked globally retryable since it's context-dependent (only the reconciler knows whether the transition is one it can ride out), and `InsufficientDBClusterCapacityFault` recovery takes minutes to hours, which exceeds the default retry budget.
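To make the "scoped, not global" retry idea concrete, here is a minimal sketch in plain async TypeScript rather than Effect. Everything in it is an assumption made for illustration: the `ClusterApi` interface, the `fault`, `hasTag`, `retryWhile`, and `deleteClusterAndWait` helpers, and the fixed-delay schedule are hypothetical and do not reflect the PR's `controlPlaneRetryPolicy` or `waitForClusterDeleted`. Only the two error tags come from the diff in this PR.

```typescript
// Hypothetical helper that creates an error carrying a `_tag` discriminant.
const fault = (tag: string): Error & { _tag: string } =>
  Object.assign(new Error(tag), { _tag: tag });

// Structural check so any error-like value with a matching `_tag` qualifies.
const hasTag = (e: unknown, tag: string): boolean =>
  typeof e === "object" && e !== null && (e as { _tag?: unknown })._tag === tag;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical client surface standing in for the RDS control-plane calls.
interface ClusterApi {
  deleteCluster(id: string): Promise<void>;
  describeCluster(id: string): Promise<{ status: string }>;
}

// Retries `op` only while the caller-supplied predicate accepts the error,
// for at most `maxAttempts` tries. The predicate is what keeps the retry scoped.
async function retryWhile<T>(
  op: () => Promise<T>,
  shouldRetry: (e: unknown) => boolean,
  maxAttempts: number,
  delayMs: number,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op();
    } catch (e) {
      if (attempt >= maxAttempts || !shouldRetry(e)) throw e;
      await sleep(delayMs);
    }
  }
}

async function deleteClusterAndWait(
  api: ClusterApi,
  id: string,
  maxAttempts = 10,
  delayMs = 1000,
): Promise<void> {
  try {
    // The state fault is retried here only, because a delete can ride out a transition.
    await retryWhile(
      () => api.deleteCluster(id),
      (e) => hasTag(e, "InvalidDBClusterStateFault"),
      maxAttempts,
      delayMs,
    );
  } catch (e) {
    // An already-missing cluster counts as deleted; anything else propagates.
    if (!hasTag(e, "DBClusterNotFoundFault")) throw e;
    return;
  }
  // Poll until describe reports that the cluster is gone.
  for (let poll = 0; poll < maxAttempts; poll++) {
    try {
      await api.describeCluster(id);
    } catch (e) {
      if (hasTag(e, "DBClusterNotFoundFault")) return;
      throw e;
    }
    await sleep(delayMs);
  }
  throw new Error(`Timed out waiting for cluster ${id} to be deleted`);
}
```

Because the predicate is passed at the call site, the same fault stays terminal everywhere else, which is the property the paragraph above argues for. A fixed delay with a bounded attempt count is the simplest schedule to show; a real policy would more likely use backoff.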