fix(kubernetes): harden ServiceAccount/ConfigMap/Job/Object reconcile + add lifecycle convergence tests#235

Closed
sam-goodwin wants to merge 1 commit into main from claude/harden-k8s-rest

@sam-goodwin
Contributor

Tighten the Kubernetes REST client and expose additional props on the resource wrappers. Out-of-band edits and transient API failures used to fall through requestJson as untyped errors; the reconciler now maps each to a typed error and reacts to it deterministically.

Reconciler changes

Typed errors per HTTP status, instead of the single KubernetesApiError catch-all:

404 -> KubernetesObjectNotFound   // idempotent delete swallows
409 -> KubernetesConflict         // retried on apply (SSA race), context-dependent on create
410 -> KubernetesGone             // non-retryable
401/403 -> KubernetesUnauthorized // non-retryable
5xx -> KubernetesServerError      // retried with exponential backoff
422/other -> KubernetesApiError   // generic, NOT auto-retried
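To make the mapping above concrete, here is a minimal, dependency-free sketch of a status classifier. The tag names are taken from the table; the plain-object shape, the `tagForStatus` helper, and the function body are assumptions for illustration, not the code in this PR (which may use classes or tagged errors instead).

```typescript
// Tags mirror the error names in the table above; the object shape is an assumption.
type KubernetesErrorTag =
  | "KubernetesObjectNotFound"
  | "KubernetesConflict"
  | "KubernetesGone"
  | "KubernetesUnauthorized"
  | "KubernetesServerError"
  | "KubernetesApiError";

interface ClassifiedKubernetesError {
  readonly _tag: KubernetesErrorTag;
  readonly status: number;
}

// Hypothetical helper: pick the narrowest tag for an HTTP status.
function tagForStatus(status: number): KubernetesErrorTag {
  if (status === 404) return "KubernetesObjectNotFound";
  if (status === 409) return "KubernetesConflict";
  if (status === 410) return "KubernetesGone";
  if (status === 401 || status === 403) return "KubernetesUnauthorized";
  if (status >= 500 && status <= 599) return "KubernetesServerError";
  // 422 and anything else stays generic and is not auto-retried.
  return "KubernetesApiError";
}

function classifyKubernetesStatus(status: number): ClassifiedKubernetesError {
  return { _tag: tagForStatus(status), status };
}
```

Keeping the classifier a pure function of the status code is what lets it be covered by unit tests without a cluster.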

Apply and delete now ride out transient races:

applyObject(...).pipe(
  Effect.retry({
    while: (e) => e instanceof KubernetesConflict || e instanceof KubernetesServerError,
    schedule: Schedule.exponential(Duration.millis(250)).pipe(
      Schedule.both(Schedule.recurs(5)),
    ),
  }),
);
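For a sense of what this retry policy costs in wall-clock time, the sketch below computes the delays in plain TypeScript. It assumes the exponential schedule doubles on every step and that recurs(5) allows up to five retries; `backoffDelays` and `isRetryableOnApply` are hypothetical names used only for illustration, and the predicate works on string tags rather than the real error classes.

```typescript
// Delay before each retry: baseMs * 2^attempt, for `retries` attempts.
// Assumes a doubling factor, which is not stated in the PR itself.
function backoffDelays(baseMs: number, retries: number): number[] {
  return Array.from({ length: retries }, (_, attempt) => baseMs * 2 ** attempt);
}

// Mirrors the `while` predicate above, using string tags instead of classes.
function isRetryableOnApply(tag: string): boolean {
  return tag === "KubernetesConflict" || tag === "KubernetesServerError";
}

const delays = backoffDelays(250, 5); // [250, 500, 1000, 2000, 4000]
const worstCaseWaitMs = delays.reduce((sum, d) => sum + d, 0); // 7750
```

Under those assumptions an apply that keeps hitting conflicts gives up after a little under eight seconds of waiting, plus the time spent in the requests themselves.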

deleteObject uses propagationPolicy: Background so dependents (Job-managed Pods) are reaped without blocking, and 404 is treated as the desired terminal state.
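The delete behaviour can be sketched as two small pure helpers. Both `buildDeleteUrl` and `isDeleteSettled` are hypothetical names, not functions from this PR, and the sketch assumes the policy is passed as the `propagationPolicy` query parameter (the Kubernetes API also accepts it in a DeleteOptions body; the PR does not say which form deleteObject uses).

```typescript
// Hypothetical helper: append the background propagation policy to a DELETE URL.
// Assumes objectPath has no existing query string.
function buildDeleteUrl(baseUrl: string, objectPath: string): string {
  return `${baseUrl}${objectPath}?propagationPolicy=Background`;
}

// Hypothetical helper: a delete is "settled" when the API accepted it (2xx)
// or the object is already absent (404), so repeated destroys are no-ops.
function isDeleteSettled(status: number): boolean {
  return (status >= 200 && status < 300) || status === 404;
}

const url = buildDeleteUrl(
  "https://k8s.example",
  "/apis/batch/v1/namespaces/default/jobs/migrate",
);
```

Treating 404 as success is what makes a double-destroy safe: the second call finds nothing to delete and still reports the desired end state.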

The client also stops sending application/apply-patch+yaml for non-PATCH verbs:

"Content-Type": isApplyPatch(method, body)
  ? "application/apply-patch+yaml"
  : "application/json",
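The PR does not show the body of isApplyPatch, so the following is only one plausible reading: a request counts as a server-side apply patch when the verb is PATCH and a body is present. `contentTypeFor` is a hypothetical wrapper added here to make the selection testable on its own.

```typescript
// Assumed predicate: the real isApplyPatch in the PR may check more than this.
function isApplyPatch(method: string, body: unknown): boolean {
  return method.toUpperCase() === "PATCH" && body !== undefined;
}

// Hypothetical wrapper around the ternary shown above.
function contentTypeFor(method: string, body: unknown): string {
  return isApplyPatch(method, body)
    ? "application/apply-patch+yaml"
    : "application/json";
}
```

With this shape, POST, PUT, and DELETE requests always go out as JSON, and only PATCH requests carrying a body use the apply-patch media type.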

Resource wrappers grew the props that practical use needs:

  • ConfigMap: binaryData, immutable
  • ServiceAccount: automountServiceAccountToken, imagePullSecrets
  • Job: parallelism, completions, backoffLimit, activeDeadlineSeconds, ttlSecondsAfterFinished, annotations

JSDoc on Object now spells out that custom resources work as long as their apiVersion/kind is registered in supportedKinds.

New lifecycle tests

Pure unit tests in packages/alchemy/test/Kubernetes/client.test.ts (24 cases). Integration coverage is deferred — there is no kind/minikube fixture in this repo yet, so per-resource lifecycle convergence (no-op redeploy, drift reconcile, OOB delete recovery, name-change replace, double-destroy) cannot be exercised end-to-end without standing one up.

  • classifyKubernetesStatus — every status maps to its narrowest typed error; 422 stays untyped (context-dependent, not auto-retried)
  • isKubernetesKindSupported — accepts the six canonical kinds, rejects unknown CRDs without throwing
  • buildKubernetesObjectPath — core vs grouped vs cluster-scoped; throws on missing namespace for namespaced kinds
  • chunkByApplyRank / sortRefsForDelete — Namespace -> SA -> CM -> Job on apply, exact reverse on delete
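The path-building and ordering rules in the list above can be illustrated with a short sketch. The `ObjectRef` shape, the `namespaced` and `plural` fields, and the `sortKindsForApply` / `sortKindsForDelete` helpers are assumptions made for this example; the real buildKubernetesObjectPath, chunkByApplyRank, and sortRefsForDelete signatures are not shown in the PR.

```typescript
// Hypothetical reference shape; the PR's real type is not shown.
interface ObjectRef {
  apiVersion: string; // "v1" for core, "group/version" otherwise
  kind: string;
  plural: string; // REST resource name, e.g. "configmaps"
  name: string;
  namespaced: boolean;
  namespace?: string;
}

function buildKubernetesObjectPath(ref: ObjectRef): string {
  // Core group ("v1") lives under /api; named groups ("batch/v1") under /apis.
  const root = ref.apiVersion.includes("/")
    ? `/apis/${ref.apiVersion}`
    : `/api/${ref.apiVersion}`;
  if (!ref.namespaced) return `${root}/${ref.plural}/${ref.name}`;
  if (!ref.namespace) {
    throw new Error(`${ref.kind}/${ref.name} requires a namespace`);
  }
  return `${root}/namespaces/${ref.namespace}/${ref.plural}/${ref.name}`;
}

// Apply order from the PR description: Namespace -> SA -> CM -> Job.
const APPLY_RANK: Record<string, number> = {
  Namespace: 0,
  ServiceAccount: 1,
  ConfigMap: 2,
  Job: 3,
};

// Unknown kinds sort last on apply (an assumption, not stated in the PR).
const rankOf = (kind: string): number =>
  APPLY_RANK[kind] ?? Number.MAX_SAFE_INTEGER;

function sortKindsForApply(kinds: string[]): string[] {
  return [...kinds].sort((a, b) => rankOf(a) - rankOf(b));
}

// Delete walks the same ranking in reverse so dependents go first.
function sortKindsForDelete(kinds: string[]): string[] {
  return sortKindsForApply(kinds).reverse();
}
```

Applying in rank order means a Namespace exists before anything is created inside it, and deleting in reverse removes Jobs before the ServiceAccounts and ConfigMaps they may reference.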

… + add lifecycle convergence tests

- Map HTTP status to narrow typed errors (NotFound/Conflict/Gone/Unauthorized/ServerError)
- Retry SSA Conflict + 5xx with exponential backoff on applyObject/deleteObject
- Idempotent delete with propagationPolicy: Background; 404 swallowed
- Send application/apply-patch+yaml only for PATCH; JSON otherwise
- ConfigMap.binaryData / .immutable; ServiceAccount.automountServiceAccountToken / .imagePullSecrets;
  Job.parallelism / .completions / .backoffLimit / .activeDeadlineSeconds / .ttlSecondsAfterFinished / .annotations
- Pure unit tests for classifier, kind support, path builder, apply/delete ordering

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@alchemy-version-bot
Contributor

Install the packages built from this commit:

alchemy

bun add alchemy@https://pkg.ing/alchemy/93b96fb

@alchemy.run/better-auth

bun add @alchemy.run/better-auth@https://pkg.ing/@alchemy.run/better-auth/93b96fb

@alchemy.run/pr-package

bun add @alchemy.run/pr-package@https://pkg.ing/@alchemy.run/pr-package/93b96fb

@alchemy-version-bot
Contributor

alchemy-version-bot Bot commented May 5, 2026

Website Preview Deployed

URL: https://alchemyeffectwebsite-worker-pr-235-wfg5itvz6426xl4d.testing-2b2.workers.dev

Built from commit 93b96fb.


This comment updates automatically with each push.

@sam-goodwin
Contributor Author

Superseded by #249 (consolidated hardening sweep). Closing — the equivalent commit landed on claude/harden-all.

@sam-goodwin sam-goodwin closed this May 6, 2026