fix: [BUG] no partition of relation "v1_payload" found for row #2803 by Akhilesh29 · Pull Request #3328 · hatchet-dev/hatchet

Akhilesh29 · 2026-03-19T10:58:08Z

Description

When replaying a task whose inserted_at falls outside all existing v1_payload partition boundaries (typically because the task is older than the retention window), Postgres raises SQLSTATE 23514 (no partition of relation "v1_payload" found for row). Previously this error propagated back to the RabbitMQ consumer, which treated it as a transient failure and retried indefinitely, flooding the logs.

This fix catches the partition error at two points in the replay path and skips gracefully with a warning log instead of returning an error that triggers endless retries.

Type of change

Bug fix (non-breaking change which fixes an issue)

What's Changed

In pkg/repository/v1/task.go: catch SQLSTATE 23514 in replayTasks() after payloadStore.Store, then log a warning and continue instead of returning the error.
In internal/services/controllers/v1/task_controller.go: catch SQLSTATE 23514 in handleReplayTasks() after ReplayTasks(), then log a warning and continue instead of returning the error, which previously triggered infinite RabbitMQ retries.
Add an isPostgresPartitionError helper to both files to detect the partition constraint violation by its SQLSTATE code.

Note

Medium Risk
Changes error handling in the task replay path to swallow specific Postgres partition failures; incorrect detection or overly-broad matching could hide real replay errors or skip legitimate replays. Also introduces new imports/helpers that must compile correctly across packages.

Overview
Prevents infinite RabbitMQ retries when replaying tasks that fall outside existing v1_payload partitions by catching Postgres 23514 partition errors and skipping replay with a warning.

This adds isPostgresPartitionError checks in both the task controller (handleReplayTasks) and repository replay flow (after payloadStore.Store) so old tasks that can’t be persisted due to missing partitions don’t bubble up as consumer errors.

Written by Cursor Bugbot for commit 773e53c. This will update automatically on new commits.


vercel · 2026-03-19T10:58:21Z

@Akhilesh29 is attempting to deploy a commit to the Hatchet Team on Vercel.

A member of the Team first needs to authorize it.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.


cursor · 2026-03-19T11:04:35Z

pkg/repository/task.go

+        continue
+    }
+    return nil, fmt.Errorf("failed to store payloads for step id %s: %w", stepId, err)
+}


Replay modifies task state before partition error is caught

High Severity

When payloadStore.Store fails with a partition error, the continue skips storing the payload, but r.queries.ReplayTasks on line 2520 has already UPDATEd the v1_task rows in the same transaction (incrementing retry_count, resetting initial_state, etc.). The transaction still commits at line 3414, and the affected tasks remain in the outer replayedTasks list (built at line 3079 before replayTasks is called). This leaves tasks in an inconsistent state: their DB state is modified as "replayed" but no payload exists, and they are signaled to the controller as successfully replayed.

Additional Locations (1)

pkg/repository/task.go#L3413-L3422

Bugbot Autofix determined this is a false positive.

Current replayTasks returns the payload-store error immediately so the enclosing transaction rolls back and no replayed task state is committed without payloads.


pkg/repository/task.go

Akhilesh29 added 2 commits March 19, 2026 16:08

fix: [BUG] no partition of relation "v1_payload" found for row hatche…

d669748

…t-dev#2803

fix: [BUG] no partition of relation "v1_payload" found for row hatche…

773e53c

…t-dev#2803

cursor bot reviewed Mar 19, 2026

Akhilesh29 wants to merge 2 commits into hatchet-dev:main from Akhilesh29:main
