Docs--bulk-retries-and-cancellations#3275

Open
grutt wants to merge 8 commits into main from docs--bulk-retries-and-cancellations
@grutt (Contributor) commented Mar 13, 2026

Description

Docs revisions for Cancellations and Replays

Type of change

  • Documentation change (pure documentation change)

Copilot AI review requested due to automatic review settings March 13, 2026 21:21
vercel bot commented Mar 13, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
hatchet-docs | Ready | Preview, Comment | Mar 17, 2026 10:57pm

Copilot AI left a comment

Pull request overview

Updates Hatchet V1 documentation and SDK examples to better explain manual replays (bulk replays) and cancellations, including new “Replays” docs and refreshed snippet-backed examples across languages.

Changes:

  • Adds a new /v1/replays docs page and updates nav/search expectations to point to it.
  • Expands /v1/cancellation docs to cover cancellation triggers, handling patterns, and bulk cancellation.
  • Adds/updates multi-language example snippets for cancelling and bulk replay/cancel operations; removes older duplicate/hidden-section doc pages and adds a redirect from the old bulk page.

Reviewed changes

Copilot reviewed 51 out of 51 changed files in this pull request and generated 4 comments.

File Description
sdks/typescript/src/v1/examples/cancellation/run.ts Updates TS snippet source for cancelling a run (used by docs snippet generator).
sdks/typescript/src/v1/examples/bulk_operations/replay.ts Adds TS snippet source for bulk replays (by IDs / by filters).
sdks/typescript/src/v1/examples/bulk_operations/cancel.ts Adds TS snippet source for bulk cancellations (by IDs / by filters).
sdks/ruby/examples/cancellation/trigger.rb Adds Ruby snippet source for cancelling a run by ID.
sdks/ruby/examples/bulk_operations/replay.rb Adds Ruby snippet source for bulk replays.
sdks/python/examples/cancellation/worker.py Adds Python snippet source for async cancellation handling; adjusts cancellation examples.
sdks/python/examples/cancellation/trigger.py Updates Python snippet source for cancelling a run by ID.
sdks/python/examples/bulk_operations/cancel.py Updates Python snippet source for async bulk cancel example sections.
sdks/go/examples/cancellations/main.go Updates Go snippet source labeling/markers for cancellation docs.
sdks/go/examples/bulk-operations/replay.go Adds Go snippet source for bulk replay (IDs + filters).
sdks/go/examples/bulk-operations/cancel.go Adds Go snippet source for bulk cancel (IDs + filters).
frontend/docs/scripts/test-search-quality.ts Updates search-quality expectations to rank the new /v1/replays page.
frontend/docs/pages/v1/replays.mdx Adds new Replays documentation page (Dashboard + bulk replay via SDK/API).
frontend/docs/pages/v1/observability/worker-healthchecks.mdx Removes old hidden/duplicate observability page (content exists in top-level v1 pages).
frontend/docs/pages/v1/observability/prometheus-metrics.mdx Removes old hidden/duplicate observability page (content exists in top-level v1 pages).
frontend/docs/pages/v1/observability/opentelemetry.mdx Removes old hidden/duplicate observability page (content exists in top-level v1 pages).
frontend/docs/pages/v1/observability/logging.mdx Removes old hidden/duplicate observability page (content exists in top-level v1 pages).
frontend/docs/pages/v1/observability/index.mdx Removes old hidden/duplicate index page.
frontend/docs/pages/v1/observability/additional-metadata.mdx Removes old hidden/duplicate additional-metadata page (top-level page remains).
frontend/docs/pages/v1/observability/_meta.js Removes hidden-section meta config (no longer needed).
frontend/docs/pages/v1/flow-control/rate-limits.mdx Removes old hidden/duplicate flow-control page (top-level page remains).
frontend/docs/pages/v1/flow-control/priority.mdx Removes old hidden/duplicate flow-control page (top-level page remains).
frontend/docs/pages/v1/flow-control/index.mdx Removes old hidden/duplicate index page.
frontend/docs/pages/v1/flow-control/concurrency.mdx Removes old hidden/duplicate flow-control page (top-level page remains).
frontend/docs/pages/v1/flow-control/_meta.js Removes hidden-section meta config (no longer needed).
frontend/docs/pages/v1/error-handling/timeouts.mdx Removes old hidden/duplicate reliability page (top-level page remains).
frontend/docs/pages/v1/error-handling/retry-policies.mdx Removes old hidden/duplicate reliability page (top-level page remains).
frontend/docs/pages/v1/error-handling/index.mdx Removes old hidden/duplicate index page.
frontend/docs/pages/v1/error-handling/cancellation.mdx Removes old hidden/duplicate cancellation page (top-level page updated instead).
frontend/docs/pages/v1/error-handling/bulk-retries-and-cancellations.mdx Removes old bulk retries/cancellations page (replaced by Replays + Cancellation bulk section).
frontend/docs/pages/v1/error-handling/_meta.js Removes hidden-section meta config (no longer needed).
frontend/docs/pages/v1/cancellation.mdx Expands cancellation docs (triggering, handling, and bulk cancellation) and updates snippet references.
frontend/docs/pages/v1/bulk-retries-and-cancellations.mdx Removes old top-level bulk retries/cancellations page (redirect added).
frontend/docs/pages/v1/advanced-tasks/streaming.mdx Removes old hidden/duplicate advanced-tasks page (top-level streaming page remains).
frontend/docs/pages/v1/advanced-tasks/index.mdx Removes old hidden/duplicate index page.
frontend/docs/pages/v1/advanced-tasks/cancellation.mdx Removes old hidden/duplicate cancellation page (top-level cancellation page updated instead).
frontend/docs/pages/v1/advanced-tasks/additional-metadata.mdx Removes old hidden/duplicate additional-metadata page (top-level page remains).
frontend/docs/pages/v1/advanced-tasks/_meta.js Removes hidden-section meta config (no longer needed).
frontend/docs/pages/v1/_meta.js Updates V1 nav labels and replaces bulk retries/cancellations entry with Replays.
frontend/docs/next.config.mjs Adds redirect from /v1/bulk-retries-and-cancellations to /v1/replays.
examples/typescript/cancellation/run.ts Updates user-facing TS example for cancelling a run.
examples/typescript/bulk_operations/replay.ts Adds user-facing TS example for bulk replay.
examples/typescript/bulk_operations/cancel.ts Adds user-facing TS example for bulk cancel.
examples/ruby/cancellation/trigger.rb Adds user-facing Ruby cancellation example.
examples/ruby/bulk_operations/replay.rb Adds user-facing Ruby bulk replay example.
examples/python/cancellation/worker.py Updates user-facing Python worker example to include async cancellation handling.
examples/python/cancellation/trigger.py Updates user-facing Python trigger example for cancelling a run.
examples/python/bulk_operations/cancel.py Updates user-facing Python bulk cancel example (now uses async APIs).
examples/go/cancellations/main.go Updates user-facing Go cancellation example labeling.
examples/go/bulk-operations/replay.go Adds user-facing Go bulk replay example.
examples/go/bulk-operations/cancel.go Adds user-facing Go bulk cancel example.
Comments suppressed due to low confidence (2)

sdks/typescript/src/v1/examples/cancellation/run.ts:44

  • `since: new Date(Date.now() - 60 * 60)` subtracts only 3.6 seconds, because `Date.now()` is in milliseconds. If the intent is "last hour", subtract milliseconds (e.g., `60 * 60 * 1000`) or use a proper duration helper.

examples/typescript/cancellation/run.ts:22

  • After cancelling `run`, `await run.output` will reject (cancellation is surfaced as an error in the result stream). As written, this example will throw and exit before printing the successful run output; handle the cancellation case explicitly (try/catch) or avoid awaiting the cancelled run’s output.

  const run1 = await cancellationWorkflow.runNoWait({});
  const res = await run.output;
  const res1 = await run1.output;

  console.log('canceled', res);
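
A minimal sketch of the two suggestions above, using only built-in TypeScript/JavaScript features. It is not the Hatchet SDK: `hoursAgo`, `cancelledOutput`, and `describeOutput` are hypothetical names introduced for illustration, and `cancelledOutput` only mimics an output promise that rejects after cancellation.

```typescript
// Milliseconds in one hour. Date.now() and the Date constructor both use milliseconds.
const HOUR_MS = 60 * 60 * 1000;

// Returns the instant `hours` hours before `now` (defaults to the current time).
function hoursAgo(hours: number, now: number = Date.now()): Date {
  return new Date(now - hours * HOUR_MS);
}

// Hypothetical stand-in (assumption, not an SDK call): an output promise that
// rejects, as the review says a cancelled run's output does.
function cancelledOutput(): Promise<string> {
  return Promise.reject(new Error("run was cancelled"));
}

// Awaits an output promise and reports a rejection instead of letting it throw.
async function describeOutput(output: Promise<string>): Promise<string> {
  try {
    const result = await output;
    return `completed: ${result}`;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return `cancelled: ${message}`;
  }
}

// Usage: the rejection is caught, so the script keeps running.
describeOutput(cancelledOutput()).then((line) => console.log(line));
```

With this shape, a `since` filter for the last hour could be written as `since: hoursAgo(1)`, and the example could still go on to print the output of the run that was not cancelled.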

@github-actions

Benchmark results

goos: linux
goarch: amd64
pkg: github.com/hatchet-dev/hatchet/internal/msgqueue/rabbitmq
cpu: AMD Ryzen 9 7950X3D 16-Core Processor          
                              │ /tmp/old.txt │            /tmp/new.txt            │
                              │    sec/op    │    sec/op     vs base              │
CompressPayloads_1x10KiB-8       75.93µ ± 2%   76.68µ ±  5%       ~ (p=0.240 n=6)
CompressPayloads_10x10KiB-8      894.1µ ± 3%   880.0µ ±  3%       ~ (p=0.093 n=6)
CompressPayloads_10x100KiB-8     10.72m ± 3%   10.68m ±  3%       ~ (p=0.589 n=6)
CompressPayloads_Concurrent-8    58.96µ ± 8%   60.62µ ± 20%       ~ (p=0.180 n=6)
geomean                          455.2µ        457.2µ        +0.44%

                              │ /tmp/old.txt │            /tmp/new.txt            │
                              │     B/op     │     B/op      vs base              │
CompressPayloads_1x10KiB-8      11.15Ki ± 1%   11.06Ki ± 1%       ~ (p=0.074 n=6)
CompressPayloads_10x10KiB-8     108.9Ki ± 1%   109.2Ki ± 2%       ~ (p=0.818 n=6)
CompressPayloads_10x100KiB-8    2.916Mi ± 0%   2.920Mi ± 0%       ~ (p=0.699 n=6)
CompressPayloads_Concurrent-8   54.27Ki ± 0%   54.27Ki ± 0%       ~ (p=0.818 n=6)
geomean                         118.4Ki        118.3Ki       -0.11%

                              │ /tmp/old.txt │            /tmp/new.txt            │
                              │  allocs/op   │ allocs/op   vs base                │
CompressPayloads_1x10KiB-8        5.000 ± 0%   5.000 ± 0%       ~ (p=1.000 n=6) ¹
CompressPayloads_10x10KiB-8       32.00 ± 0%   32.00 ± 0%       ~ (p=1.000 n=6) ¹
CompressPayloads_10x100KiB-8      63.00 ± 0%   63.00 ± 0%       ~ (p=1.000 n=6) ¹
CompressPayloads_Concurrent-8     17.00 ± 0%   17.00 ± 0%       ~ (p=1.000 n=6) ¹
geomean                           20.35        20.35       +0.00%
¹ all samples are equal

pkg: github.com/hatchet-dev/hatchet/internal/services/dispatcher
                  │ /tmp/new.txt │
                  │    sec/op    │
LockAcquisition-8   658.2n ± 90%

                  │ /tmp/new.txt │
                  │     B/op     │
LockAcquisition-8   334.5 ± 156%

                  │ /tmp/new.txt │
                  │  allocs/op   │
LockAcquisition-8    4.000 ± 50%

pkg: github.com/hatchet-dev/hatchet/pkg/scheduling/v1
              │ /tmp/old.txt │         /tmp/new.txt         │
              │    sec/op    │   sec/op     vs base         │
RateLimiter-8    48.87µ ± 3%   48.27µ ± 6%  ~ (p=0.818 n=6)

              │ /tmp/old.txt │         /tmp/new.txt          │
              │     B/op     │     B/op      vs base         │
RateLimiter-8   137.7Ki ± 0%   137.7Ki ± 0%  ~ (p=0.429 n=6)

              │ /tmp/old.txt │          /tmp/new.txt          │
              │  allocs/op   │  allocs/op   vs base           │
RateLimiter-8    1.022k ± 0%   1.022k ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

Compared against main (a07cc9a)

Copilot AI left a comment

Pull request overview

This PR updates the v1 documentation to reflect the new/renamed “Replays” page and expands cancellation + bulk cancel/replay examples across SDKs, while also adjusting docs navigation and redirects to match the new page structure.

Changes:

  • Add new v1 “Replays” doc page and update search-quality expectations accordingly.
  • Expand/adjust SDK examples for cancellation and bulk cancel/replay across TypeScript, Python, Go, and Ruby (both in sdks/* snippet sources and examples/*).
  • Restructure docs navigation and add a permanent redirect from /v1/bulk-retries-and-cancellations to /v1/replays.

Reviewed changes

Copilot reviewed 52 out of 52 changed files in this pull request and generated 10 comments.

File Description
sdks/typescript/src/v1/examples/cancellation/run.ts Update TS snippet source for cancelling a run (adds cancel-by-id example).
sdks/typescript/src/v1/examples/bulk_operations/replay.ts Add TS v1 snippet source for bulk replay (IDs + filters).
sdks/typescript/src/v1/examples/bulk_operations/cancel.ts Add TS v1 snippet source for bulk cancel (IDs + filters).
sdks/ruby/examples/cancellation/trigger.rb Add Ruby v1 snippet source for cancelling a run.
sdks/ruby/examples/bulk_operations/replay.rb Add Ruby v1 snippet source for bulk replay.
sdks/python/examples/cancellation/worker.py Add Python v1 snippet showing async cancellation handling.
sdks/python/examples/cancellation/trigger.py Update Python v1 snippet for cancelling a run.
sdks/python/examples/bulk_operations/cancel.py Refactor Python v1 bulk cancel snippet toward async APIs.
sdks/go/examples/cancellations/main.go Update Go v1 snippet markers for cancellation example.
sdks/go/examples/bulk-operations/replay.go Add Go v1 snippet source for bulk replay.
sdks/go/examples/bulk-operations/cancel.go Add Go v1 snippet source for bulk cancel.
frontend/docs/scripts/test-search-quality.ts Update search-quality expected pages to point to v1/replays.
frontend/docs/pages/v1/replays.mdx Add new Replays documentation page embedding multi-SDK snippets.
frontend/docs/pages/v1/observability/worker-healthchecks.mdx Remove old observability doc page.
frontend/docs/pages/v1/observability/prometheus-metrics.mdx Remove old observability doc page.
frontend/docs/pages/v1/observability/opentelemetry.mdx Remove old observability doc page.
frontend/docs/pages/v1/observability/logging.mdx Remove old observability doc page.
frontend/docs/pages/v1/observability/index.mdx Remove old observability index content.
frontend/docs/pages/v1/observability/additional-metadata.mdx Remove old additional-metadata doc page.
frontend/docs/pages/v1/observability/_meta.js Remove old observability meta config.
frontend/docs/pages/v1/flow-control/rate-limits.mdx Remove old flow-control doc page.
frontend/docs/pages/v1/flow-control/priority.mdx Remove old flow-control doc page.
frontend/docs/pages/v1/flow-control/index.mdx Remove old flow-control index content.
frontend/docs/pages/v1/flow-control/concurrency.mdx Remove old flow-control doc page.
frontend/docs/pages/v1/flow-control/_meta.js Remove old flow-control meta config.
frontend/docs/pages/v1/error-handling/timeouts.mdx Remove old reliability doc page.
frontend/docs/pages/v1/error-handling/retry-policies.mdx Remove old reliability doc page.
frontend/docs/pages/v1/error-handling/index.mdx Remove old reliability index content.
frontend/docs/pages/v1/error-handling/cancellation.mdx Remove old cancellation doc page (under previous hierarchy).
frontend/docs/pages/v1/error-handling/bulk-retries-and-cancellations.mdx Remove old bulk-retries-and-cancellations doc page (under previous hierarchy).
frontend/docs/pages/v1/error-handling/_meta.js Remove old reliability meta config.
frontend/docs/pages/v1/cancellation.mdx Expand v1 cancellation doc page (handling + bulk cancellation sections).
frontend/docs/pages/v1/bulk-retries-and-cancellations.mdx Remove old top-level bulk-retries-and-cancellations doc page.
frontend/docs/pages/v1/advanced-tasks/streaming.mdx Remove old advanced-tasks doc page.
frontend/docs/pages/v1/advanced-tasks/index.mdx Remove old advanced-tasks index content.
frontend/docs/pages/v1/advanced-tasks/cancellation.mdx Remove old advanced-tasks cancellation page.
frontend/docs/pages/v1/advanced-tasks/additional-metadata.mdx Remove old advanced-tasks additional-metadata page.
frontend/docs/pages/v1/advanced-tasks/_meta.js Remove old advanced-tasks meta config.
frontend/docs/pages/v1/_meta.js Update v1 docs nav: rename retries label and add Replays entry; remove hidden section stubs.
frontend/docs/next.config.mjs Add redirect from old bulk retries/cancellations URL to /v1/replays.
examples/typescript/cancellation/run.ts Update TS examples mirror for cancelling a run.
examples/typescript/bulk_operations/replay.ts Add TS examples mirror for bulk replay.
examples/typescript/bulk_operations/cancel.ts Add TS examples mirror for bulk cancel.
examples/ruby/cancellation/trigger.rb Add Ruby examples mirror for cancelling a run.
examples/ruby/bulk_operations/replay.rb Add Ruby examples mirror for bulk replay.
examples/python/cancellation/worker.py Add Python examples mirror for async cancellation handling.
examples/python/cancellation/trigger.py Update Python examples mirror for cancelling a run.
examples/python/bulk_operations/cancel.py Refactor Python examples mirror toward async bulk cancel APIs.
examples/go/cancellations/main.go Update Go examples mirror cancellation header/comments.
examples/go/bulk-operations/replay.go Add Go examples mirror for bulk replay.
examples/go/bulk-operations/cancel.go Add Go examples mirror for bulk cancel.
.github/instructions/docs.instructions.md Add docs style guide for MDX authoring conventions.

Comment on lines +35 to +45
First, we'll start by fetching a task.

<Snippet src={snippets.python.bulk_operations.replay.setup} />

Now that we have a task, we'll get runs for it, so that we can use them to bulk replay by run id.

<Snippet src={snippets.python.bulk_operations.replay.list_runs} />

And finally, we can replay the runs in bulk.

<Snippet src={snippets.python.bulk_operations.replay.replay_by_run_ids} />
Comment on lines 13 to +17

// Or cancel by run ID via the runs client
const id = await run.runId;
await hatchet.runs.cancel({ ids: [id] });
// !!
Comment on lines +1 to +3
async def main() -> None:

from datetime import datetime, timedelta, timezone
# > Setup
Comment on lines 74 to +94
@@ -93,6 +91,7 @@ func main() {
log.Printf("failed to cancel workflow: %v", err)
}
}()
// !!
Comment on lines +10 to +23
## Triggering Cancellations

## Cancellation Mechanisms
Cancellations can be triggered in several ways:

- **SDK or REST API** - cancel a run by its ID using the runs client.
- **Dashboard** - cancel individual runs from the runs list toolbar or from within a specific run.
- **Concurrency strategies** - strategies like [`CANCEL_IN_PROGRESS`](./concurrency.mdx#cancel-in-progress) automatically cancel running tasks to free up slots when the concurrency limit is reached.
{/* TODO-DOCS: ADD VIDEO OF THE DASHBOARD VIEW */}

### Cancelling via the SDK

Cancel a run by passing its ID to the runs client, or (in TypeScript) by calling `cancel()` directly on the run reference:

<UniversalTabs items={["Python", "Typescript", "Go", "Ruby"]}>
Comment on lines +17 to +25
## Replaying from the Dashboard

{/* TODO-DOCS: Add a video of the dashboard view */}

Select runs from the runs list and click "Replay" in the toolbar, or open a specific run to inspect it and replay from there. This is useful when you want to wait for an external dependency to recover, or when you need to deploy a fix before retrying affected runs.

## Programmatic Bulk Replays

You can replay many runs at once via the SDKs and REST API. There are two approaches:
Comment on lines +1 to +3
async def main() -> None:

from datetime import datetime, timedelta, timezone
# > Setup
Comment on lines +52 to 80
For async tasks, Hatchet cancels the underlying `asyncio.Task` via the event loop. An `asyncio.CancelledError` is raised at the next `await` point, so your task is cancelled automatically. Wrap `await` calls in a `try/except asyncio.CancelledError` block if you need to run cleanup logic.

<Snippet
src={snippets.python.cancellation.worker.checking_exit_flag}
src={snippets.python.cancellation.worker.async_cancellation}
/>

For synchronous tasks, the worker cannot interrupt the thread directly. Poll `ctx.exit_flag` inside your loop to detect cancellation cooperatively.

<Snippet
src={snippets.python.cancellation.worker.checking_exit_flag}
/>

A task can also cancel itself by calling `ctx.aio_cancel()` (or `ctx.cancel()` for synchronous tasks). Any work after the cancel call will not execute.

<Snippet
src={snippets.python.cancellation.worker.self_cancelling_task}

/>

</Tabs.Tab>
<Tabs.Tab title="Typescript">
Check `ctx.cancelled` to detect whether a cancellation signal has been received.

<Snippet
src={snippets.typescript.cancellations.workflow.declaring_a_task}

/>

For tasks that make network requests or other async operations, pass `abortController.signal` to the underlying library so those operations are terminated when the task is cancelled.

<Snippet
Comment on lines +47 to +53
<Callout type="info">
Synchronous versions are also available:

- `await workflows.aio_list` -> `workflows.list`
- `await runs.aio_list` -> `runs.list`
- `await runs.aio_bulk_replay` -> `runs.bulk_replay`
</Callout>