Conversation
* feat: first pass at auto otel impl * refactor: clean up a bit, naming, etc. * refactor: rm instance vars * fix: rm one more instance var * chore: notes to self * traces view * minor changes * trace view by task external id * go sdk instrumentation * e2e tests for Py SDK trace --------- Co-authored-by: Mohammed Nafees <[email protected]>
* add: otel as optional dep on ts packages * feat: opentelemetry instrumentor for TS sdk, with example * fix: lint * revert: debug print * remove: trailing space * fix: ts otel patch file path, throw handlesteprun error upstream, ts otel examples * fix: lint * feat: add schedule_workflow instrumentor, add otel conig loader tests * add: more robust wrap unwrap for patched modules * fix: lint, update version * refactor: ts otel config type assertion * revert: rebase issues * fix: lint * fix: update worker patch for ts otel with InternalWorker * fix: lint * refactor: parsejson on otel * fix: pnpm-lock * fix: lint * docs: add otel instrumented method warnings Co-authored-by: Jishnu <[email protected]>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Wrong status enum causes infinite polling for succeeded runs
  - I added WorkflowRunStatus.SUCCEEDED to the shared terminal-status list so succeeded workflow runs are treated as terminal and observability polling stops.
- ✅ Fixed: Division by zero when span duration is zero
  - I guarded getTimelineData against totalRange <= 0 and return stable fallback percentages to avoid NaN timeline CSS values.
Or push these changes by commenting:
@cursor push 33d8ac818f
Preview (33d8ac818f)
diff --git a/frontend/app/src/components/v1/agent-prism/agent-prism-data.ts b/frontend/app/src/components/v1/agent-prism/agent-prism-data.ts
--- a/frontend/app/src/components/v1/agent-prism/agent-prism-data.ts
+++ b/frontend/app/src/components/v1/agent-prism/agent-prism-data.ts
@@ -49,9 +49,15 @@
maxEnd: number;
}): { durationMs: number; startPercent: number; widthPercent: number } => {
const startMs = new Date(spanCard.createdAt).getTime();
+ const durationMs = spanCard.durationNs / 1_000_000;
const totalRange = maxEnd - minStart;
- const durationMs = spanCard.durationNs / 1_000_000;
+
+ if (totalRange <= 0) {
+ return { durationMs, startPercent: 0, widthPercent: 100 };
+ }
+
const startPercent = ((startMs - minStart) / totalRange) * 100;
const widthPercent = (durationMs / totalRange) * 100;
+
return { durationMs, startPercent, widthPercent };
};
diff --git a/frontend/app/src/pages/main/v1/workflow-runs-v1/$run/v2components/step-run-detail/step-run-detail.tsx b/frontend/app/src/pages/main/v1/workflow-runs-v1/$run/v2components/step-run-detail/step-run-detail.tsx
--- a/frontend/app/src/pages/main/v1/workflow-runs-v1/$run/v2components/step-run-detail/step-run-detail.tsx
+++ b/frontend/app/src/pages/main/v1/workflow-runs-v1/$run/v2components/step-run-detail/step-run-detail.tsx
@@ -19,7 +19,12 @@
TabsList,
TabsTrigger,
} from '@/components/v1/ui/tabs';
-import { V1TaskStatus, V1TaskSummary, queries } from '@/lib/api';
+import {
+ V1TaskStatus,
+ V1TaskSummary,
+ WorkflowRunStatus,
+ queries,
+} from '@/lib/api';
import { emptyGolangUUID, formatDuration } from '@/lib/utils';
import { TaskRunActionButton } from '@/pages/main/v1/task-runs-v1/actions';
import { WorkflowDefinitionLink } from '@/pages/main/workflow-runs/$run/v2components/workflow-definition';
@@ -50,6 +55,7 @@
V1TaskStatus.CANCELLED,
V1TaskStatus.FAILED,
V1TaskStatus.COMPLETED,
+ WorkflowRunStatus.SUCCEEDED,
];
const TaskRunPermalinkOrBacklink = ({
workflowRunExternalId={id}
isRunning={
  !TASK_RUN_TERMINAL_STATUSES.includes(workflowRun.status)
}
Wrong status enum causes infinite polling for succeeded runs
High Severity
TASK_RUN_TERMINAL_STATUSES contains V1TaskStatus.COMPLETED (string "COMPLETED"), but workflowRun.status is a WorkflowRunStatus whose success value is SUCCEEDED (string "SUCCEEDED"). Since "SUCCEEDED" is never in TASK_RUN_TERMINAL_STATUSES, isRunning remains true after a workflow succeeds, causing the Observability component to poll every 5 seconds indefinitely.
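The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the PR's actual code: the two enums are hypothetical stand-ins for the generated API types, the `RUNNING` member and the helper `isRunning` are assumptions made for illustration, and only the `COMPLETED` and `SUCCEEDED` string values are taken from the review comment.

```typescript
// Hypothetical stand-ins for the generated API enums. Only the "COMPLETED"
// and "SUCCEEDED" values are quoted in the review; the rest are assumed.
enum V1TaskStatus {
  CANCELLED = "CANCELLED",
  FAILED = "FAILED",
  COMPLETED = "COMPLETED",
}

enum WorkflowRunStatus {
  RUNNING = "RUNNING", // assumed member, used only for the demo
  SUCCEEDED = "SUCCEEDED",
}

// Terminal list as it was before the fix: task statuses only.
const TERMINAL_BEFORE: readonly string[] = [
  V1TaskStatus.CANCELLED,
  V1TaskStatus.FAILED,
  V1TaskStatus.COMPLETED,
];

// Terminal list after the fix: the workflow-run success value is included.
const TERMINAL_AFTER: readonly string[] = [
  ...TERMINAL_BEFORE,
  WorkflowRunStatus.SUCCEEDED,
];

// Hypothetical helper: a run counts as running until its status is terminal.
function isRunning(status: string, terminal: readonly string[]): boolean {
  return terminal.indexOf(status) === -1;
}

// A succeeded workflow run never matches the old list, so polling would go on.
const stillPollingBefore = isRunning(WorkflowRunStatus.SUCCEEDED, TERMINAL_BEFORE);
// With the extended list the same run is treated as finished.
const stillPollingAfter = isRunning(WorkflowRunStatus.SUCCEEDED, TERMINAL_AFTER);
console.log({ stillPollingBefore, stillPollingAfter });
```

Mixing two enums in one list works here because both are string enums; a stricter alternative would be a separate terminal list typed for workflow-run statuses.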
const durationMs = spanCard.durationNs / 1_000_000;
const startPercent = ((startMs - minStart) / totalRange) * 100;
const widthPercent = (durationMs / totalRange) * 100;
return { durationMs, startPercent, widthPercent };
Division by zero when span duration is zero
Low Severity
In getTimelineData, if maxEnd equals minStart (possible when all spans have zero duration), totalRange is 0 and both startPercent and widthPercent become NaN. These NaN values flow into CSS left and width style properties in SpanCardTimeline, producing undefined rendering behavior.
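To make the failure mode concrete, here is a small self-contained sketch of the guarded computation. It follows the shape of the autofix diff, but the `SpanCard` interface is a hypothetical, trimmed-down type (only `createdAt` and `durationNs` appear in the diff), and the sample timestamps are invented for illustration.

```typescript
// Hypothetical minimal shape; the real span card type has more fields.
interface SpanCard {
  createdAt: string; // ISO timestamp
  durationNs: number; // span duration in nanoseconds
}

interface TimelineData {
  durationMs: number;
  startPercent: number;
  widthPercent: number;
}

function getTimelineData({
  spanCard,
  minStart,
  maxEnd,
}: {
  spanCard: SpanCard;
  minStart: number;
  maxEnd: number;
}): TimelineData {
  const startMs = new Date(spanCard.createdAt).getTime();
  const durationMs = spanCard.durationNs / 1_000_000;
  const totalRange = maxEnd - minStart;

  // Without this guard, a zero range yields 0 / 0 = NaN for both percentages.
  if (totalRange <= 0) {
    return { durationMs, startPercent: 0, widthPercent: 100 };
  }

  const startPercent = ((startMs - minStart) / totalRange) * 100;
  const widthPercent = (durationMs / totalRange) * 100;
  return { durationMs, startPercent, widthPercent };
}

const baseMs = new Date("2024-01-01T00:00:00.000Z").getTime();

// Degenerate trace: every span starts and ends at the same instant.
const zeroRange = getTimelineData({
  spanCard: { createdAt: "2024-01-01T00:00:00.000Z", durationNs: 0 },
  minStart: baseMs,
  maxEnd: baseMs,
});

// Ordinary trace: a 100 ms span starting 50 ms into a 200 ms window.
const normalRange = getTimelineData({
  spanCard: { createdAt: "2024-01-01T00:00:00.050Z", durationNs: 100_000_000 },
  minStart: baseMs,
  maxEnd: baseMs + 200,
});
console.log(zeroRange, normalRange);
```

The fallback of a full-width bar at offset 0 is the choice made in the autofix; any finite pair of values would avoid passing NaN into the CSS `left` and `width` properties.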
return fmt.Errorf("could not create admin service (v1): %w", err)
}

oc, err := otelcol.NewOTelCollector(
This is fine: it was previously a no-op, and it will now only be enabled when the env var is set, as shown below.
going to leave some review comments for myself then will go through and fix them
mrkaye97 left a comment
and last one: need a minor version for python



Description
Introduces Hatchet o11y (observability) and the corresponding changes to the Python, TypeScript, and Go SDKs.
Type of change
Note
Medium Risk
Adds new persisted trace data (new DB table/partitioning) and new API endpoints surfaced in the UI; correctness and performance depend on batching/truncation and pagination behavior. Risk is mitigated by gating collection behind HatchetO11y.Enabled, but changes touch engine ingest, RBAC, and frontend navigation.
Overview
Adds end-to-end OpenTelemetry trace support: the engine can now ingest/store spans (new v1_otel_trace table + enums/partitions) via the OTLP collector with configurable max batch size and retry-count correlation, gated by HatchetO11y.Enabled.
Exposes new stable APIs to fetch traces for a task or workflow-run (GET .../trace with pagination), wires them through RBAC and OpenAPI clients/models, and updates server handlers/transformers (including a CEL enum naming fix).
Replaces the workflow/task-run Waterfall UI with an Observability tab that fetches all spans, builds a span tree, and renders an interactive timeline/tree view; adds new example apps for Go/Python/TypeScript OTel instrumentation and small CI/lint tweaks (enable o11y in python SDK workflow, ignore flaky OTel tests, expand linter excludes).
Written by Cursor Bugbot for commit 27aefc0. This will update automatically on new commits.
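The overview mentions that the Observability tab builds a span tree from the fetched spans. As a rough illustration of that step, the sketch below groups a flat span list by parent ID. It is not the PR's implementation: `FlatSpan`, `SpanNode`, `buildSpanTree`, and the field names `spanId` and `parentSpanId` are all hypothetical, and span IDs are assumed to be unique.

```typescript
// Hypothetical flat span shape; field names are assumptions, not the API model.
interface FlatSpan {
  spanId: string;
  parentSpanId?: string;
  name: string;
}

interface SpanNode extends FlatSpan {
  children: SpanNode[];
}

// Builds a forest of spans: each span is attached to its parent when the
// parent is present in the input, and becomes a root otherwise.
function buildSpanTree(spans: FlatSpan[]): SpanNode[] {
  // Prototype-free object so span IDs never collide with built-in keys.
  const nodes: Record<string, SpanNode> = Object.create(null);
  for (const span of spans) {
    nodes[span.spanId] = { ...span, children: [] };
  }

  const roots: SpanNode[] = [];
  for (const span of spans) {
    const node = nodes[span.spanId];
    const parent =
      span.parentSpanId !== undefined ? nodes[span.parentSpanId] : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      // No parent ID, or the parent was not part of the fetched spans.
      roots.push(node);
    }
  }
  return roots;
}

const exampleTree = buildSpanTree([
  { spanId: "a", name: "workflow run" },
  { spanId: "b", parentSpanId: "a", name: "task" },
  { spanId: "c", parentSpanId: "missing", name: "orphaned span" },
]);
console.log(JSON.stringify(exampleTree, null, 2));
```

Treating spans with an unknown parent as roots keeps them visible in the view when only part of a trace has been ingested; whether the actual UI does the same is not stated in the PR description.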