fix: Don't skip spans whose parent has already been processed #1012

Open
FugiTech wants to merge 1 commit into getsentry:master from FugiTech:group_traces

@FugiTech FugiTech commented Mar 4, 2026

Description

The SpanProcessor currently tries to ignore all non-root spans. As a result, it drops child spans that start or end after the root span finishes, which feels like a bug.

This PR changes it to ignore only child spans whose parent is still active, since those can still be bundled into the parent's transaction.
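
As a rough illustration of the rule described above, here is a minimal sketch. The module `SkipRuleSketch`, the `skip?/2` function, and the `active_span_ids` set are all hypothetical names chosen for this example; they are not the SDK's actual API, and the real SpanProcessor may track active spans differently.

```elixir
defmodule SkipRuleSketch do
  # Hypothetical helper, not part of the Sentry SDK.
  # `active_span_ids` stands in for the set of spans that have started
  # but not yet ended.

  # A span without a parent is a root and is never skipped.
  def skip?(%{parent_span_id: nil}, _active_span_ids), do: false

  # A child span is skipped only while its parent is still active,
  # because the parent's transaction can still pick it up later.
  def skip?(%{parent_span_id: parent_id}, active_span_ids) do
    MapSet.member?(active_span_ids, parent_id)
  end
end

active = MapSet.new(["root-a"])

# Parent "root-a" is still active, so this child is skipped for now.
SkipRuleSketch.skip?(%{span_id: "child-1", parent_span_id: "root-a"}, active)

# Parent "root-b" has already been processed, so this child is not skipped.
SkipRuleSketch.skip?(%{span_id: "child-2", parent_span_id: "root-b"}, active)
```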

Issues


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


child_span_records =
  span_record.span_id
  |> SpanStorage.get_child_spans()
  |> Enum.filter(&span_complete?/1)

Descendant spans can be sent in duplicate transactions

Medium Severity

The span_complete? filter excludes incomplete children from a transaction but still includes their complete descendants, because get_all_descendants recurses through incomplete nodes. When the incomplete child later finishes and its parent is gone from storage, it becomes its own transaction root and re-collects those same descendants, so they are sent to Sentry twice. The remove_child_spans cleanup doesn't prevent this because it only removes direct children, not nested descendants.
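
The scenario can be reproduced on a toy span tree. The sketch below is an assumption-laden model: `DuplicateSketch`, its functions, and the `%{parent: ..., done: ...}` map shape are hypothetical and do not mirror the SDK's SpanStorage. `collect_recursing/2` imitates the behavior described in this comment, while `collect_pruned/2` shows one possible alternative (stop descending at an incomplete span); it is not necessarily what any proposed fix does.

```elixir
defmodule DuplicateSketch do
  # Hypothetical toy model, not the SDK's SpanStorage.
  # `spans` maps span_id => %{parent: parent_id | nil, done: boolean}.

  def children(spans, parent_id) do
    for {id, %{parent: ^parent_id}} <- spans, do: id
  end

  # Imitates the reported behavior: gather every descendant first,
  # then drop the ones that are not complete yet.
  def collect_recursing(spans, root_id) do
    spans
    |> all_descendants(root_id)
    |> Enum.filter(fn id -> spans[id].done end)
    |> Enum.sort()
  end

  defp all_descendants(spans, id) do
    spans
    |> children(id)
    |> Enum.flat_map(fn child -> [child | all_descendants(spans, child)] end)
  end

  # One possible alternative: do not descend below an incomplete span,
  # so its subtree stays with it until it finishes.
  def collect_pruned(spans, root_id) do
    spans
    |> children(root_id)
    |> Enum.filter(fn id -> spans[id].done end)
    |> Enum.flat_map(fn child -> [child | collect_pruned(spans, child)] end)
    |> Enum.sort()
  end
end

spans = %{
  "root" => %{parent: nil, done: true},
  "slow" => %{parent: "root", done: false},
  "leaf" => %{parent: "slow", done: true}
}

# "leaf" is collected under "root" even though "slow" is still running.
DuplicateSketch.collect_recursing(spans, "root")

# Once "slow" finishes and becomes its own root, "leaf" is collected again
# if it is still in storage.
finished = put_in(spans, ["slow", :done], true)
DuplicateSketch.collect_recursing(finished, "slow")
```

With the pruned variant, `collect_pruned(spans, "root")` returns nothing for this tree and "leaf" is only collected once, under "slow". The trade-off is that "leaf" then ends up in a different transaction than "root".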

Collaborator


@FugiTech this sounds legit, WDYT? I was wondering if we should just halt pruning of roots for as long as there are in-progress children.

Contributor Author


Yeah, I believe this is a legit problem with the PR as it stands today. Codex came up with FugiTech@16924b4 to solve it; I've been running that in production for a bit and it seems to work?

But the core issue is that I'm not actually that familiar with the Sentry Elixir SDK, so I don't know what the proper fix is. Delaying emitting the entire transaction until all children are done probably wouldn't be good enough, as children could start again after that point (delayed tasks, Oban jobs, handle_event in LiveViews, etc.). But I don't fully understand the SpanStorage/batching characteristics of this flow.

If you're OK with it, I'd love for you to take over and implement a fix however you see fit. You're welcome to use this PR and the commit I linked above as inspiration, but I just lack the confidence to say that it'll properly fix the issue and not introduce other weird behaviors.


Development

Successfully merging this pull request may close these issues.

SpanProcessor drops child spans that start after the root ends
