feat(openai-agents): record response-identification metadata on LLM spans #4065
hansmire wants to merge 1 commit into traceloop:main
feat(openai-agents): record response-identification metadata on LLM spans
The Responses API `Response` object carries several fields that downstream
trace backends rely on for turn chaining, model-version debugging and
reasoning/service-tier visibility, but that this instrumentor dropped on
the floor. Concretely:
* `response.id` — `gen_ai.response.id`
* `response.model` — `gen_ai.response.model`
(kept existing `gen_ai.request.model` for back-compat)
* `response.status` — `gen_ai.response.status`
* `response.previous_response_id`
— `gen_ai.request.previous_response_id`
* `response.service_tier` — `gen_ai.openai.request.service_tier`
* `response.reasoning.effort` — `gen_ai.request.reasoning_effort`
* `response.reasoning.summary` — `gen_ai.request.reasoning_summary`
For comparison, Braintrust's native Agents SDK processor surfaces the
full `response.model_dump(exclude={"input","output","metadata","usage"})`
on every LLM span — which is how its UI shows turn-by-turn chains and
the exact model version that served each request. Previously
openllmetry's equivalent span carried only `temperature`, `top_p`,
`max_tokens` and a conflated `gen_ai.request.model`.
All additions are defensive: when a field is missing / None on the
response, no attribute is emitted (no stringified "None" values).
Fields are set via existing OTel semconv constants where available
(`GEN_AI_RESPONSE_ID`, `GEN_AI_RESPONSE_MODEL`,
`GEN_AI_OPENAI_REQUEST_SERVICE_TIER`, `LLM_REQUEST_REASONING_EFFORT`,
`LLM_REQUEST_REASONING_SUMMARY`) and as string literals for the two
fields without published constants yet (`gen_ai.response.status`,
`gen_ai.request.previous_response_id`).
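A minimal sketch of that defensive pattern (illustrative only, not the exact diff; it assumes the helper returns a plain dict of attributes, and the keys are shown as the string values the semconv constants resolve to):

```python
# Illustrative sketch of the defensive extraction described above.
# The helper shape is hypothetical; the real change uses the semconv
# constants listed above where they exist.
def _extract_response_attributes(response) -> dict:
    attributes = {}

    def set_if_present(key, value):
        # Skip missing / None fields entirely; never emit a stringified "None".
        if value is not None:
            attributes[key] = value

    set_if_present("gen_ai.response.id", getattr(response, "id", None))
    set_if_present("gen_ai.response.model", getattr(response, "model", None))
    # No published semconv constants yet for these two, hence string literals.
    set_if_present("gen_ai.response.status", getattr(response, "status", None))
    set_if_present(
        "gen_ai.request.previous_response_id",
        getattr(response, "previous_response_id", None),
    )
    set_if_present(
        "gen_ai.openai.request.service_tier",
        getattr(response, "service_tier", None),
    )
    reasoning = getattr(response, "reasoning", None)
    if reasoning is not None:
        set_if_present(
            "gen_ai.request.reasoning_effort", getattr(reasoning, "effort", None)
        )
        set_if_present(
            "gen_ai.request.reasoning_summary", getattr(reasoning, "summary", None)
        )
    return attributes
```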
Includes two direct unit tests on `_extract_response_attributes` that
pin the attribute mapping contract and guard against regressions when
a field is absent.
Summary
The Responses API `Response` object carries several fields that downstream trace backends rely on for turn chaining, model-version debugging, and reasoning/service-tier visibility, but that this instrumentor dropped on the floor. This PR plumbs them through to OTel span attributes on every LLM span.

| Response field | Span attribute |
| --- | --- |
| `response.id` | `gen_ai.response.id` |
| `response.model` | `gen_ai.response.model` (kept `gen_ai.request.model` for back-compat) |
| `response.status` | `gen_ai.response.status` |
| `response.previous_response_id` | `gen_ai.request.previous_response_id` |
| `response.service_tier` | `gen_ai.openai.request.service_tier` |
| `response.reasoning.effort` | `gen_ai.request.reasoning_effort` |
| `response.reasoning.summary` | `gen_ai.request.reasoning_summary` |

All additions are defensive: when a field is missing / None on the response, no attribute is emitted (no stringified `"None"` values — pinned by a regression test).
Before / After

Same 529 eval, same agent, same LLM-span detail panel in the Braintrust trace UI:

* Before: the LLM span's Metadata panel contains only `gen_ai.request.model`, `temperature`, `top_p`, `usage.*` — no response identification, no service tier, no reasoning config.
* After: the panel additionally renders `gen_ai.response.id`, `gen_ai.response.model`, `gen_ai.response.status`, `gen_ai.openai.request.service_tier`, and `gen_ai.request.reasoning_effort` — exposing the turn chain, served-model version, and reasoning/service-tier settings for debugging.
Why

Braintrust's own native Agents SDK processor surfaces the full `response.model_dump(exclude={"input","output","metadata","usage"})` on every LLM span — which is how its UI shows turn-by-turn chains and the exact model version that served each request. Previously openllmetry's equivalent `openai.response` span carried only `temperature`, `top_p`, `max_tokens` and a conflated `gen_ai.request.model`. Trace backends couldn't:

* reconstruct turn chains (`response.id` / `previous_response_id`) — critical for debugging agents that use `auto_previous_response_id=True`.
* recover the exact model version that served a request (`gpt-5.4-2026-03-05` was lost).
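To make the turn-chaining point concrete, here is a hypothetical backend-side helper (plain dicts standing in for span attributes) that reorders LLM spans into a conversation chain using the two new attributes; nothing like this is possible downstream without them:

```python
# Hypothetical backend-side helper: order LLM spans into a turn chain by
# following gen_ai.request.previous_response_id -> gen_ai.response.id links.
# Assumes the spans form a single chain.
def order_turn_chain(spans: list[dict]) -> list[dict]:
    # Map each response id to the span that continues from it.
    next_by_prev = {
        s["gen_ai.request.previous_response_id"]: s
        for s in spans
        if "gen_ai.request.previous_response_id" in s
    }
    # The root turn carries no previous_response_id attribute (the
    # instrumentor only emits it when the field is set on the response).
    roots = [s for s in spans if "gen_ai.request.previous_response_id" not in s]
    chain = []
    current = roots[0] if roots else None
    while current is not None:
        chain.append(current)
        current = next_by_prev.get(current["gen_ai.response.id"])
    return chain
```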
Constants

OTel semconv constants are used where available:

* `GenAIAttributes.GEN_AI_RESPONSE_ID`
* `GenAIAttributes.GEN_AI_RESPONSE_MODEL`
* `GenAIAttributes.GEN_AI_OPENAI_REQUEST_SERVICE_TIER`
* `SpanAttributes.LLM_REQUEST_REASONING_EFFORT`
* `SpanAttributes.LLM_REQUEST_REASONING_SUMMARY`

Two fields fall back to string literals because `semconv_ai` doesn't publish a constant yet:

* `gen_ai.response.status`
* `gen_ai.request.previous_response_id`

Happy to add them upstream in `semconv_ai` in a follow-up if preferred.
Tests

Two new direct unit tests on `_extract_response_attributes` (no VCR needed):

* `test_extract_response_captures_response_identification_fields` — feeds a `SimpleNamespace` with every field set, asserts each maps to the correct span attribute with the expected value.
* `test_extract_response_absent_fields_dont_set_attributes` — regression guard for the None-passthrough branches.
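In shape, the tests look roughly like this (a sketch, not the exact test bodies; it assumes, as in the extraction sketch above, that the helper returns a plain dict):

```python
from types import SimpleNamespace

# Import path illustrative, not the real module layout:
# from opentelemetry.instrumentation.openai_agents import _extract_response_attributes

def test_extract_response_captures_response_identification_fields():
    response = SimpleNamespace(
        id="resp_123",
        model="gpt-5.4-2026-03-05",
        status="completed",
        previous_response_id="resp_122",
        service_tier="default",
        reasoning=SimpleNamespace(effort="medium", summary="auto"),
    )
    attrs = _extract_response_attributes(response)
    assert attrs["gen_ai.response.id"] == "resp_123"
    assert attrs["gen_ai.request.previous_response_id"] == "resp_122"

def test_extract_response_absent_fields_dont_set_attributes():
    # Every identification field absent: nothing should be emitted,
    # and in particular no stringified "None" values.
    attrs = _extract_response_attributes(SimpleNamespace(reasoning=None))
    assert "gen_ai.response.status" not in attrs
    assert "None" not in {str(v) for v in attrs.values()}
```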
All 12 tests in `tests/test_openai_agents.py` pass locally. `uv run ruff check` is clean.
Notes

Part of a small series of `openai-agents` parity fixes (#4061 cached_tokens + reasoning_tokens, #4062 tool span type + duration, #4063 tool span input + output). Each stands alone off `main` and can be merged in any order.