# Feature Request: Native OpenTelemetry trace export for agent interactions
## Problem
Kiro CLI and IDE perform complex agentic workflows — LLM calls, tool invocations, file operations, reasoning steps — but there's no way to export structured telemetry about these interactions. Users building AI-powered applications need to understand agent behavior, debug failures, and measure performance using the same observability tooling they use for the rest of their stack.
## Proposed Solution
Export OTLP traces for Kiro agent sessions using the OpenTelemetry GenAI Semantic Conventions.
### What to trace
| Span | Key Attributes |
|---|---|
| Agent session | `gen_ai.system`, `gen_ai.request.model`, session duration |
| LLM call | `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.request.temperature`, latency |
| Tool/MCP invocation | tool name, parameters (redacted), success/failure, duration |
| File operations | path, operation type, result |
## Configuration
Follow standard OTel SDK conventions — env vars only, zero config files:

```shell
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
export OTEL_SERVICE_NAME=kiro-cli
kiro chat
```

When `OTEL_EXPORTER_OTLP_ENDPOINT` is unset, no traces are exported (zero overhead by default).
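The env-gated behavior could be as simple as the sketch below. The `otlp_traces_url` helper is hypothetical; the `/v1/traces` suffix follows the standard OTLP/HTTP path convention:

```python
import os

def otlp_traces_url():
    """Hypothetical helper: return the OTLP/HTTP traces URL, or None.

    Mirrors the behavior described above: when OTEL_EXPORTER_OTLP_ENDPOINT
    is unset, no exporter is configured at all (zero overhead by default).
    """
    endpoint = os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if not endpoint:
        return None
    # OTLP/HTTP sends trace data to the /v1/traces path on the endpoint.
    return endpoint.rstrip("/") + "/v1/traces"

os.environ.pop("OTEL_EXPORTER_OTLP_ENDPOINT", None)
print(otlp_traces_url())  # None -> tracing disabled

os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = "http://localhost:4318"
print(otlp_traces_url())  # http://localhost:4318/v1/traces
```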
### Example trace

```
[agent_session] kiro-cli chat (12.4s) — gen_ai.system: anthropic
├── [invoke_agent] Kiro Agent (12.1s) — gen_ai.request.model: claude-sonnet-4, gen_ai.agent.name: Kiro Agent
│   ├── [gen_ai.chat] claude-sonnet-4 (2.1s) — gen_ai.usage.input_tokens: 890, gen_ai.usage.output_tokens: 210, finish_reason: tool_calls
│   ├── [execute_tool] fs_read (45ms) — gen_ai.tool.name: fs_read
│   │   └── [tools/call] fs_read (38ms) — SPAN_KIND_CLIENT
│   ├── [gen_ai.chat] claude-sonnet-4 (3.8s) — gen_ai.usage.input_tokens: 1240, gen_ai.usage.output_tokens: 520, finish_reason: tool_calls
│   ├── [execute_tool] execute_bash (1.2s) — gen_ai.tool.name: execute_bash
│   │   └── [tools/call] execute_bash (1.1s) — SPAN_KIND_CLIENT
│   ├── [gen_ai.chat] claude-sonnet-4 (4.2s) — gen_ai.usage.input_tokens: 2100, gen_ai.usage.output_tokens: 380, finish_reason: stop
│   └── [execute_tool] fs_write (12ms) — gen_ai.tool.name: fs_write
│       └── [tools/call] fs_write (8ms) — SPAN_KIND_CLIENT
└── [http send] response (5ms)
```
## Why this matters
- Debugging — When Kiro makes unexpected changes or takes a wrong path, traces let users see exactly what happened: which model was called, what tools were invoked, and where time was spent.
- Composability — Users running Kiro as part of larger agentic systems (CI pipelines, automated workflows) need traces that connect to their existing observability backends (Jaeger, Grafana, OpenSearch, etc.).
- Dogfooding — AWS ships OpenTelemetry (ADOT, CloudWatch, X-Ray). Kiro should emit the same telemetry it helps users instrument.
- GenAI semconv adoption — The OTel GenAI semantic conventions are stabilizing. Native support in a widely used AI coding tool drives adoption and validates the spec.
## Prior Art
- Claude Code Monitoring / OTLP Support
- OpenSearch Agent Health — open-source observability and evaluation for AI agents
- OpenSearch observability-stack — OTel-native observability platform with first-class GenAI semantic convention support
- Strands Agents SDK — native OTel trace export for agent interactions
- OpenLLMetry — auto-instrumentation for LLM frameworks
- Arize Phoenix — LLM observability with OTel-compatible traces
## Non-goals
- Custom UI or dashboard — users bring their own backend
- Logging/metrics export — traces first, expand later
- Always-on telemetry — opt-in only via env vars