Hi Traceloop team,
OpenLLMetry gives teams great visibility into LLM applications with OpenTelemetry, and many of those applications are RAG or RAG+agent pipelines.
I maintain the WFGY RAG 16 Problem Map, an MIT-licensed project that classifies RAG / LLM failures at the pipeline level and suggests fixes for them.
Repo (MIT):
https://github.com/onestardao/WFGY
Main RAG failure map page:
https://github.com/onestardao/WFGY/tree/main/ProblemMap/README.md
WFGY provides:
- A 16-class RAG failure taxonomy (retrieval, prompt, structure, infra)
- A triage prompt that takes a failing trace (question, retrieved context, tool calls, answer, logs) and assigns one of those labels
- For each class, concrete structural fix suggestions
The same map is already integrated or cited by:
- RAGFlow and LlamaIndex in their RAG troubleshooting docs
- ToolUniverse – Harvard MIMS Lab, which wraps it in an incident triage tool
- Rankify – University of Innsbruck and Multimodal RAG Survey – QCRI LLM Lab
- Curated resources like Awesome LLM Apps and Awesome Data Science – academic
Proposal
Add WFGY’s 16-problem map as an optional, documented analysis layer on top of OpenLLMetry traces. For example:
- A short example or recipe that:
  - Filters traces for failing RAG interactions.
  - Extracts relevant fields (input, retrieved context, tool calls, output).
  - Calls the WFGY triage prompt and records a `rag_failure_type` attribute per trace.
- Documentation that explains:
  - The 16 failure types at a high level.
  - How to slice and visualize OpenLLMetry data by `rag_failure_type` in common backends (Grafana, Datadog, etc.).
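To make the recipe concrete, here is a rough sketch of the filter, extract, label, and slice steps in plain Python. Everything in it is an assumption for illustration: the trace dictionaries and their keys (`kind`, `failed`, `input`, `retrieved_context`, `tool_calls`, `output`) are hypothetical and do not match OpenLLMetry's actual span attribute names, `stub_triage` is a stand-in for a real call to the WFGY triage prompt, and the labels it returns are placeholders rather than names from the 16-class map.

```python
from collections import Counter
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical shape of an exported trace. Real OpenLLMetry spans carry
# OpenTelemetry attributes whose names differ from these keys.
Trace = Dict[str, Any]


def is_failing_rag(trace: Trace) -> bool:
    """Keep only RAG traces that were flagged as failed (hypothetical flags)."""
    return trace.get("kind") == "rag" and bool(trace.get("failed", False))


def extract_fields(trace: Trace) -> Dict[str, Any]:
    """Pull out the fields the triage step needs, with empty defaults."""
    return {
        "input": trace.get("input", ""),
        "retrieved_context": trace.get("retrieved_context", []),
        "tool_calls": trace.get("tool_calls", []),
        "output": trace.get("output", ""),
    }


def stub_triage(fields: Dict[str, Any]) -> str:
    """Stand-in for the WFGY triage prompt.

    A real version would send `fields` to an LLM together with the triage
    prompt. The labels below are placeholders, not WFGY class names.
    """
    if not fields["retrieved_context"]:
        return "placeholder_empty_retrieval"
    return "placeholder_unclassified"


def label_failures(
    traces: Iterable[Trace],
    triage: Callable[[Dict[str, Any]], str],
) -> List[Trace]:
    """Return copies of the failing RAG traces with `rag_failure_type` set."""
    labeled = []
    for trace in traces:
        if not is_failing_rag(trace):
            continue
        label = triage(extract_fields(trace))
        labeled.append({**trace, "rag_failure_type": label})
    return labeled


def count_by_failure_type(labeled: Iterable[Trace]) -> Counter:
    """Aggregate labeled traces per failure type, as a dashboard query would."""
    return Counter(trace["rag_failure_type"] for trace in labeled)
```

In a real integration the label would presumably be written back as a span attribute so that backends can group on it; the `Counter` here only mimics that grouping locally.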
This would make it easier for teams to move from raw traces to a structured understanding of “what kind of RAG failures” they are seeing.
If you consider this useful, I’m happy to propose a small example and doc text in a PR.