[Feature Request] Map LEANN failure cases to WFGY 16-problem RAG debugging checklist #259

Hi, thanks for releasing LEANN. The idea of learning evidence aggregation over noisy neighborhoods maps closely onto a problem many RAG practitioners struggle with.

I maintain an open-source RAG debugging framework called **WFGY ProblemMap** (MIT).  
It encodes 16 common failure modes for retrieval + reasoning systems, with concrete descriptions and fixes:

https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md

The ProblemMap is already referenced by several research projects:

- **Harvard MIMS Lab – ToolUniverse** (LLM tools benchmark; WFGY listed under robustness / RAG debugging).
- **QCRI LLM Lab – Multimodal-RAG-Survey** (multimodal RAG survey repo).
- **Univ. of Innsbruck Data Science Group – Rankify** (RAG toolkit with merged troubleshooting docs based on WFGY).

Given LEANN’s focus on noisy neighborhoods, a few failure modes show up especially often:

- **No.1 hallucination & chunk drift**: retrieved neighbors are “plausible but wrong”.
- **No.2 interpretation collapse**: neighbors are good, but aggregation and reasoning go wrong.
- **No.5 semantic ≠ embedding**: cosine similarity selects the wrong local neighborhood.
- **No.6 logic collapse & recovery**: the system reaches a dead end and needs a controlled reset.

### Proposal

I’d like to propose a small, documentation-focused contribution:

1. **Add a “Debugging LEANN with WFGY ProblemMap” section** to the README or a separate doc file.  
   For each of the above failure modes, I can:
   - Describe the symptom in LEANN terms (noisy neighbors, unstable evidence weighting, etc.).
   - Suggest simple experiments and ablations to detect the issue (e.g. neighbor inspection, temperature / top-k sweeps, disabling certain heads).
   - Link to the relevant ProblemMap entries for deeper explanation.

2. **Optionally add a short “troubleshooting” table** that LEANN users can follow when they see odd behavior:
   - Column 1: Symptom (e.g. “answers jump when I add a small number of documents”).
   - Column 2: Likely ProblemMap mode(s) (No.1, No.5, etc.).
   - Column 3: Concrete checks to run in LEANN (what to log, what to visualize).
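
To give a rough idea of the kind of check I have in mind for the “answers jump when I add a small number of documents” symptom, here is a minimal sketch of a top-k sweep. It compares the retrieved neighbor set before and after a few documents are added and reports the Jaccard overlap for each k. It does not use the LEANN API: the brute-force cosine search is only a stand-in for the real index, and the function names (`top_k`, `neighbor_stability`) and the toy vectors are hypothetical, chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, docs, k):
    """Indices of the k documents most similar to the query.

    Brute-force stand-in for an index lookup (NOT the LEANN API).
    Ties keep their original order because sorted() is stable.
    """
    scores = [cosine(query, d) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: -scores[i])
    return order[:k]


def jaccard(a, b):
    """Jaccard overlap of two index collections (1.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def neighbor_stability(query, docs, extra_docs, ks=(1, 2, 3)):
    """For each k, overlap of the top-k set before vs. after adding extra_docs.

    Values near 1.0 mean the neighborhood is stable; low values suggest the
    retrieved evidence shifts when a few documents are added.
    """
    combined = list(docs) + list(extra_docs)
    return {
        k: jaccard(top_k(query, docs, k), top_k(query, combined, k))
        for k in ks
    }


# Toy embeddings, made up for illustration only.
query = [1.0, 0.0, 0.0]
docs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
extra_docs = [[0.9, 0.1, 0.0]]  # a near-duplicate of the best match

for k, overlap in neighbor_stability(query, docs, extra_docs).items():
    print(f"k={k}: neighbor overlap after insertion = {overlap:.2f}")
```

In a real setup the two `top_k` calls would be replaced by queries against the index before and after the new documents are ingested. A sharp drop in overlap at small k would point toward No.1 or No.5 rather than a reasoning-side problem.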

All content would be MIT-compatible and clearly marked as optional external guidance.  
If you are open to this, I can prepare a PR with a first draft and adjust based on your feedback.

Thanks for considering this. I think LEANN plus a clear failure-mode map would help a lot of people who are trying to move from “it runs” to “it is debuggable”.
