[Feature Request] Map LEANN failure cases to WFGY 16-problem RAG debugging checklist #259

@onestardao

Hi, thanks for releasing LEANN — the idea of learning evidence aggregation over noisy neighborhoods is very close to what many RAG practitioners are struggling with.

I maintain an open-source RAG debugging framework called WFGY ProblemMap (MIT).
It encodes 16 common failure modes for retrieval + reasoning systems, with concrete descriptions and fixes:

https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md

Several research projects already reference this ProblemMap:

  • Harvard MIMS Lab – ToolUniverse (LLM tools benchmark; WFGY listed under robustness / RAG debugging).
  • QCRI LLM Lab – Multimodal-RAG-Survey (multimodal RAG survey repo).
  • Univ. of Innsbruck Data Science Group – Rankify (RAG toolkit with merged troubleshooting docs based on WFGY).

Given LEANN’s focus on noisy neighborhoods, a few failure modes show up especially often:

  • No.1 hallucination & chunk drift — retrieved neighbors are “plausible but wrong”.
  • No.2 interpretation collapse — neighbors are good, but aggregation + reasoning go off.
  • No.5 semantic ≠ embedding — cosine similarity selects the wrong local neighborhood.
  • No.6 logic collapse & recovery — the system gets into dead-ends and needs controlled reset.
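
To make No.1 and No.5 concrete, here is a minimal sketch of a neighbor inspection check. It is plain Python over toy vectors, not LEANN code: `cosine`, `top_k`, `neighbor_report`, the document ids, and the hand-labeled `relevant_ids` set are all hypothetical names chosen for illustration. The idea is only to compare what cosine similarity retrieves against a small set of documents a human has marked as relevant.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, docs, k):
    """Return the k (doc_id, score) pairs with the highest cosine score.

    docs maps a document id to its embedding vector. Ties are broken by id
    so the output is deterministic.
    """
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


def neighbor_report(query, docs, relevant_ids, k):
    """Compare the retrieved neighbors with a hand-labeled relevant set."""
    if k <= 0:
        raise ValueError("k must be positive")
    retrieved = [doc_id for doc_id, _ in top_k(query, docs, k)]
    found = [doc_id for doc_id in retrieved if doc_id in relevant_ids]
    return {
        "retrieved": retrieved,
        "precision_at_k": len(found) / k,
        "missed": sorted(set(relevant_ids) - set(retrieved)),
    }


# Toy data (hypothetical): "d" is close in embedding space but not labeled relevant.
query = [1.0, 0.0]
docs = {
    "a": [0.9, 0.1],
    "b": [0.8, 0.6],
    "c": [0.0, 1.0],
    "d": [1.0, 0.05],
}
report = neighbor_report(query, docs, relevant_ids={"a", "b"}, k=2)
print(report)
```

In this toy case the nearest neighbor by cosine is a document outside the labeled set, and one labeled document is missed, which is the "plausible but wrong" pattern described above. In a real setup the vectors and labels would come from the actual index and a small hand-checked query set.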

Proposal

I’d like to propose a small, documentation-focused contribution:

  1. Add a “Debugging LEANN with WFGY ProblemMap” section to the README or a separate doc file.
    For each of the above failure modes, I can:

    • Describe the symptom in LEANN terms (noisy neighbors, unstable evidence weighting, etc.).
    • Suggest simple experiments and ablations to detect the issue (e.g. neighbor inspection, temperature / top-k sweeps, disabling certain heads).
    • Link to the relevant ProblemMap entries for deeper explanation.
  2. Optionally add a short “troubleshooting” table that LEANN users can follow when they see odd behavior:

    • Column 1: Symptom (e.g. “answers jump when I add a small number of documents”).
    • Column 2: Likely ProblemMap mode(s) (No.1, No.5, etc.).
    • Column 3: Concrete checks to run in LEANN (what to log, what to visualize).
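
As a rough illustration of the kind of check Column 3 could point to, the sketch below runs a top-k sweep for the symptom "answers jump when I add a small number of documents". It measures how much the retrieved neighbor set changes before and after a few documents are added, using Jaccard overlap. It is plain Python over toy vectors; `top_k_ids`, `jaccard`, `stability_sweep`, and the document ids are hypothetical and are not part of LEANN.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_ids(query, docs, k):
    """Return the set of ids of the k documents most similar to the query."""
    ranked = sorted(docs, key=lambda doc_id: (-cosine(query, docs[doc_id]), doc_id))
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard overlap of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def stability_sweep(query, docs_before, docs_after, ks):
    """For each k, compare the neighbor sets before and after the corpus changed."""
    return {
        k: jaccard(top_k_ids(query, docs_before, k), top_k_ids(query, docs_after, k))
        for k in ks
    }


# Toy data (hypothetical): one new document "d" lands near the query.
query = [1.0, 0.0]
docs_before = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0]}
docs_after = dict(docs_before, d=[0.95, 0.1])

sweep = stability_sweep(query, docs_before, docs_after, ks=[1, 2, 3])
for k, overlap in sweep.items():
    print(f"k={k}: neighbor overlap {overlap:.2f}")
```

A low overlap at the k actually used for retrieval suggests the answer change comes from the neighborhood itself shifting (No.1 or No.5) rather than from aggregation or reasoning over a stable neighborhood (No.2).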

All content would be MIT-compatible and clearly marked as optional external guidance.
If you are open to this, I can prepare a PR with a first draft and adjust based on your feedback.

Thanks for considering — I think LEANN plus a clear failure-mode map would help a lot of people who are trying to move from “it runs” to “it is debuggable”.
