NLP graph extraction #1652

Merged
natoverse merged 22 commits into main from nlp-extraction
Jan 28, 2025

@natoverse
Collaborator

Adds an NLP-based option for graph extraction. This is intended to replace the unmaintained "nltk" strategy within the existing extraction. Also adds a new CLI param, "method", to select this technique.

This particular implementation is a work-in-progress hybrid based on some early NLP-based approaches we called "FastGraphRAG". It has elements of LazyGraphRAG on the indexing side to save costs.
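
For readers unfamiliar with the general idea, here is a minimal sketch of NLP-style graph extraction by co-occurrence, with no LLM calls. It is not the code in this PR: `extract_entities`, `build_cooccurrence_graph`, and the capitalized-word regex are hypothetical stand-ins chosen for illustration, and a real extractor would use a proper noun-phrase tagger instead of a regex.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: runs of capitalized words stand in for noun phrases.
# This is a crude heuristic (it also picks up sentence-initial words).
ENTITY_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def extract_entities(text: str) -> set[str]:
    """Return the distinct candidate entities found in one text unit."""
    return set(ENTITY_PATTERN.findall(text))


def build_cooccurrence_graph(text_units: list[str]) -> tuple[Counter, Counter]:
    """Build node frequencies and edge weights from entity co-occurrence.

    Two entities are linked when they appear in the same text unit; the
    edge weight counts how many text units contain both.
    """
    node_freq: Counter = Counter()
    edge_weight: Counter = Counter()
    for unit in text_units:
        # Sort so each undirected edge has one canonical (a, b) key.
        entities = sorted(extract_entities(unit))
        node_freq.update(entities)
        for a, b in combinations(entities, 2):
            edge_weight[(a, b)] += 1
    return node_freq, edge_weight


if __name__ == "__main__":
    units = ["Alice met Bob in Paris.", "Bob returned to Paris."]
    nodes, edges = build_cooccurrence_graph(units)
    print(nodes.most_common())
    print(edges.most_common())
```

Because edges come from simple co-occurrence counts, a graph built this way tends to be dense and noisy, which is presumably why the commits in this PR also add graph pruning options.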

@natoverse natoverse requested review from a team as code owners January 22, 2025 23:48
AlonsoGuevara previously approved these changes Jan 28, 2025
@natoverse natoverse requested a review from ha2trinh January 28, 2025 19:59
@natoverse natoverse merged commit d31750f into main Jan 28, 2025
14 of 15 checks passed
@natoverse natoverse deleted the nlp-extraction branch January 28, 2025 20:27
opensourcemukul pushed a commit to opensourcemukul/graphrag that referenced this pull request Sep 13, 2025
* Add NLP extraction workflow

* Add text unit community summarization

* Add CLI flag for indexing method

* Regenerate poetry.lock

* Fix claims loading

* Merge fixes

* Add workflow overrides to config

* Semver

* Add graph pruning config

* Remove degree re-compute from pruning

* Switch to percentile for edge weight pruning

* Add NLP extraction config

* Add new NLP extractor options

* Add FGR workflows to util method

* Use a generator factory for workflows

* Update pruning defaults

---------

Co-authored-by: Alonso Guevara <alonsog@microsoft.com>
Brandsma pushed a commit to ThalamusLabs/MMGraphRAG that referenced this pull request Nov 6, 2025

3 participants