
Commit 66a6cc6

Optimize ignore_list to use set and single iteration
This PR addresses Umago's review comment: iterate over the docs list once instead of twice. His suggestion also switches membership testing from a list, which is O(n) per lookup, to a set, which is O(1) on average.

Changes:
- Convert ignore_list to ignore_set for faster lookups
- Build both docs_to_check and ignored_docs in a single loop
1 parent 224e727 commit 66a6cc6


1 file changed: 7 additions, 6 deletions


src/lightspeed_rag_content/document_processor.py

Lines changed: 7 additions & 6 deletions
@@ -692,12 +692,13 @@ def process(
         if unreachable_action != "warn":
             # Separate docs into those we should check and those in ignore_list
             if ignore_list:
-                docs_to_check = [
-                    doc for doc in docs if doc.metadata.get("title") not in ignore_list
-                ]
-                ignored_docs = [
-                    doc for doc in docs if doc.metadata.get("title") in ignore_list
-                ]
+                docs_to_check = []
+                ignored_docs = []
+                for doc in docs:
+                    if doc.metadata.get("title") in ignore_list:
+                        ignored_docs.append(doc)
+                    else:
+                        docs_to_check.append(doc)
             else:
                 docs_to_check = docs
                 ignored_docs = []
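
A minimal standalone sketch of the approach the commit message describes: convert the ignore list to a set once, then split the documents in a single pass. The `Doc` class and the `partition_docs` helper are hypothetical names introduced only for illustration; they are not part of `document_processor.py`, and the real `process` method may build `ignore_set` elsewhere.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for the document objects used in the diff."""

    metadata: dict[str, Any] = field(default_factory=dict)


def partition_docs(
    docs: list[Doc], ignore_list: Optional[Iterable[str]]
) -> tuple[list[Doc], list[Doc]]:
    """Split docs into (docs_to_check, ignored_docs) in one pass.

    Assumption: titles are hashable values stored under metadata["title"].
    """
    # Build the set once so each membership test is O(1) on average,
    # instead of scanning the whole list for every document.
    ignore_set = set(ignore_list or ())
    if not ignore_set:
        return list(docs), []

    docs_to_check: list[Doc] = []
    ignored_docs: list[Doc] = []
    for doc in docs:
        # A missing title yields None, which is never in the set,
        # so untitled documents are still checked.
        if doc.metadata.get("title") in ignore_set:
            ignored_docs.append(doc)
        else:
            docs_to_check.append(doc)
    return docs_to_check, ignored_docs
```

With n documents and m ignored titles, the two list comprehensions in the removed code cost roughly 2·n·m comparisons in the worst case, while the set-based single loop costs about n + m operations.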
