Introduce GigaMap sub-queries across all index types #637
- Add GigaMap.SubQuery and EntityIdMatcher so queries can be combined by entity id, including ordered matchers that report the next candidate id to let the executor skip over gaps.
- Expose bitmap query results as an EntityIdMatcher via the new BitmapEntityIdMatcher, so a query can participate as a sub-query in another query.
- Extract EntityResolver, BitmapIterator, BitmapIteration and AbstractBitmapIterating into standalone top-level types; remove the now-obsolete GigaIteration / AbstractGigaIterating.
- Add GigaIterator#nextIndexed and GigaQuery#iterateIndexed for (id, entity) iteration; wire it through BitmapIterator.
- Document the newly exposed types (EntityIdMatcher, EntityResolver, BitmapIterator, BitmapIteration, BitmapEntityIdMatcher, GigaMap.SubQuery and the undocumented GigaQuery methods).
- Add GigaMap.SubQuery and EntityIdMatcher so queries can be composed by entity id. Ordered matchers report the next candidate id, letting the bitmap executor gap-skip through AND compositions.
- Expose bitmap query results as an EntityIdMatcher via the new BitmapEntityIdMatcher, so any GigaQuery can act as a sub-query of another.
- Add a common ScoredSearchResult supertype in org.eclipse.store.gigamap.types with a shared Entry (lazy entity lookup) and Default (XGettingList-backed, caches the id matcher).
- Make VectorSearchResult and the new LuceneSearchResult extend ScoredSearchResult, so vector similarity and Lucene full-text searches compose with bitmap queries via query.and(hits).
- Add LuceneIndex.search(...) returning a LuceneSearchResult (ids, scores, lazily-resolved entities) alongside the existing acceptor-based query(...) API.
- Extract EntityResolver, BitmapIterator, BitmapIteration and AbstractBitmapIterating into standalone top-level types; remove the obsolete GigaIteration / AbstractGigaIterating.
- Add GigaIterator#nextIndexed and GigaQuery#iterateIndexed for (id, entity) iteration; wire it through BitmapIterator.
- Cover the new cross-index composition with VectorSearchSubQueryTest and LuceneSearchSubQueryTest.
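The gap-skipping idea behind ordered matchers can be illustrated with a small standalone sketch. It does not use the actual `EntityIdMatcher` API: `OrderedIdMatcher`, `SortedArrayMatcher` and `GapSkipIntersection` are hypothetical names, and the sketch assumes non-negative ids with `-1` as the "no more candidates" marker. Each side reports its next candidate at or after a given id, so the loop can jump straight over ranges the other side cannot match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ordered id matcher; NOT the real EntityIdMatcher API.
interface OrderedIdMatcher {
    /** Returns the smallest matching id >= fromId, or -1 if none remains. */
    long nextCandidate(long fromId);
}

// Matcher over a sorted id array with a forward-only cursor.
final class SortedArrayMatcher implements OrderedIdMatcher {
    private final long[] sortedIds;
    private int cursor; // mutable state: only ever moves forward

    SortedArrayMatcher(long... sortedIds) {
        this.sortedIds = sortedIds;
    }

    @Override
    public long nextCandidate(long fromId) {
        // Callers must ask with non-decreasing fromId values.
        while (cursor < sortedIds.length && sortedIds[cursor] < fromId) {
            cursor++;
        }
        return cursor < sortedIds.length ? sortedIds[cursor] : -1L;
    }
}

final class GapSkipIntersection {
    /** AND-combines two ordered matchers, skipping gaps instead of testing every id. */
    static List<Long> intersect(OrderedIdMatcher a, OrderedIdMatcher b) {
        List<Long> result = new ArrayList<>();
        long candidate = 0L;
        while (true) {
            long fromA = a.nextCandidate(candidate);
            if (fromA < 0) break;
            long fromB = b.nextCandidate(fromA);
            if (fromB < 0) break;
            if (fromA == fromB) {
                result.add(fromA);      // both sides agree: a hit
                candidate = fromA + 1;
            } else {
                candidate = fromB;      // fromB > fromA: jump over the gap
            }
        }
        return result;
    }
}
```

With ids `{1, 5, 9, 1000}` on one side and `{5, 1000, 2000}` on the other, the loop never inspects the ids between 9 and 1000; it moves directly to the candidate the other side reports.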
- Add ScoredSearchResult.and(GigaMap.SubQuery) as the scored-side counterpart of GigaQuery.and(SubQuery). The result preserves scores and the original score-descending iteration order, so the caller can keep iterating scored entries after the narrowing: `var top = vectorIndex.search(v, 50).and(gigaMap.query(status.is("PUB")));`
- Cache fix in ScoredSearchResult.Default: store the sorted long[] instead of the matcher instance, and wrap a fresh AscendingListWrapper on every provideEntityIdMatcher() call. Prevents the matcher's mutable cursor from being shared across independent consumers (previously a latent bug when reusing a search result in two different queries).
- Cover both behaviors in VectorSearchSubQueryTest and LuceneSearchSubQueryTest: intersection correctness, score-order preservation, chainability of the narrowed result, reuse of the same hits across independent .and(...) calls, and empty intersections.
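The two behaviors described in this commit can be pictured with a simplified model. This is a sketch under assumptions, not the `ScoredSearchResult.Default` implementation: `ScoredHits`, `Hit` and `Cursor` are hypothetical names, and a plain `LongPredicate` stands in for a sub-query. The result keeps only the immutable sorted `long[]` and hands out a new cursor on every request, and narrowing filters the hits without reordering them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical simplification; NOT the real ScoredSearchResult API.
final class ScoredHits {
    static final class Hit {
        final long id;
        final float score;

        Hit(long id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Stateful cursor over sorted ids; each consumer gets its own instance.
    static final class Cursor {
        private final long[] ids;
        private int pos;

        Cursor(long[] ids) {
            this.ids = ids;
        }

        /** Returns the next id in ascending order, or -1 when exhausted. */
        long next() {
            return pos < ids.length ? ids[pos++] : -1L;
        }
    }

    private final List<Hit> hitsByScoreDesc; // iteration order: best score first
    private final long[] sortedIds;          // immutable data, safe to share

    ScoredHits(List<Hit> hitsByScoreDesc) {
        this.hitsByScoreDesc = List.copyOf(hitsByScoreDesc);
        this.sortedIds = this.hitsByScoreDesc.stream().mapToLong(h -> h.id).sorted().toArray();
    }

    List<Hit> hits() {
        return hitsByScoreDesc;
    }

    /** Wraps the shared array in a fresh cursor so no position is shared between consumers. */
    Cursor newIdCursor() {
        return new Cursor(sortedIds);
    }

    /** Narrows the hits while keeping scores and the score-descending order. */
    ScoredHits and(LongPredicate subQuery) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : hitsByScoreDesc) {
            if (subQuery.test(hit.id)) {
                kept.add(hit);
            }
        }
        return new ScoredHits(kept);
    }
}
```

Had the model cached a single `Cursor` instead of the array, a second consumer would start wherever the first one stopped, which is the kind of shared-cursor problem the commit message describes.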
…cursor advancements in specific implementations.
- Explain how to combine GigaMap queries with other sub-query types, including bitmap, vector, and Lucene results.
- Detail usage of `iterateIndexed` for processing results with both entity IDs and entities.
- Update examples across modules to reflect changes, such as preserving scores in `ScoredSearchResult` during query narrowing.
Pull request overview
Introduces a unified sub-query mechanism (GigaMap.SubQuery / EntityIdMatcher) to allow intersecting bitmap-backed GigaQuery results with external scored results (Lucene / JVector), and adds a shared ScoredSearchResult<E> abstraction to support score-preserving narrowing from the scored side.
Changes:
- Add `ScoredSearchResult<E>` as a common, composable scored-result type; make Lucene/JVector results implement it.
- Add `GigaMap.SubQuery` + `EntityIdMatcher` and wire `GigaQuery.and(SubQuery)` composition through bitmap execution/iteration.
- Refactor bitmap iteration/resolution internals into new top-level types and update docs/tests for sub-query composition and indexed iteration.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| gigamap/lucene/src/test/java/org/eclipse/store/gigamap/lucene/LuceneSearchSubQueryTest.java | Adds tests for Lucene result composition as a SubQuery and score-preserving narrowing. |
| gigamap/lucene/src/main/java/org/eclipse/store/gigamap/lucene/LuceneSearchResult.java | Introduces Lucene scored-result interface extending ScoredSearchResult. |
| gigamap/lucene/src/main/java/org/eclipse/store/gigamap/lucene/LuceneIndex.java | Adds search(...) APIs returning LuceneSearchResult and builds scored entries. |
| gigamap/jvector/src/test/java/org/eclipse/store/gigamap/jvector/VectorSearchSubQueryTest.java | Adds tests for vector result composition as a SubQuery and score-preserving narrowing. |
| gigamap/jvector/src/main/java/org/eclipse/store/gigamap/jvector/VectorSearchResult.java | Refactors vector result type to extend ScoredSearchResult. |
| gigamap/jvector/src/main/java/org/eclipse/store/gigamap/jvector/VectorIndex.java | Updates conversion to produce ScoredSearchResult.Entry entries. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/ThreadedIterator.java | Renames reader-close call site (closeIterator → closeReader). |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/ScoredSearchResult.java | Adds shared scored result abstraction with and(SubQuery) narrowing and id-matcher materialization. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/ResultIdIterator.java | Renames reader-close call site (closeIterator → closeReader). |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/GigaQuery.java | Adds SubQuery support, removes Predicate, adds test, and wires id-matchers into execution/iteration. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/GigaMap.java | Adds SubQuery, introduces createEntityIdMatcher, threads matcher into bitmap execution, renames active iterator tracking to readers. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/GigaIterator.java | Removes old iterator implementation; updates wrapper to use EntityResolver and closeReader. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/GigaIteration.java | Removes obsolete iteration type in favor of new bitmap iteration types. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/EntityResolver.java | Extracts resolver abstraction from BitmapResult for id→entity resolution. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/EntityIdMatcher.java | Adds matcher abstraction for composing arbitrary id sources (including ordered gap-skipping). |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/ContainsBreaker.java | Updates contains breaker to implement EntityResolver. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/BitmapResult.java | Replaces nested Resolver with EntityResolver usage. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/BitmapIterator.java | Adds new bitmap iterator implementation driven by AbstractBitmapIterating and EntityIdMatcher. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/BitmapIteration.java | Adds one-shot bitmap traversal executor used by executeReadOnly. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/BitmapEntityIdMatcher.java | Exposes bitmap query results as an ordered EntityIdMatcher (for query-as-subquery). |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/AbstractCompositeBitmapIndex.java | Updates contains path to call new execute(...) signature with EntityIdMatcher. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/AbstractBitmapIterating.java | Refactors core bitmap traversal logic; adds matcher-aware gap skipping. |
| gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/AbstractBitmapIndexBinary.java | Updates contains path to call new execute(...) signature with EntityIdMatcher. |
| docs/modules/gigamap/pages/queries/executing.adoc | Documents indexed iteration (iterateIndexed / nextIndexed). |
| docs/modules/gigamap/pages/queries/defining.adoc | Documents SubQuery composition patterns and fixed-id-set matching. |
| docs/modules/gigamap/pages/indexing/lucene/use-cases.adoc | Updates Lucene hybrid examples to use composable LuceneSearchResult instead of manual filtering. |
| docs/modules/gigamap/pages/indexing/lucene/index.adoc | Documents Lucene composable search(...) results and narrowing from scored side. |
| docs/modules/gigamap/pages/indexing/jvector/use-cases.adoc | Updates entry type references to ScoredSearchResult.Entry. |
| docs/modules/gigamap/pages/indexing/jvector/index.adoc | Updates vector examples to iterate ScoredSearchResult.Entry. |
| docs/modules/gigamap/pages/indexing/jvector/advanced.adoc | Updates hybrid search examples to use SubQuery intersection instead of manual id filtering. |
- Short-circuit single sub-query cases to avoid unnecessary wrapping.
- Refactor `buildEntityIdMatcher` for improved clarity and efficiency.
- Update `iterator` method to include new id range parameters (`idStart`, `idBound`).
…ty IDs
- Add `idStart` and `idBound` parameters to iterator creation for range-limited processing.
- Introduce `materializeEntityIds` for thread-safe, stateless entity ID matching.
- Adjust multi-threaded execution logic to conditionally enable threading based on `idMatcher` usage.
- Refactor sub-query handling for optimized partitioning and thread safety.
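One way to picture why materialized ids help with multi-threaded execution is the sketch below. It is an illustration under assumptions, not the code from this PR: `RangePartitionSketch`, `matches` and `matchRange` are hypothetical names. The sub-query ids are held in an immutable sorted array and tested by binary search, so no cursor state exists that threads could step on, and each worker only handles ids in its own `[idStart, idBound)` range.

```java
import java.util.Arrays;

// Hypothetical illustration; NOT the actual materializeEntityIds implementation.
final class RangePartitionSketch {
    /** Stateless membership test: binary search over an immutable sorted array. */
    static boolean matches(long[] sortedSubQueryIds, long id) {
        return Arrays.binarySearch(sortedSubQueryIds, id) >= 0;
    }

    /**
     * Returns the candidate ids inside [idStart, idBound) that the sub-query also contains.
     * Nothing is mutated, so disjoint ranges can be processed by different threads.
     */
    static long[] matchRange(long[] candidates, long[] sortedSubQueryIds, long idStart, long idBound) {
        return Arrays.stream(candidates)
            .filter(id -> id >= idStart && id < idBound)   // honor the partition's id range
            .filter(id -> matches(sortedSubQueryIds, id))  // AND with the sub-query
            .toArray();
    }
}
```

Splitting the id space into disjoint ranges and concatenating the per-range results gives the same matches as a single pass, which is the property a partitioned executor relies on.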
- Introduce regression tests for sub-query composition including AND semantics, range constraints, and multi-threaded execution.
- Ensure proper lock handling, short-circuiting logic, and id range honoring during query execution.
- Validate compatibility with indexed iteration and multi-consumer execution scenarios.
Pull request overview
Copilot reviewed 31 out of 31 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
gigamap/gigamap/src/main/java/org/eclipse/store/gigamap/types/AbstractBitmapIterating.java:54
- This field comment is stale: `AbstractBitmapIterating` no longer has a resolver/parent reference, but the comment still mentions a resolver not referencing the parent. Please update or remove it to avoid confusion.
…roughout the codebase
- Updated all code references, including business logic, tests, and examples, to utilize the new `ScoredSearchResult.Entry` type.
- Improved consistency and readability by ensuring the correct type is used for scored search results across modules.
- Updated imports and ensured all tests pass with the new implementation.
- Corrected `disfunctional` to `dysfunctional` and `seperately` to `separately` to improve code comment clarity.
Pull request overview
Copilot reviewed 41 out of 41 changed files in this pull request and generated 3 comments.
- Allow `null` conditions to match all entities in iterator and matcher logic.
- Update `test` method to handle cases where condition is `null`.
Summary
- Add `GigaMap.SubQuery` / `EntityIdMatcher` so bitmap, Lucene full-text and JVector similarity queries can be combined in a single `GigaQuery` via `query.and(subQuery)`. Ordered matchers report the next candidate id, letting the bitmap executor gap-skip through AND compositions.
- Add a common `ScoredSearchResult<E>` supertype in the core types module. `VectorSearchResult` and the new `LuceneSearchResult` extend it, acting as sub-queries on the right-hand side of a `GigaQuery` and offering their own `ScoredSearchResult.and(SubQuery)` for score-preserving narrowing from the scored side.
- Extract `EntityResolver`, `BitmapIterator`, `BitmapIteration`, `BitmapEntityIdMatcher` and `AbstractBitmapIterating` into standalone top-level types; remove the obsolete `GigaIteration` / `AbstractGigaIterating`. Add `GigaIterator.nextIndexed` / `GigaQuery.iterateIndexed` for `(id, entity)` iteration.
- Update `docs/modules/gigamap/` to cover sub-query composition, indexed iteration, and the refreshed JVector / Lucene examples.

Usage