Closed
Description
I have been digging into the AcceptDocs API and noticed the following in the Javadoc:
/**
 * Return an approximation of the number of accepted documents. This is typically useful to decide
 * whether to consume these accept docs using random access ({@link #bits()}) or sequential access
 * ({@link #iterator()}).
 *
 * <p><b>NOTE</b>: This must not be called after {@link #iterator()}.
 *
 * @return approximate cost
 */
public abstract int cost() throws IOException;
However, the implementation for the most common non-cached iterator is:
public int cost() throws IOException {
  createBitSetAcceptDocsIfNecessary();
  return acceptBitSet.cardinality();
}
This actually fully consumes the iterator and just calls cardinality(), so nothing about it is approximate.
Why are we doing that? Why aren't we relying on DocIdSetIterator#cost or at least acceptBitSet.cardinality?
It seems to me the main idea behind AcceptDocs is the ability to avoid materializing the bitset and just iterate as normal when the filter is very restrictive.
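To make the concern concrete, here is a minimal sketch of what a non-consuming cost() could look like. It is not Lucene's actual code: `LazyAcceptDocs`, its fields, and the `iteratorCost` estimate are hypothetical stand-ins, and `java.util.BitSet` replaces Lucene's bit set classes so the example runs on its own. The point is only that cost() can return a cheap estimate (analogous to DocIdSetIterator#cost) until something actually asks for bits().

```java
import java.util.BitSet;

// Hypothetical stand-in for the accept-docs abstraction discussed above; not Lucene's class.
final class LazyAcceptDocs {
  private final int[] sortedDocs;  // doc ids matched by the filter, in increasing order
  private final long iteratorCost; // assumed cheap estimate, analogous to DocIdSetIterator#cost
  private final int maxDoc;
  private BitSet bits;             // materialized only when random access is requested
  int materializations = 0;        // exposed so the demo can observe how much work was done

  LazyAcceptDocs(int[] sortedDocs, long iteratorCost, int maxDoc) {
    this.sortedDocs = sortedDocs;
    this.iteratorCost = iteratorCost;
    this.maxDoc = maxDoc;
  }

  /** Approximate cost: returns the estimate unless the bit set already exists. */
  long cost() {
    return bits != null ? bits.cardinality() : iteratorCost;
  }

  /** Random access view: this is the only place the doc ids are consumed. */
  BitSet bits() {
    if (bits == null) {
      bits = new BitSet(maxDoc);
      for (int doc : sortedDocs) {
        bits.set(doc);
      }
      materializations++;
    }
    return bits;
  }

  public static void main(String[] args) {
    LazyAcceptDocs docs = new LazyAcceptDocs(new int[] {3, 17, 42}, 5, 1_000);
    System.out.println(docs.cost());             // 5: the estimate, nothing consumed
    System.out.println(docs.materializations);   // 0
    docs.bits();                                 // caller chose random access
    System.out.println(docs.cost());             // 3: exact, since the bits exist anyway
  }
}
```

With this shape, a caller that sees a small cost can go straight to sequential iteration without ever paying for the bit set, which is the behavior the Javadoc seems to promise.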
//cc @shubhamvishu
Version and environment details
No response