Feature/rmm btree#50
Conversation
Malkovsky
commented
May 24, 2026
- Implemented a RmM in a B-Tree style
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d00f4662b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| target_in_node <= std::numeric_limits<std::int16_t>::max()) { | ||
| alignas(32) std::int16_t prefix_before[16]{}; | ||
| fill_prefix_before(node, slot, prefix_before); | ||
| return rmm_btree_match_mask_i16x16( |
There was a problem hiding this comment.
Gate undefined AVX2 match helpers before using them
When PIXIE_AVX2_SUPPORT is enabled, this branch calls rmm_btree_match_mask_i16x16/rmm_btree_match_mask_i64x4, but those symbols are not defined anywhere in the tree. As soon as fwdsearch/bwdsearch are instantiated under AVX2 (which happens with common -march=native builds), compilation fails, so AVX2-capable environments cannot build code paths that use RmMBTree search.
Useful? React with 👍 / 👎.
| if (target < 0 || target > max_excess_) { | ||
| return npos; |
There was a problem hiding this comment.
Remove nonnegative-target assumption in SDSL searches
The target < 0 guard rejects valid excess targets for general bitstrings, so forward/backward excess search can incorrectly return npos on inputs with negative prefix excess (for example, a leading 0 where fwdsearch(0, -1) should match at position 0). This silently breaks correctness for non-balanced inputs and affects benchmark comparisons that use arbitrary random bit patterns.
Useful? React with 👍 / 👎.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #50 +/- ##
==========================================
+ Coverage 87.19% 89.58% +2.38%
==========================================
Files 14 15 +1
Lines 3257 4416 +1159
Branches 607 814 +207
==========================================
+ Hits 2840 3956 +1116
+ Misses 285 280 -5
- Partials 132 180 +48
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cf70fb8db6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| } | ||
|
|
||
| public: | ||
| BitVector() = default; |
There was a problem hiding this comment.
Initialize rank state in the new default constructor
BitVector() is now defaulted, but max_rank_ has no default member initializer. Calling rank, rank0, select, or select0 on a default-constructed instance reads this indeterminate value (for example, rank(0) immediately returns max_rank_ when num_bits_ == 0), which is undefined behavior and can produce incorrect query results for any code path that constructs an empty BitVector before assigning real data.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 91bce3e5d6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| int current_excess = 0; | ||
| for (std::size_t i = 0; i < size_; ++i) { | ||
| bits_[i] = (words[i >> 6] >> (i & 63)) & 1u; | ||
| const bool bit = (words[i >> 6] >> (i & 63)) & 1u; | ||
| bits_[i] = bit; | ||
| current_excess += bit ? 1 : -1; |
There was a problem hiding this comment.
Use 64-bit excess counters for prefix tracking
current_excess and the prefix bound arrays are stored as int, so for inputs with more than ~2.1B net opening/closing steps the accumulation in this loop overflows signed 32-bit and invokes undefined behavior. Once overflow happens, prefix_excess_/bounds become corrupted and the new fwdsearch/bwdsearch pruning can return incorrect npos/positions. This is especially problematic because the library is intended for very large bit sequences; use 64-bit signed types for excess accumulation/storage.
Useful? React with 👍 / 👎.