Skip to content

fix: store coeffBits in standardBander to preserve ribbon width precision#40

Open
RajjjAryan wants to merge 3 commits intoRibbonFilter:mainfrom
RajjjAryan:fix/bander-store-coeffbits
Open

fix: store coeffBits in standardBander to preserve ribbon width precision#40
RajjjAryan wants to merge 3 commits intoRibbonFilter:mainfrom
RajjjAryan:fix/bander-store-coeffbits

Conversation

@RajjjAryan
Copy link
Copy Markdown
Contributor

Summary

backSubstitute() was inferring the ribbon width by checking whether coeffHi is nil — this correctly distinguishes w=128 from w≤64, but cannot distinguish w=32 from w=64. Both have coeffHi == nil, so w=32 was silently treated as w=64.

This caused the solution struct to carry an incorrect coeffBits value (64 instead of 32 for w=32 filters) and allocate 32 extra padding bytes per filter.

Changes

  • bander.go: Add coeffBits uint32 field to standardBander, set it in newStandardBander().
  • solver.go: Read sb.coeffBits directly in backSubstitute() instead of inferring it from coeffHi == nil.

Why

The standardBander already receives coeffBits as a constructor argument but was discarding it. Storing it explicitly is the minimal fix — no API changes, no behavioral changes for w=64/128, and correct coeffBits propagation for w=32.

All existing tests pass.

Fixes #38

…sion

backSubstitute() was inferring the ribbon width by checking whether
coeffHi is nil — this correctly distinguishes w=128 from w≤64, but
cannot distinguish w=32 from w=64. Both have coeffHi == nil, so w=32
was silently treated as w=64.

This caused the solution struct to carry an incorrect coeffBits value
(64 instead of 32 for w=32 filters) and allocate 32 extra padding
bytes per filter.

Fix: store coeffBits explicitly in standardBander (set at construction
time) and read it directly in backSubstitute instead of inferring it.

Fixes RibbonFilter#38
@DhruvilK7
Copy link
Copy Markdown
Contributor

DhruvilK7 commented Mar 20, 2026

@RajjjAryan
We should consider adding validation on coeffBits

Also, the underlying data is always of type int64
Should we add handling of int32 as well for every underlying computations?

@RajjjAryan
Copy link
Copy Markdown
Contributor Author

@DhruvilK7 Thanks for the review! Addressing both points:

1. Validation on coeffBits:

Added a defensive validation check in backSubstitute() that panics if sb.coeffBits is not 32, 64, or 128. The constructor newStandardBander already validates at creation time, but this adds a safety net at the consumption site too. Pushed in the latest commit.

2. Handling int32 for coeffBits=32:

Good observation. Currently when coeffBits=32, the code uses []uint64 for coeffLo and dispatches to the addW64/backSubst64 paths — the uint64 naturally accommodates 32-bit coefficients since the hasher's derive() masks them to the correct width via coeffMask. This is correct but does use 2× the memory needed for the coefficient array.

Adding a specialized uint32 path (addW32, backSubst32, etc.) would halve the coefficient array memory for w=32 but would require a third width-specialization across bander, solver, and filter — a non-trivial refactoring. I think that's best handled as a separate optimization PR rather than bundling it into this bug fix. Happy to open an issue to track it if you agree.

Validates that sb.coeffBits is 32, 64, or 128 at the point of use
in backSubstitute(), complementing the existing validation in the
newStandardBander constructor.

Addresses review feedback from @DhruvilK7 on PR RibbonFilter#40.
Add dedicated width-32 code paths that use []uint32 coefficient storage
and uint32 operations throughout:

- bander: add coeff32 []uint32 field, addW32(), addRangeW32() methods
  with TrailingZeros32 and uint32 XOR/shift in the inner loops.
  16 coefficients per cache line (vs 8 with uint64).
- solver: add backSubst32() using uint32 state registers and direct
  coeff32 array access.
- Update dispatchers (add, addRange, backSubstitute), reset, getSlot,
  and slowadd to handle the new coefficient storage.
- Update TestNewStandardBander to verify coeff32 allocation for w=32.

Addresses review feedback from @DhruvilK7 on PR RibbonFilter#40.
@RajjjAryan
Copy link
Copy Markdown
Contributor Author

@DhruvilK7 Implemented the uint32 specialization for coeffBits=32 in the latest commit (85a62e4). Here's a summary of the changes:

bander.go:

  • Added coeff32 []uint32 field to standardBander — allocated only for w=32, halving coefficient memory (4 bytes/slot vs 8)
  • Added addW32(): uses bits.TrailingZeros32 and uint32 XOR/shift, fitting 16 coefficients per cache line (vs 8 with uint64)
  • Added addRangeW32(): batched insertion with prefetch, same as addRangeW128 pattern (includes the dummy load for prefetching)
  • Updated dispatchers (add, addRange), reset, getSlot, slowadd to handle the new coefficient storage

solver.go:

  • Added backSubst32(): uses [8]uint32 state registers and reads coeff32 directly (no getSlot overhead)
  • Updated dispatch to a switch coeffBits with cases for 32, 128, and default (64)

bander_test.go:

  • Updated TestNewStandardBander to verify coeff32 allocation for w=32

All tests and benchmarks pass. Benchmark results on Apple M4 Pro show w=32 Add is ~4% faster than w=64 (3.145 ns/op vs 3.281 ns/op) with zero allocations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

backSubstitute cannot distinguish w=32 from w=64 — loses coeffBits precision

2 participants