fix: store coeffBits in standardBander to preserve ribbon width precision#40
fix: store coeffBits in standardBander to preserve ribbon width precision#40RajjjAryan wants to merge 3 commits intoRibbonFilter:mainfrom
Conversation
…sion backSubstitute() was inferring the ribbon width by checking whether coeffHi is nil — this correctly distinguishes w=128 from w≤64, but cannot distinguish w=32 from w=64. Both have coeffHi == nil, so w=32 was silently treated as w=64. This caused the solution struct to carry an incorrect coeffBits value (64 instead of 32 for w=32 filters) and allocate 32 extra padding bytes per filter. Fix: store coeffBits explicitly in standardBander (set at construction time) and read it directly in backSubstitute instead of inferring it. Fixes RibbonFilter#38
|
@RajjjAryan Also, the underlying data is always of type |
|
@DhruvilK7 Thanks for the review! Addressing both points: 1. Validation on Added a defensive validation check in 2. Handling Good observation. Currently when Adding a specialized |
Validates that sb.coeffBits is 32, 64, or 128 at the point of use in backSubstitute(), complementing the existing validation in the newStandardBander constructor. Addresses review feedback from @DhruvilK7 on PR RibbonFilter#40.
Add dedicated width-32 code paths that use []uint32 coefficient storage and uint32 operations throughout: - bander: add coeff32 []uint32 field, addW32(), addRangeW32() methods with TrailingZeros32 and uint32 XOR/shift in the inner loops. 16 coefficients per cache line (vs 8 with uint64). - solver: add backSubst32() using uint32 state registers and direct coeff32 array access. - Update dispatchers (add, addRange, backSubstitute), reset, getSlot, and slowadd to handle the new coefficient storage. - Update TestNewStandardBander to verify coeff32 allocation for w=32. Addresses review feedback from @DhruvilK7 on PR RibbonFilter#40.
|
@DhruvilK7 Implemented the uint32 specialization for bander.go:
solver.go:
bander_test.go:
All tests and benchmarks pass. Benchmark results on Apple M4 Pro show w=32 Add is ~4% faster than w=64 (3.145 ns/op vs 3.281 ns/op) with zero allocations. |
Summary
backSubstitute()was inferring the ribbon width by checking whethercoeffHiis nil — this correctly distinguishes w=128 from w≤64, but cannot distinguish w=32 from w=64. Both havecoeffHi == nil, so w=32 was silently treated as w=64.This caused the
solutionstruct to carry an incorrectcoeffBitsvalue (64 instead of 32 for w=32 filters) and allocate 32 extra padding bytes per filter.Changes
bander.go: AddcoeffBits uint32field tostandardBander, set it innewStandardBander().solver.go: Readsb.coeffBitsdirectly inbackSubstitute()instead of inferring it fromcoeffHi == nil.Why
The
standardBanderalready receivescoeffBitsas a constructor argument but was discarding it. Storing it explicitly is the minimal fix — no API changes, no behavioral changes for w=64/128, and correctcoeffBitspropagation for w=32.All existing tests pass.
Fixes #38