db: don't SeekPrefixGE level iterator past prefix#5786
db: don't SeekPrefixGE level iterator past prefix#5786RaduBerinde wants to merge 4 commits intocockroachdb:masterfrom
Conversation
We were passing the full key as the prefix in some tests.
Passing a bool value is very opaque, use a more explicit string.
788fd7b to
e164685
Compare
Potential Bug(s) DetectedThe three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation. Next Steps: Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary. After you review the findings, please tag the issue as follows:
|
e164685 to
b46df2f
Compare
Add a test that illustrates the behavior of the iterator when a level has a tombstone that extends past the prefix.
Currently the merging iterator can call `SeekPrefixGE` with a key that
does not have the given prefix. Allowing this is counter-intuitive and
adds subtlety to the lower level iterators, making work on optimizing
the `SeekPrefixGE` path more difficult. This is also against the
contract documented in InternalIterator.SeekPrefixGE ("the supplied
prefix will be the prefix of the given key returned by that Split
function").
We change the code to avoid this case. When tombstones covers all keys
with the given prefix on a level, `SeekPrefixGE` no longer seeks that
level (and those below). In order to keep the `TrySeekUsingNext`
optimization working, we remember which levels have not been
positioned using a flag and we position those without
`TrySeekUsingNext` when necessary. We are either doing the same seek
at a later time, or we are doing a single seek to a later key.
b46df2f to
c2e7983
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
@sumeerbhola partially reviewed 11 files and made 4 comments.
Reviewable status: 9 of 15 files reviewed, 4 unresolved discussions (waiting on RaduBerinde).
- This sounds like a revisit of #3845. Is that correct? I'll need to swap in that state and then look at this code, so will need some time.
- Is there a pressing bug that motivated this, so I can figure out reviewing prioritization?
- Is it possible to take the first 3 commits, merge them in a separate PR. And get to the bottom of the interleavingIter oddity first. I am more nervous about any new bugs that may have been recently introduced, and less about this seek past the prefix behavior.
sstable/testdata/block_properties line 579 at r2 (raw file):
seek-ge c@19 try-seek-using-next ---- strconv.ParseBool: parsing "try-seek-using-next": invalid syntax
invalid syntax?
sstable/testdata/prefixreader/iter line 266 at r2 (raw file):
seek-prefix-ge c ---- strconv.ParseBool: parsing "try-seek-using-next": invalid syntax
ditto
testdata/treesteps/merging_iter line 124 at r3 (raw file):
# TODO(radu): the InterleavingIter returns `b-b\x00:{(#4,RANGEDEL)}` instead of # returning a tombstone up to d; this seems suspicious.
Could this be a bug introduced by the recent changes in this area?
RaduBerinde
left a comment
There was a problem hiding this comment.
@RaduBerinde made 2 comments.
Reviewable status: 9 of 15 files reviewed, 4 unresolved discussions (waiting on sumeerbhola).
Previously, sumeerbhola wrote…
- This sounds like a revisit of #3845. Is that correct? I'll need to swap in that state and then look at this code, so will need some time.
- Is there a pressing bug that motivated this, so I can figure out reviewing prioritization?
- Is it possible to take the first 3 commits, merge them in a separate PR. And get to the bottom of the interleavingIter oddity first. I am more nervous about any new bugs that may have been recently introduced, and less about this seek past the prefix behavior.
Correct. No pressing bug, just something I've been wanting to get back to. Yes, I will separate the other commits then investigate the InterleavingIter.
testdata/treesteps/merging_iter line 124 at r3 (raw file):
Previously, sumeerbhola wrote…
Could this be a bug introduced by the recent changes in this area?
I think so.
RaduBerinde
left a comment
There was a problem hiding this comment.
@RaduBerinde made 1 comment.
Reviewable status: 9 of 15 files reviewed, 4 unresolved discussions (waiting on sumeerbhola).
testdata/treesteps/merging_iter line 124 at r3 (raw file):
Previously, RaduBerinde wrote…
I think so.
Hm, this block of code has been here for a long time:
I think this simply doesn't work the way we thought it's working - we thought that once a mid-level sees a long tombstone, the lower levels would be seeked beyond that tombstone and they won't be touched again for many seeks-using-next. But it looks like we only seek right past the prefix.
|
Previously, RaduBerinde wrote…
It is required by the |
sumeerbhola
left a comment
There was a problem hiding this comment.
@sumeerbhola made 1 comment.
Reviewable status: 9 of 15 files reviewed, 4 unresolved discussions (waiting on RaduBerinde).
testdata/treesteps/merging_iter line 124 at r3 (raw file):
Previously, RaduBerinde wrote…
It is required by the
Iterator.SeekGEPrefixcontract, although it could probably be truncated by the Iterator:// When iterating with range keys enabled, all range keys encountered are // truncated to the seek key's prefix's bounds. The truncation of the upper // bound requires that the database's Comparer is configured with a // ImmediateSuccessor method. For example, a SeekPrefixGE("a@9") call with the // prefix "a" will truncate range key bounds to [a,ImmediateSuccessor(a)]. func (i *Iterator) SeekPrefixGE(key []byte) bool {
Prior to Jackson's recent changes didn't we have a separate rangedel iterator to surface the actual rangedels, and the interleaving iter was just being used to surface all the boundaries, to prevent switching to a new sstable too early? And the rangedel iterator would surface the full rangedel?
I am basing the above on my memory of a code walkthrough meeting 2 weeks ago -- I haven't looked at Jackson's recent changes.
RaduBerinde
left a comment
There was a problem hiding this comment.
@RaduBerinde made 1 comment.
Reviewable status: 9 of 15 files reviewed, 4 unresolved discussions (waiting on sumeerbhola).
Previously, RaduBerinde wrote…
Correct. No pressing bug, just something I've been wanting to get back to. Yes, I will separate the other commits then investigate the InterleavingIter.
I will close this for now.
itertest: fix SeekPrefixGE call in tests
We were passing the full key as the prefix in some tests.
itertest: use "try-seek-using-next" instead of "true"
Passing a bool value is very opaque, use a more explicit string.
db: add merging iter treesteps test with tombstone
Add a test that illustrates the behavior of the iterator when a level
has a tombstone that extends past the prefix.
db: don't SeekPrefixGE level iterator past prefix
Currently the merging iterator can call
SeekPrefixGEwith a key thatdoes not have the given prefix. Allowing this is counter-intuitive and
adds subtlety to the lower level iterators, making work on optimizing
the
SeekPrefixGEpath more difficult. This is also against thecontract documented in InternalIterator.SeekPrefixGE ("the supplied
prefix will be the prefix of the given key returned by that Split
function").
We change the code to avoid this case. When tombstones covers all keys
with the given prefix on a level,
SeekPrefixGEno longer seeks thatlevel (and those below). In order to keep the
TrySeekUsingNextoptimization working, we remember which levels have not been
positioned using a flag and we position those without
TrySeekUsingNextwhen necessary. We are either doing the same seekat a later time, or we are doing a single seek to a later key.