Commit 79faae8
fix: deflake //rs/state_layout:state_layout_test (#9203)
## Root Cause
The `checkpoints_files_are_removed_after_flushing_removal_channel` test
creates 20 checkpoints with 500 dummy files each (10,000 files total),
then removes 19 of them (9,500 file deletions) through the async removal
channel. On busy CI machines with limited I/O bandwidth, this excessive
file I/O causes the entire test binary to exceed its timeout.

The 500 files per checkpoint were intended to create a backlog in the
checkpoint removal channel, so that the test can make some assertions
while the backlog is still clearing (namely that the checkpoint is no
longer in the list of verified checkpoints) and other assertions after
it has cleared (namely that the files are deleted from disk).
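
The two assertion phases described above can be sketched roughly as follows. This is a minimal, in-memory illustration and not the actual `state_layout` code: `Layout`, `Request`, `remove_checkpoint`, `flush`, and `is_on_disk` are hypothetical names, and a `HashSet` stands in for the files on disk. It assumes the removal channel is a FIFO queue drained by a single background thread.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical request type for the removal channel (not the real API).
enum Request {
    Remove(u64),
    // Flush carries a sender used to acknowledge that all earlier
    // requests have been processed.
    Flush(Sender<()>),
}

struct Layout {
    verified: Vec<u64>,                // in-memory list of verified checkpoints
    on_disk: Arc<Mutex<HashSet<u64>>>, // stand-in for checkpoint files on disk
    tx: Sender<Request>,
}

impl Layout {
    fn new(heights: &[u64]) -> Self {
        let on_disk = Arc::new(Mutex::new(heights.iter().copied().collect::<HashSet<u64>>()));
        let (tx, rx) = channel();
        let disk = Arc::clone(&on_disk);
        // Background worker: drains the channel in FIFO order until the
        // sender is dropped.
        thread::spawn(move || {
            for req in rx {
                match req {
                    Request::Remove(h) => {
                        disk.lock().unwrap().remove(&h);
                    }
                    Request::Flush(ack) => {
                        let _ = ack.send(());
                    }
                }
            }
        });
        Layout { verified: heights.to_vec(), on_disk, tx }
    }

    // Phase 1 state changes synchronously; the deletion itself is queued.
    fn remove_checkpoint(&mut self, h: u64) {
        self.verified.retain(|&x| x != h);
        self.tx.send(Request::Remove(h)).unwrap();
    }

    // Blocks until every request sent before this call has been handled.
    fn flush(&self) {
        let (ack_tx, ack_rx) = channel();
        self.tx.send(Request::Flush(ack_tx)).unwrap();
        ack_rx.recv().unwrap();
    }

    fn is_on_disk(&self, h: u64) -> bool {
        self.on_disk.lock().unwrap().contains(&h)
    }
}

fn example() {
    let mut layout = Layout::new(&[1, 2, 3]);
    layout.remove_checkpoint(1);
    layout.remove_checkpoint(2);
    // Phase 1: holds immediately, whether or not the backlog has cleared.
    assert_eq!(layout.verified, vec![3]);
    // Phase 2: only guaranteed after the channel has been flushed.
    layout.flush();
    assert!(!layout.is_on_disk(1) && !layout.is_on_disk(2));
    assert!(layout.is_on_disk(3));
}
```

In this sketch neither assertion depends on how large the backlog is. The real test additionally wants the backlog to still be non-empty during the first phase, which is what the dummy files are for.
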
## Fix
Reduce the dummy file count from 500 to 50 per checkpoint. This drops
total file I/O from ~10,000 to ~1,000 files. It strikes a balance between
keeping the backlog large enough that it has not cleared before the
first assertions run and limiting the overall I/O load. If this
reduction is not enough, we can consider adding artificial blockers to
the backlog, but that does not seem necessary for this test at the
moment.
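
The file counts quoted above follow from simple arithmetic, sketched below. `file_io_load` is a hypothetical helper written for this illustration; it assumes 20 checkpoints of which 1 is kept, as described in the root cause.

```rust
/// Returns (files_created, files_deleted) for a test that creates
/// `checkpoints` checkpoints with `files_per_checkpoint` dummy files each
/// and then removes all but `kept` of them.
fn file_io_load(checkpoints: u64, files_per_checkpoint: u64, kept: u64) -> (u64, u64) {
    let created = checkpoints * files_per_checkpoint;
    // saturating_sub avoids underflow if `kept` exceeds `checkpoints`.
    let deleted = checkpoints.saturating_sub(kept) * files_per_checkpoint;
    (created, deleted)
}

fn print_comparison() {
    // Before the fix: 500 files per checkpoint.
    println!("before: {:?}", file_io_load(20, 500, 1)); // (10000, 9500)
    // After the fix: 50 files per checkpoint.
    println!("after:  {:?}", file_io_load(20, 50, 1)); // (1000, 950)
}
```
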
---
This PR was created following the steps in
`.claude/skills/fix-flaky-tests/SKILL.md`.
---------
Co-authored-by: Stefan Schneider <31004026+schneiderstefan@users.noreply.github.com>

1 parent: 3259804
1 file changed, 2 additions and 3 deletions: original lines 449-451 are replaced by new lines 449-450.