-
Notifications
You must be signed in to change notification settings - Fork 860
Description
Search before asking
- I had searched in the existing issues and found no similar issue.
Version
Current main workflow configuration, observed from CI run 23394493823.
What's Wrong?
The stage job in .github/workflows/reuse.sqllogic.yml defines matrix.size: [small, large], but the workflow does not pass matrix.size into .github/actions/test_sqllogic_stage.
The called action has no size input and does not export TEST_STAGE_SIZE. tests/sqllogictests/scripts/prepare_stage.sh branches on TEST_STAGE_SIZE, and when that variable is unset it falls back to the small behavior. In practice, the large half of the matrix is very likely rerunning the same effective configuration as small.
This creates two problems:
- duplicated CI fanout for
stage - missing intended
largecoverage if the split was supposed to be meaningful
How to Reproduce?
- Inspect
.github/workflows/reuse.sqllogic.ymland confirm thatstagedefinesmatrix.sizebut only passesstorage,dirs,handlers, anddedupto.github/actions/test_sqllogic_stage. - Inspect
.github/actions/test_sqllogic_stage/action.ymland confirm that it does not accept asizeinput and never exportsTEST_STAGE_SIZE. - Inspect
tests/sqllogictests/scripts/prepare_stage.shand confirm that unsetTEST_STAGE_SIZEtakes thesmallpath, while onlyTEST_STAGE_SIZE=largeenables the alternate branch. - Compare
smallandlargestagejob durations from sampled CI run23394493823; the sampled run shows near-identical pairs, which is consistent with duplicated execution.
Evidence
.github/workflows/reuse.sqllogic.ymldefinessizein thestagematrix, but theuses: ./.github/actions/test_sqllogic_stagecall only passesstorage,dirs,handlers, anddedup.- In the same workflow,
matrix.sizeis only used in the failure artifact name, not in runtime inputs. .github/actions/test_sqllogic_stage/action.ymldefines onlydirs,handlers,storage, anddedupas inputs..github/actions/test_sqllogic_stage/action.ymlexportsTEST_HANDLERS,TEST_STAGE_STORAGE, andTEST_STAGE_DEDUP, but never exportsTEST_STAGE_SIZE.tests/sqllogictests/scripts/prepare_stage.shcontains the onlyTEST_STAGE_SIZEbranch. When the variable is unset or set tosmall, it runs withparquet_fast_read_bytes = 1048576; onlyTEST_STAGE_SIZE=largeswitches toparquet_fast_read_bytes = 0.tests/sqllogictests/src/util.rscallsprepare_stage.shwithout injecting any extra environment forTEST_STAGE_SIZE.- In sampled run
23394493823, representativesmall/largepairs are almost identical in duration:fs,http,full_path,large:4.82 minfs,http,full_path,small:4.77 mins3,http,full_path,large:4.87 mins3,http,full_path,small:4.80 mins3,hybrid,full_path,large:4.45 mins3,hybrid,full_path,small:4.43 min
Expected Behavior
One of these should be true:
matrix.sizeis intentionally meaningful and is explicitly wired into runtime asTEST_STAGE_SIZE, sosmallandlargerun different configurations.stageis intentionally single-mode, and the unusedsizematrix dimension is removed.
Suggested Fix
Pick one of these:
- If both modes are intended, add a
sizeinput to.github/actions/test_sqllogic_stageand exportTEST_STAGE_SIZEthere. - If single-mode behavior is intended, remove the
sizematrix dimension and reduce thestagejob fanout.
Are you willing to submit PR?
- Yes I am willing to submit a PR!