bug: stage matrix size dimension is not wired into runtime #19598

@forsaken628

Description

Search before asking

  • I have searched the existing issues and found no similar issue.

Version

Current main workflow configuration, observed from CI run 23394493823.

What's Wrong?

The stage job in .github/workflows/reuse.sqllogic.yml defines matrix.size: [small, large], but the workflow does not pass matrix.size into .github/actions/test_sqllogic_stage.

The called action has no size input and does not export TEST_STAGE_SIZE. tests/sqllogictests/scripts/prepare_stage.sh branches on TEST_STAGE_SIZE, and when that variable is unset it falls back to the small behavior. In practice, the large half of the matrix is very likely rerunning the same effective configuration as small.
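The fallback can be illustrated with a minimal sketch. This is not the contents of prepare_stage.sh; the function name `stage_fast_read_bytes` is hypothetical, and only the two `parquet_fast_read_bytes` values and the "unset means small" behavior are taken from this issue.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the branch in prepare_stage.sh (not the real script).
# Prints the parquet_fast_read_bytes value this issue reports for each mode.
stage_fast_read_bytes() {
  # An unset or empty TEST_STAGE_SIZE is treated the same as "small".
  case "${TEST_STAGE_SIZE:-small}" in
    large) echo 0 ;;
    *)     echo 1048576 ;;
  esac
}

# What both matrix halves see today: the variable is never exported.
unset TEST_STAGE_SIZE
echo "unset -> $(stage_fast_read_bytes)"

# What a correctly wired large job would see.
TEST_STAGE_SIZE=large
echo "large -> $(stage_fast_read_bytes)"
```

Under these assumptions, a job labeled large in the matrix still takes the first path, because nothing between the workflow and the script ever sets the variable.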

This creates two problems:

  • duplicated CI fanout for stage
  • missing intended large coverage if the split was supposed to be meaningful

How to Reproduce?

  1. Inspect .github/workflows/reuse.sqllogic.yml and confirm that stage defines matrix.size but only passes storage, dirs, handlers, and dedup to .github/actions/test_sqllogic_stage.
  2. Inspect .github/actions/test_sqllogic_stage/action.yml and confirm that it does not accept a size input and never exports TEST_STAGE_SIZE.
  3. Inspect tests/sqllogictests/scripts/prepare_stage.sh and confirm that unset TEST_STAGE_SIZE takes the small path, while only TEST_STAGE_SIZE=large enables the alternate branch.
  4. Compare small and large stage job durations from sampled CI run 23394493823; the sampled run shows near-identical pairs, which is consistent with duplicated execution.
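Steps 1 to 3 can be checked quickly from a local checkout. The helper below is a hypothetical convenience, not part of the repository; it only greps the three files named above for TEST_STAGE_SIZE.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the three files mention TEST_STAGE_SIZE.
# $1 is the repository root (defaults to the current directory).
find_stage_size_refs() {
  local root="${1:-.}"
  local f
  for f in \
    .github/workflows/reuse.sqllogic.yml \
    .github/actions/test_sqllogic_stage/action.yml \
    tests/sqllogictests/scripts/prepare_stage.sh
  do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $f"
    elif grep -q "TEST_STAGE_SIZE" "$root/$f"; then
      echo "references TEST_STAGE_SIZE: $f"
    else
      echo "no reference: $f"
    fi
  done
}

# Run from the repository root.
find_stage_size_refs .
```

If the reading in this issue is correct, only prepare_stage.sh should be reported as referencing the variable.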

Evidence

  • .github/workflows/reuse.sqllogic.yml defines size in the stage matrix, but the uses: ./.github/actions/test_sqllogic_stage call only passes storage, dirs, handlers, and dedup.
  • In the same workflow, matrix.size is only used in the failure artifact name, not in runtime inputs.
  • .github/actions/test_sqllogic_stage/action.yml defines only dirs, handlers, storage, and dedup as inputs.
  • .github/actions/test_sqllogic_stage/action.yml exports TEST_HANDLERS, TEST_STAGE_STORAGE, and TEST_STAGE_DEDUP, but never exports TEST_STAGE_SIZE.
  • tests/sqllogictests/scripts/prepare_stage.sh contains the only TEST_STAGE_SIZE branch. When the variable is unset or set to small, it runs with parquet_fast_read_bytes = 1048576; only TEST_STAGE_SIZE=large switches to parquet_fast_read_bytes = 0.
  • tests/sqllogictests/src/util.rs calls prepare_stage.sh without injecting any extra environment for TEST_STAGE_SIZE.
  • In sampled run 23394493823, representative small/large pairs are almost identical in duration:
    • fs,http,full_path,large: 4.82 min
    • fs,http,full_path,small: 4.77 min
    • s3,http,full_path,large: 4.87 min
    • s3,http,full_path,small: 4.80 min
    • s3,hybrid,full_path,large: 4.45 min
    • s3,hybrid,full_path,small: 4.43 min
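To put numbers on "almost identical", the sketch below computes the large-minus-small gap for the three sampled pairs. The durations are copied from the list above; the function name `duration_gaps` is only illustrative.

```shell
#!/usr/bin/env bash
# Durations in minutes from the sampled run: name, large, small.
pairs="fs,http,full_path 4.82 4.77
s3,http,full_path 4.87 4.80
s3,hybrid,full_path 4.45 4.43"

# Print the large-vs-small gap in seconds and as a percentage of the small run.
duration_gaps() {
  printf '%s\n' "$pairs" | awk '{
    gap = $2 - $3
    printf "%s: %.1f s (%.1f%%)\n", $1, gap * 60, gap / $3 * 100
  }'
}

duration_gaps
```

The gaps come out between roughly 1 and 4 seconds, or about 0.5% to 1.5% of each run. That is within what ordinary runner noise could explain, so the timings are consistent with duplicated execution but do not prove it on their own.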

Expected Behavior

One of these should be true:

  • matrix.size is intentionally meaningful and is explicitly wired into runtime as TEST_STAGE_SIZE, so small and large run different configurations.
  • stage is intentionally single-mode, and the unused size matrix dimension is removed.

Suggested Fix

Pick one of these:

  1. If both modes are intended, add a size input to .github/actions/test_sqllogic_stage and export TEST_STAGE_SIZE there.
  2. If single-mode behavior is intended, remove the size matrix dimension and reduce the stage job fanout.
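For option 1, the export step could look roughly like the sketch below. It assumes the action runs a bash step and hands the proposed `size` input to it in a variable named `STAGE_SIZE`; both that name and the function `export_stage_size` are hypothetical, and the action may export its existing variables by a different mechanism. Appending to the file named by `GITHUB_ENV` is the standard GitHub Actions way to set a variable for later steps in the same job.

```shell
#!/usr/bin/env bash
# Hypothetical export step for the action; not the project's actual code.
# $1 is the value of the proposed `size` input.
export_stage_size() {
  case "$1" in
    small|large) ;;
    *)
      echo "unsupported stage size: '$1'" >&2
      return 1
      ;;
  esac
  # GITHUB_ENV names the file GitHub Actions reads to set variables for later steps.
  echo "TEST_STAGE_SIZE=$1" >> "$GITHUB_ENV"
}

# Only run the export inside GitHub Actions, where GITHUB_ENV is set.
if [ -n "${GITHUB_ENV:-}" ]; then
  export_stage_size "${STAGE_SIZE:-}"
fi
```

Rejecting unknown values instead of silently defaulting to small is a deliberate choice here: a silent default is what hid the missing wiring in the first place. The workflow would also need to pass matrix.size to the new input for this to take effect.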

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Metadata

    Labels

    agent-issue: Agent-created issue marker
