[fix](fe) Fix nested column pruning OFFSET dedup crash on array<struct>#62631
[fix](fe) Fix nested column pruning OFFSET dedup crash on array<struct>#62631englefly wants to merge 4 commits intoapache:masterfrom
Conversation
### What problem does this PR solve? Issue Number: close #xxx Related PR: apache#62205 Problem Summary: PR apache#62205 introduced the `length(str)` OFFSET optimization for nested column pruning. When `length()` is applied to a string-like nested field, the pruner marks the access path with an OFFSET suffix so BE can read only the offset array instead of full element data. However, the OFFSET path dedup logic operates at **slot granularity**: if ANY non-OFFSET path exists for the same slot, ALL OFFSET paths are stripped. This is incorrect for `array<struct<...>>` columns where different struct fields have independent access patterns. **Bug 1 (crash):** Given a query like: ```sql SELECT array_match_all(x -> length(struct_element(x, 'str_field')) > 0, arr), struct_element(element_at(arr, 1), 'int_field') FROM t ``` The `int_field` non-OFFSET path causes the `str_field` OFFSET path to be stripped. BE never reads `str_field` data → crash or wrong results. **Fix:** Replace slot-level OFFSET dedup with per-field bidirectional prefix matching. An OFFSET path `P+["OFFSET"]` is only stripped when a non-OFFSET path Q shares a prefix relationship with P (covering the same container or ancestor), not when Q accesses a sibling struct field. **Bug 2 (wrong pruning):** When `isStringOffsetOnly=true` AND `accessPartialChild=true` (e.g., `cardinality(arr) + arr[*].f1`), `pruneDataType()` returns the full type instead of pruning unused fields. **Fix:** Add `!accessPartialChild` guard to the `isStringOffsetOnly` check. ### Release note Fix a BE crash caused by nested column pruning incorrectly stripping OFFSET access paths for array<struct> columns when different struct fields have independent access patterns (e.g., length() on one field + direct access on another). ### Check List (For Author) - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
One blocking issue remains in the new OFFSET dedup logic.
Critical checkpoint conclusions:
- Goal of the current task: partially achieved. The array sibling-field case from the PR description is covered, but map value accesses that are spelled as
*vsVALUESare still treated as unrelated, so OFFSET stripping is not correct end-to-end. - Scope/focus: the patch is small and localized.
- Concurrency: no concurrency-sensitive code is involved here.
- Lifecycle/static init: no special lifecycle concerns.
- Config/compatibility/persistence/FE-BE protocol: no new config, compatibility, or persistence concerns.
- Parallel code paths: not fully handled. Map value access has semantically equivalent encodings (
*andVALUES), and the new raw-prefix comparison only handles one spelling. - Special conditional checks: the new
minLenprefix test is too weak because it compares raw path tokens instead of normalized semantic coverage. - Tests: the added regression covers the array crash path, but I did not find coverage for the still-broken map overlap case, e.g.
length(map_col['a'])together withmap_values(map_col)[1]. - Test result changes: the new
.outfile looks consistent with the added SQL case. - Observability: no additional logging or metrics are needed for this optimizer fix.
- Transaction/persistence/data writes: not applicable.
- Performance: no material concern in this change.
- Other issues: none beyond the blocking bug below.
| } | ||
| int minLen = Math.min(nonOffset.size(), prefix.size()); | ||
| if (minLen == 0 | ||
| || nonOffset.subList(0, minLen).equals(prefix.subList(0, minLen))) { |
There was a problem hiding this comment.
This raw-prefix comparison misses overlapping map-value accesses that are spelled with different path tokens. length(map_col['a']) is collected as [map_col, *, OFFSET], but map_values(map_col)[1] becomes [map_col, VALUES]. Those two paths still target the same value container, yet * and VALUES are not prefix-equal here, so the OFFSET path survives.
That leaves both val_col.OFFSET and val_col on the BE side. MapFileColumnIterator::set_access_paths() forwards both to the value iterator, and StringFileColumnIterator::set_access_paths() switches to OFFSET_ONLY as soon as it sees the first one, so the full-value consumer reads zero-filled strings instead of the real payload.
A query like select length(map_col['a']), map_values(map_col)[1] from ... should still suppress the OFFSET path. The dedup needs to normalize the map aliases (* vs VALUES/KEYS) or compare semantic container coverage instead of raw string prefixes.
FE UT Coverage ReportIncrement line coverage |
…LUES tokens ### What problem does this PR solve? Issue Number: close #xxx Problem Summary: When `length(map_col['a'])` and `map_values(map_col)[1]` appear in the same query, the OFFSET-only path `[map_col, *, OFFSET]` was not suppressed by the full-value path `[map_col, VALUES]` because the prefix comparison used strict string equality. `*` (element-at access) and `VALUES` (map_values access) both target the same value container but were treated as different tokens, so the OFFSET path incorrectly survived deduplication. On BE, StringFileColumnIterator would switch to OFFSET_ONLY mode and the full-value consumer would read zero-filled strings instead of the real payload. The fix introduces `prefixSemanticEquals()` which treats `*` and `VALUES` as equivalent during the prefix comparison, since both cover the map value container. ### Release note Fix incorrect OFFSET path survival when combining element_at and map_values access on the same map column, which could cause zero-filled string reads on BE. ### Check List (For Author) - Test: Regression test - Behavior changed: No - Does this need documentation: No Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
run buildall |
There was a problem hiding this comment.
Findings
- Blocking:
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/NestedColumnPruning.java
The new*vsVALUESsemantic equivalence is not safe in both directions.[map_col, *]is not just value access; it is also the FE signal that BE must read map keys soelement_at(map_col, key)can locate the entry. If a query mixeslength(map_col['a'])withmap_values(map_col)[1], the new dedup drops[map_col, *, OFFSET]and keeps only[map_col, VALUES]. In BE,MapFileColumnIterator::set_access_paths()then marks the key iteratorSKIP_READING, so theelement_at(map_col, 'a')insidelength()can no longer match the key and returns NULL/defaults. The new EXPLAIN-only regression test for this case therefore encodes the broken behavior instead of guarding against it.
Critical Checkpoints
- Goal: The array dedup fix and the
!accessPartialChildguard look correct, but the map*/VALUESchange does not accomplish the goal safely because it regresses mixedelement_at+map_valuesqueries. - Scope/minimality: The patch is small and focused.
- Concurrency: Not involved.
- Lifecycle/static init: Not involved.
- Config: None.
- Compatibility / FE-BE contract: Affected. The new equivalence violates the existing access-path contract in BE, where
*implies key + value access andVALUESimplies value-only access. - Parallel code paths: Not fully handled. Any path that uses
element_at(map, key)still needs keys even when another expression reads onlyVALUES. - Special conditions: The new suppression condition is missing the required asymmetry between
*andVALUES. - Test coverage: Good coverage for the array fix; insufficient for the map case because the new test only checks EXPLAIN and does not verify query results.
- Test result files: The added
.outfile appears auto-generated and consistent with the executed case. - Observability: No new observability needs.
- Transaction/persistence/data writes: Not applicable.
- FE-BE variable/path passing: No new fields, but the encoded access path is wrong for the mixed map case.
- Performance: No material concern beyond the correctness issue.
- Other issues: None beyond the blocking regression above.
Requesting changes because the current map-path suppression can produce wrong results.
| // Both cover the same value container, so treat them as equivalent for OFFSET suppression. | ||
| // Do NOT treat * as equivalent to KEYS. For example: | ||
| // select length(map_col['a']), map_keys(map_col) from t | ||
| // map_keys(map_col) only covers the key side, while length(map_col['a']) still needs |
There was a problem hiding this comment.
This comparison is not safe in both directions. A [map_col, *] path means element_at(map, key) style access, and the BE uses it to keep the map key iterator readable as well as the value iterator (see MapFileColumnIterator::set_access_paths()). [map_col, VALUES] only preserves the value side. With this change, select length(map_col['a']), map_values(map_col)[1] ... drops [map_col, *, OFFSET] and keeps only [map_col, VALUES]; BE then sets the key iterator to SKIP_READING, so the element_at(map_col, 'a') inside length() can no longer match the key and returns NULL/defaults. The * vs VALUES relation here needs to be directional, or this case needs to be rewritten to KEYS + VALUES rather than removing the * path entirely.
### What problem does this PR solve?
Issue Number: N/A
Problem Summary:
The previous OFFSET dedup fix treated map ACCESS_ALL (*) and VALUES access as fully equivalent. That is not safe when the OFFSET path comes from element_at(map_col, key): * is also the FE signal that BE must read map keys so the runtime can locate the requested entry. If length(map_col['a']) is mixed with map_values(map_col)[1], dropping [map_col, *, OFFSET] outright loses the key lookup requirement and can break the query.
This change makes the comparison type-aware and directional. When a VALUES path covers only the value side of an ACCESS_ALL OFFSET path, FE now rewrites the OFFSET path to a KEYS-only path instead of deleting it. That keeps the BE key iterator alive while still removing the redundant value-side OFFSET path.
### Release note
Fix nested-column pruning for mixed map element_at and map_values access so key lookup remains correct while redundant OFFSET reads are removed.
### Check List (For Author)
- Test: FE Unit Test, Regression test
- FE Unit Test: ./run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest
- Regression test: bash run-regression-test.sh --run -d nereids_rules_p0 -s string_length_column_pruning
- Behavior changed: Yes (mixed map element_at + map_values now preserves the KEYS access path)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
run buildall |
There was a problem hiding this comment.
Requesting changes because one correctness gap remains.
Critical checkpoints:
- Goal/task: The PR aims to make OFFSET-path dedup safe for mixed nested accesses. It fixes the top-level ARRAY/MAP cases covered by the new tests, but it does not yet cover the same access pattern when the container is nested under a STRUCT slot.
- Scope/focus: The change is localized to NestedColumnPruning, but the new guard
if (slot.getDataType().isArrayType() || slot.getDataType().isMapType())scopes the rewrite too narrowly. - Concurrency: Not applicable; this is FE-only rewrite logic with no new concurrent state.
- Lifecycle/config/compatibility: No new lifecycle, config, or upgrade concerns found.
- Parallel code paths: The BE access-path contract is the same for array/map children inside structs, so those paths need the same suppression/rewrite behavior.
- Special conditions: The current special handling only triggers when the root slot itself is ARRAY/MAP; nested containers under STRUCT still forward conflicting OFFSET and full-data paths together.
- Test coverage: Added FE unit/regression tests cover top-level arr_struct and map_col cases, but they do not cover STRUCT<arr/map ...> cases where the bug still reproduces.
- Observability/persistence/transactions/FE-BE variables: No additional blocking issues found.
- Validation: I attempted
bash run-fe-ut.sh --run org.apache.doris.nereids.rules.rewrite.PruneNestedColumnTest, but this runner is missingthirdparty/installed/bin/protoc, so I could not rerun FE UT locally.
See the inline comment for the concrete failing shape.
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
run external |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
|
run buildall |
There was a problem hiding this comment.
Blocking finding:
regression-test/conf/regression-conf.groovywas committed with worktree-local port overrides. This file is explicitly local-only andAGENTS.mdsays env/run config changes must not be included in the PR.
Critical checkpoint conclusions:
- Goal of the task: The FE pruning change appears to address the reported
array<struct>sibling-field case and the mixedelement_at(map)+map_values(map)case. The added FE unit tests and regression tests exercise those paths. - Small / focused change: Not fully. The code fix itself is focused, but the PR also includes an unrelated local regression config edit.
- Concurrency: Not involved. I did not find shared-state or lock-order concerns in this change.
- Lifecycle / static initialization: Not involved.
- Configuration items: No new product config is added, but a checked-in regression harness config file was modified with local ports and should not be merged.
- Compatibility: No FE/BE protocol, symbol, or storage-format compatibility issue found in the pruning logic.
- Parallel code paths: I traced the collector ->
NestedColumnPruning-> slot replacement -> BE map/array/struct iterator flow. The new path-rewrite logic is consistent with BE's distinction between*,KEYS,VALUES, andOFFSET. - Special conditionals: The new type-aware
compareOffsetPrefixCoverage()logic is justified for the affected map/array cases. - Test coverage: Coverage for the newly fixed select-list cases is good in both FE UT and regression tests. I did not find another concrete blocker in the code path.
- Test result files: The new
.outfiles are consistent with the neworder_qtcases. - Observability: Not needed for this optimizer-side fix.
- Transaction / persistence / data write / FE-BE variable passing: Not applicable here.
- Performance: The additional per-slot path comparison work is small and acceptable for the expected number of collected access paths.
- Other issues: None beyond the accidental config check-in.
| // add allowLoadLocalInfile so that the jdbc can execute mysql load data from client. | ||
| jdbcUrl = "jdbc:mysql://127.0.0.1:9030/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round" | ||
| targetJdbcUrl = "jdbc:mysql://127.0.0.1:9030/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round" | ||
| jdbcUrl = "jdbc:mysql://127.0.0.1:39131/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round" |
There was a problem hiding this comment.
This looks like worktree-local test setup rather than PR content. regression-test/conf/regression-conf.groovy is the shared default harness config, and AGENTS.md explicitly says env/run config changes should not be committed. If this lands, CI and other developers will start targeting 39131/39121/38131 instead of the repo defaults. Please drop this file from the PR.
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #62205
Problem Summary:
PR #62205 introduced the
length(str)OFFSET optimization for nested column pruning. Whenlength()is applied to a string-like nested field, the pruner marks the access path with an OFFSET suffix so BE can read only the offset array instead of full element data.However, the OFFSET path dedup logic operates at slot granularity: if ANY non-OFFSET path exists for the same slot, ALL OFFSET paths are stripped. This is incorrect for
array<struct<...>>columns where different struct fields have independent access patterns.Bug 1 (crash): Given a query like:
The
int_fieldnon-OFFSET path causes thestr_fieldOFFSET path to be stripped. BE never readsstr_fielddata → crash or wrong results.Fix: Replace slot-level OFFSET dedup with per-field bidirectional prefix matching. An OFFSET path
P+["OFFSET"]is only stripped when a non-OFFSET path Q shares a prefix relationship with P (covering the same container or ancestor), not when Q accesses a sibling struct field.Bug 2 (wrong pruning): When
isStringOffsetOnly=trueANDaccessPartialChild=true(e.g.,cardinality(arr) + arr[*].f1),pruneDataType()returns the full type instead of pruning unused fields.Fix: Add
!accessPartialChildguard to theisStringOffsetOnlycheck.Release note
Fix a BE crash caused by nested column pruning incorrectly stripping OFFSET access paths for array columns when different struct fields have independent access patterns (e.g., length() on one field + direct access on another).
Check List (For Author)
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)