[BugFix] Fix shared NullColumn in map_apply, array_length, and normalize_column_type #71258

Draft
GavinMar wants to merge 1 commit into StarRocks:branch-4.1.0 from GavinMar:fix-shared-nullcolumn-in-normalize

GavinMar commented Apr 3, 2026

What problem does this PR solve?

Fix a crash (Check failed: _data_column->size() == _null_column->size()) caused by two NullableColumns in the same Chunk sharing the same NullColumn object.

Root Cause

The bug is triggered by an unsafe pattern for obtaining a mutable NullColumn:

// WRONG — returns the same object when use_count==1 with COW optimization
std::move(*nullable->null_column()).mutate()

// CORRECT — implicit ColumnPtr copy bumps use_count>=2, forcing clone
Column::mutate(nullable->null_column())
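
To make the difference between the two patterns concrete, here is a minimal sketch built on a toy copy-on-write model using `std::shared_ptr`. `ToyColumn`, `mutate_in_place`, and `mutate_by_value` are hypothetical names invented for illustration and are not the StarRocks `Column` API. The sketch only assumes what the description above states: a sole owner gets the same object back, while an extra reference forces a clone.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for a COW column; NOT the StarRocks Column class.
struct ToyColumn {
    std::vector<uint8_t> data;
};
using ToyColumnPtr = std::shared_ptr<ToyColumn>;

// Models the COW fast path: with a single owner, the same object is handed
// back without cloning; with more than one owner, a clone is made.
ToyColumnPtr mutate_in_place(const ToyColumnPtr& col) {
    if (col.use_count() == 1) {
        return col;  // same underlying object, now referenced twice
    }
    return std::make_shared<ToyColumn>(*col);  // independent copy
}

// Models a mutate that takes the pointer by value: the parameter itself is an
// extra reference, so use_count >= 2 inside and the clone branch is taken.
ToyColumnPtr mutate_by_value(ToyColumnPtr col) {
    return mutate_in_place(col);
}

// True if the "mutable" result aliases the owner's object (unsafe pattern).
bool unsafe_pattern_shares() {
    ToyColumnPtr owner = std::make_shared<ToyColumn>(ToyColumn{{0, 1, 0}});
    ToyColumnPtr result = mutate_in_place(owner);
    return result.get() == owner.get();
}

// True if the by-value pattern aliases the owner's object (it should not).
bool safe_pattern_shares() {
    ToyColumnPtr owner = std::make_shared<ToyColumn>(ToyColumn{{0, 1, 0}});
    ToyColumnPtr result = mutate_by_value(owner);
    return result.get() == owner.get();
}
```

In this toy model, `unsafe_pattern_shares()` reports aliasing and `safe_pattern_shares()` does not, which mirrors the WRONG and CORRECT cases above.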

Crash Trigger Path

For the query:

SELECT t.map2, s.map2 FROM map_test t JOIN map_test s
ON map_apply((k,v)->(k+1,array_length(v)),s.map2) = map_apply((k,v)->(k+1,array_length(v)),t.map2)

  1. SelectOperator::push_chunk evaluates _common_exprs (e.g., map_apply(s.map2)) and appends the result as a new column (slot20) to the chunk
  2. map_apply_expr.cpp:85 uses std::move(*nullable->null_column()).mutate() to extract the input's NullColumn — with COW optimization and use_count==1, this returns the same NullColumn object
  3. The result NullableColumn (slot20) and the original column (slot4=t.map2) now share the same NullColumn in the chunk
  4. eval_conjuncts_and_in_filters → eager_prune_eval_conjuncts → Chunk::filter()
  5. Chunk::filter() iterates columns: filtering slot4 shrinks the shared NullColumn (3→2)
  6. When accessing slot20: _data_column->size()==3 but _null_column->size()==2 → CHECK FAIL / CRASH
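
The size mismatch in steps 3 to 6 can be reproduced with a small toy model. This is a sketch under stated assumptions: `ToyNullable` and `slot20_consistent_after_filtering_slot4` are hypothetical names, and the filter logic is a simplification, not the actual `Chunk::filter()` implementation. Two nullable columns either share one null vector or each own a copy, and only the first column is filtered before the second is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical nullable column: a data vector plus a (possibly shared) null vector.
struct ToyNullable {
    std::vector<int> data;
    std::shared_ptr<std::vector<uint8_t>> nulls;

    // Mirrors the invariant behind the failed check: both parts have equal size.
    bool consistent() const { return data.size() == nulls->size(); }

    // Keeps row i only when keep[i] is non-zero; rewrites both data and nulls.
    // Assumes keep.size() == data.size().
    void filter(const std::vector<uint8_t>& keep) {
        std::vector<int> new_data;
        std::vector<uint8_t> new_nulls;
        for (std::size_t i = 0; i < keep.size(); ++i) {
            if (keep[i]) {
                new_data.push_back(data[i]);
                new_nulls.push_back((*nulls)[i]);
            }
        }
        data = std::move(new_data);
        *nulls = std::move(new_nulls);  // also shrinks any column sharing this vector
    }
};

// Filters slot4 (3 rows -> 2 rows) and reports whether slot20 is still consistent.
bool slot20_consistent_after_filtering_slot4(bool share_nulls) {
    auto nulls = std::make_shared<std::vector<uint8_t>>(std::vector<uint8_t>{0, 0, 0});
    ToyNullable slot4{{10, 20, 30}, nulls};
    ToyNullable slot20{{11, 21, 31},
                       share_nulls ? nulls
                                   : std::make_shared<std::vector<uint8_t>>(*nulls)};
    slot4.filter({1, 0, 1});
    return slot20.consistent();
}
```

With `share_nulls` set to `true`, slot20 is left with three data rows and a two-entry null vector, the same inconsistency that trips the check; with `false`, slot20 is unaffected by filtering slot4.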

Fix

Replace all instances of the unsafe std::move(*nullable->null_column()).mutate() pattern with Column::mutate(nullable->null_column()) in:

  • map_apply_expr.cpp:85 — the primary trigger for this crash
  • array_functions.cpp:62 (array_length) — same unsafe pattern

What tests does this PR have?

Existing test: test/sql/test_join/R/test_join_map

The crash was reproducible with the map_apply join queries in this test file. After the fix, the test passes without the Check failed assertion.

Related PRs

Follow-up to #71207, which fixed the same class of bug in array_functions.cpp, binary_function.h, map_functions.cpp, and ngram.cpp but missed the locations fixed in this PR.

…ize_column_type

When COW optimization is enabled, `std::move(*nullable->null_column()).mutate()`
returns the same NullColumn object without cloning (use_count==1), causing two
NullableColumns to share the same NullColumn. Use `Column::mutate(nullable->null_column())`
instead, which implicitly copies the ColumnPtr parameter, bumping use_count>=2
and forcing a proper clone.