
[improve](compaction) Use segment footer raw_data_bytes for first-time batch size estimation#62263

Open
Yukang-Lian wants to merge 6 commits into apache:master from Yukang-Lian:fix/compaction-batch-size-adaptive

Conversation


@Yukang-Lian Yukang-Lian commented Apr 9, 2026

Summary

  • When vertical compaction runs for the first time on a tablet (no historical sampling data), estimate_batch_size() previously returned a hardcoded value of 992, which could cause OOM for wide tables or be too conservative for narrow tables
  • This change uses ColumnMetaPB.raw_data_bytes from the segment footer to compute a per-row size estimate for the first compaction. raw_data_bytes records the original data size before encoding, which closely approximates the runtime Block::bytes()
  • Historical sampling now uses Block::allocated_bytes() instead of bytes() for more accurate memory estimation (analogous to size() vs capacity())
  • Subsequent compactions with historical sampling data are completely unchanged

Key design decisions

| Column type | Estimation strategy |
| --- | --- |
| Scalar (INT/VARCHAR etc.) | raw_data_bytes / rows_with_data + structural compensation (+1 null map, +8 offset) |
| Complex (ARRAY/MAP/STRUCT) | raw_data_bytes / rows_with_data, no compensation (already includes recursive sub-writer data) |
| VARIANT (root/subcolumn) | Fallback to 992 (raw_data_bytes is currently 0; TODO in writer) |

Performance safeguards

  • Footer collection only runs on first compaction (no historical sampling data)
  • Skipped entirely when compaction_batch_size is manually set
  • OOM backoff and sparse optimization paths are untouched

Test plan

  • Wide table (200+ columns) first compaction does not OOM
  • Narrow table first compaction batch_size is close to upper limit
  • Multi-round compaction: first round uses footer, subsequent rounds use historical sampling
  • Variant columns fallback to 992
  • Sparse optimization is not affected
  • TestFirstCompactionUsesFooterEstimation unit test passes

…e batch size estimation

When vertical compaction runs for the first time on a tablet (no historical
sampling data), estimate_batch_size() previously returned a hardcoded value
of 992, which could cause OOM for wide tables or be too conservative for
narrow tables.

This change uses ColumnMetaPB.raw_data_bytes from the segment footer to compute
a per-row size estimate for the first compaction. raw_data_bytes records the
original data size before encoding, which closely approximates runtime
Block::bytes(). Subsequent compactions continue to use the existing
historical sampling mechanism unchanged.

Key design decisions:
- Footer collection only runs when needed (no manual override, and at least
  one column group lacks historical sampling data)
- Variant columns (raw_data_bytes=0 TODO) trigger fallback to 992
- Structural overhead (+1 null map, +8 offset) only added for scalar columns
  with actual footer data
- Complex types (ARRAY/MAP/STRUCT) use raw_data_bytes directly without
  structural compensation as it already includes recursive sub-writer data
- Historical sampling now uses Block::allocated_bytes() instead of bytes()
  for more accurate memory estimation
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Yukang-Lian
Collaborator Author

run buildall

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.54% (85/96) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 73.63% (27379/37187) |
| Line Coverage | 57.27% (295605/516179) |
| Region Coverage | 54.53% (246468/451954) |
| Branch Coverage | 56.21% (106878/190125) |

…ion init

Log per_row, sample_bytes, sample_rows immediately after all merge inputs
finish loading their first block, before the actual merge starts. This helps
diagnose memory issues by showing the actual per-row memory size at init time.
The log was added to help diagnose vertical compaction memory issues.
Investigation is complete; the existing 'estimate batch size' log in
merger.cpp already provides per-group batch_size and per_row info for
daily monitoring.
@Yukang-Lian
Collaborator Author

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 86.46% (83/96) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 53.34% (20373/38195) |
| Line Coverage | 36.88% (192015/520641) |
| Region Coverage | 33.19% (149367/450086) |
| Branch Coverage | 34.30% (65335/190467) |

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.54% (85/96) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 73.48% (27488/37408) |
| Line Coverage | 57.29% (297376/519044) |
| Region Coverage | 54.47% (247437/454243) |
| Branch Coverage | 56.08% (107127/191040) |
