@sollhui
Contributor

@sollhui sollhui commented Jan 12, 2026

pick #58982

related issue: #58780

When loading into a wide table with 5000 columns and 48 buckets, flushing the memtable consumes a lot of memory:

(image: img_v3_02t2_9792f93c-bf50-41b2-b33e-055e1663143g)

The root cause is that all column writers are created at once when VerticalSegmentWriter is initialized, and the page_builder of _encoding_info consumes a lot of memory. To solve this, this PR creates column writers lazily when flushing the memtable for partial update.
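
The difference between eager and lazy writer creation can be illustrated with a minimal sketch. This is not the Doris implementation: `SketchColumnWriter`, `SketchSegmentWriter`, `writer_for`, and the fixed per-writer page buffer are hypothetical names and simplifications, chosen only to show why deferring construction helps when a partial update touches a small subset of a very wide schema.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a column writer whose page builder
// reserves its buffer as soon as the writer is constructed.
class SketchColumnWriter {
public:
    explicit SketchColumnWriter(std::size_t page_bytes) : _page_buffer(page_bytes) {}
    std::size_t reserved_bytes() const { return _page_buffer.size(); }

private:
    std::vector<char> _page_buffer;
};

// Hypothetical segment writer: it keeps one slot per column, but a slot
// stays null until that column is actually written.
class SketchSegmentWriter {
public:
    SketchSegmentWriter(std::size_t num_columns, std::size_t page_bytes)
            : _writers(num_columns), _page_bytes(page_bytes) {}

    // Lazily create the writer for column `cid` on first use.
    SketchColumnWriter& writer_for(std::size_t cid) {
        assert(cid < _writers.size());
        if (!_writers[cid]) {
            _writers[cid] = std::make_unique<SketchColumnWriter>(_page_bytes);
        }
        return *_writers[cid];
    }

    // Number of column writers that have been constructed so far.
    std::size_t live_writers() const {
        std::size_t n = 0;
        for (const auto& w : _writers) {
            if (w) ++n;
        }
        return n;
    }

    // Total buffer bytes held by the constructed writers.
    std::size_t reserved_bytes() const {
        std::size_t total = 0;
        for (const auto& w : _writers) {
            if (w) total += w->reserved_bytes();
        }
        return total;
    }

private:
    std::vector<std::unique_ptr<SketchColumnWriter>> _writers;
    std::size_t _page_bytes;
};
```

In this sketch, a segment writer for 5000 columns holds only null slots after construction, and buffer memory grows with the number of columns actually written rather than with the schema width. An eager variant would construct every `SketchColumnWriter` in the constructor and pay for all of the buffers up front.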

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@sollhui sollhui requested a review from yiguolei as a code owner January 12, 2026 08:49
@Thearas
Contributor

Thearas commented Jan 12, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@sollhui
Contributor Author

sollhui commented Jan 12, 2026

run buildall


@doris-robot

BE UT Coverage Report

Increment line coverage 84.62% (11/13) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 53.42% (18805/35202) |
| Line Coverage | 39.23% (174454/444720) |
| Region Coverage | 33.94% (135027/397896) |
| Branch Coverage | 34.86% (58287/167226) |

…tial update (apache#58982)

related issue: apache#58780

When loading into a wide table with 5000 columns and 48 buckets, flushing
the memtable consumes a lot of memory:

![img_v3_02t2_9792f93c-bf50-41b2-b33e-055e1663143g](https://github.com/user-attachments/assets/b93d0276-0ae6-4710-a8f3-60a0f1342d3f)

The root cause is that all column writers are created at once when
VerticalSegmentWriter is initialized, and the `page_builder` of
`_encoding_info` consumes a lot of memory. To solve this, this PR creates
column writers lazily when flushing the memtable for partial update.
@sollhui
Contributor Author

sollhui commented Jan 12, 2026

run buildall

@doris-robot

BE UT Coverage Report

Increment line coverage 44.00% (11/25) 🎉

Increment coverage report
Complete coverage report

| Category | Coverage |
| --- | --- |
| Function Coverage | 53.42% (18805/35202) |
| Line Coverage | 39.22% (174443/444731) |
| Region Coverage | 33.92% (134963/397921) |
| Branch Coverage | 34.86% (58292/167241) |

@yiguolei yiguolei merged commit ba970b5 into apache:branch-4.0 Jan 13, 2026
23 of 25 checks passed

4 participants