Conversation

Thank you for your contribution to Apache Doris. Please clearly describe your PR:

run buildall

/review

Cloud UT Coverage Report: increment line coverage / increment coverage report
Code Review: Support Iceberg V3 Row Lineage
Summary
This PR adds Iceberg V3 row lineage support including deletion vectors (Puffin format), _row_id and _last_updated_sequence_number hidden columns, and rewrite_data_files action. The feature covers both FE planning/transaction and BE reading/writing paths across Parquet and ORC formats.
Critical Issue Found
[BUG] IcebergRewritableDeletePlanner.java: setDeleteFiles(value) is never called — The thrift struct TIcebergRewritableDeleteFileSet is populated with setReferencedDataFilePath(key) but the value (the list of TIcebergDeleteFileDesc) from deleteFilesDescByReferencedDataFile is never set on the struct. On the BE side (viceberg_delete_sink.cpp), the code checks !delete_file_set.__isset.delete_files and skips entries without delete files, so previously existing position deletes and deletion vectors are never loaded for merging. This means when a second DELETE is issued against the same data file on a V3 table, the old deleted rows won't be merged into the new DV, causing previously deleted rows to reappear (data correctness regression).
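The failure mode can be shown in miniature with a stand-in struct. This is a minimal sketch assuming thrift-style `__isset` flags; the struct and function names are illustrative, not the actual gensrc/Doris code:

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for the generated thrift struct; the real
// TIcebergRewritableDeleteFileSet lives in gensrc, and this mirrors only
// the two fields relevant to the bug.
struct RewritableDeleteFileSetSketch {
    std::string referenced_data_file_path;
    std::vector<std::string> delete_files;          // stand-in for the TIcebergDeleteFileDesc list
    struct { bool delete_files = false; } __isset;  // thrift-style "field was set" flag
};

void set_delete_files(RewritableDeleteFileSetSketch& s, std::vector<std::string> files) {
    s.delete_files = std::move(files);
    s.__isset.delete_files = true;  // thrift setters flip this flag
}

// Mirrors the BE-side guard: an entry whose delete_files field was never
// set is skipped, so its pre-existing deletes are never merged.
bool will_merge_existing_deletes(const RewritableDeleteFileSetSketch& s) {
    return s.__isset.delete_files;
}
```

Because the FE only ever calls the path setter, every entry arrives at the BE with the flag unset and is silently dropped.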
Other Issues
- file_scanner.cpp: Uses `starts_with("_row_id")` and `starts_with("_last_updated_sequence_number")` instead of exact match (`==`). While unlikely to cause false matches in practice, it is inconsistent with the adjacent `== BeConsts::ICEBERG_ROWID_COL` pattern and could match unintended column names. Should use `==` for correctness.
- Missing test coverage: No regression test covers issuing multiple DELETEs against the same data file, which is exactly the path where the critical bug manifests. Adding such a test would have caught it.
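The missing scenario boils down to a set union: the second DELETE's deletion vector must absorb the first one's positions. A minimal sketch, using `std::set` as a stand-in for the roaring bitmap a real DV serializes:

```cpp
#include <cstdint>
#include <set>

// Row positions deleted within one data file; std::set stands in for the
// roaring bitmap an Iceberg V3 deletion vector actually uses.
using DeletionVector = std::set<int64_t>;

// The new DV for a data file must be the union of the existing DV and the
// freshly deleted positions; writing only the fresh positions (what the
// bug effectively does) would resurrect the previously deleted rows.
DeletionVector merge_deletion_vectors(const DeletionVector& existing,
                                      const DeletionVector& fresh) {
    DeletionVector merged = existing;
    merged.insert(fresh.begin(), fresh.end());
    return merged;
}
```

A regression test only needs to assert that rows deleted by the first DELETE stay invisible after the second one commits.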
Code Review Checkpoint Conclusions (Part 1.3)
- Goal: The code implements V3 row lineage with deletion vectors, hidden columns, and rewrite support. A critical bug in delete merging prevents correct behavior for repeated deletes on the same data file.
- Modification scope: Large (54 files, 3222+/236-) but focused on a single cohesive feature.
- Concurrency: No new concurrency introduced; single-threaded pipeline operator paths.
- Lifecycle management: Standard RAII patterns used; no special concerns.
- Configuration items: None added.
- Incompatible changes: Thrift changes are additive (optional fields, new enum). Forward-compatible for rolling upgrades.
- Parallel code paths: Parquet and ORC reader paths both updated consistently.
- Special conditional checks: `starts_with` should be `==` in file_scanner.cpp.
- Test coverage: 4 regression tests + 2 BE unit tests. Missing negative test for repeated DELETE merging.
- Observability: Adequate logging in key paths.
- Transaction/persistence: Follows existing Iceberg transaction patterns.
- Data writes: Critical bug in delete merging causes data correctness issue.
- FE-BE variable passing: New thrift fields added in all required paths.
- Performance: No obvious issues; CRC32 computation acceptable for DV sizes.
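On the CRC32 point: Puffin deletion-vector blobs carry a CRC-32 checksum, and the cost is linear in blob size, so typical DVs stay cheap. For reference, a table-less sketch of the standard (ISO-HDLC, zlib-compatible) CRC-32 — this is not the Doris implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 over a byte buffer (reflected polynomial 0xEDB88320,
// the same checksum zlib's crc32() computes). O(n) in the blob size,
// which is small for typical deletion vectors.
uint32_t dv_blob_crc32(const unsigned char* data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```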
```java
(key, value) -> {
    TIcebergRewritableDeleteFileSet deleteFileSet = new TIcebergRewritableDeleteFileSet();
    deleteFileSet.setReferencedDataFilePath(key);
    thriftDeleteFileSets.add(deleteFileSet);
```
[Critical Bug] setDeleteFiles(value) is never called on deleteFileSet. The lambda captures (key, value) from deleteFilesDescByReferencedDataFile but only uses key — value (the List<TIcebergDeleteFileDesc>) is silently discarded.
On the BE side (viceberg_delete_sink.cpp:174), the code skips entries where !delete_file_set.__isset.delete_files, so all rewritable entries are effectively no-ops. This means when a second DELETE hits a data file that already has deletion vectors, the previous deletes are not merged into the new DV, causing those previously deleted rows to reappear.
Fix:

```java
icebergScanNode.deleteFilesDescByReferencedDataFile.forEach(
    (key, value) -> {
        TIcebergRewritableDeleteFileSet deleteFileSet = new TIcebergRewritableDeleteFileSet();
        deleteFileSet.setReferencedDataFilePath(key);
        deleteFileSet.setDeleteFiles(value); // <-- missing line
        thriftDeleteFileSets.add(deleteFileSet);
    }
);
```
be/src/exec/scan/file_scanner.cpp
```cpp
    _row_lineage_columns.row_id_column_idx = _default_val_row_desc->get_column_id(slot_id);
}

if (it->second->col_name().starts_with(IcebergTableReader::ROW_LINEAGE_LAST_UPDATED_SEQ_NUMBER)) {
```
[Minor] `starts_with(IcebergTableReader::ROW_LINEAGE_ROW_ID)` should be an exact match (`==`) instead. The constant is `"_row_id"`, and `starts_with` would incorrectly match any column name that merely begins with `_row_id` (e.g., `_row_id_backup`). The adjacent code uses exact match for `BeConsts::ICEBERG_ROWID_COL`; this check should be consistent with it.
The same applies to the `starts_with(IcebergTableReader::ROW_LINEAGE_LAST_UPDATED_SEQ_NUMBER)` check four lines below.
```cpp
if (it->second->col_name() == IcebergTableReader::ROW_LINEAGE_ROW_ID) {
    _row_lineage_columns.row_id_column_idx = _default_val_row_desc->get_column_id(slot_id);
}
if (it->second->col_name() == IcebergTableReader::ROW_LINEAGE_LAST_UPDATED_SEQ_NUMBER) {
    _row_lineage_columns.last_updated_sequence_number_column_idx = ...
}
```
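The difference between the two checks is easy to demonstrate; a small sketch with plain helpers (`rfind(name, 0)` stands in for C++20 `std::string::starts_with`):

```cpp
#include <string>

// Prefix match, as the current file_scanner.cpp checks do.
bool prefix_match(const std::string& col, const std::string& name) {
    return col.rfind(name, 0) == 0;  // equivalent to C++20 starts_with
}

// Exact match, consistent with the adjacent BeConsts::ICEBERG_ROWID_COL check.
bool exact_match(const std::string& col, const std::string& name) {
    return col == name;
}
```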
run buildall

TPC-H: Total hot run time: 26996 ms
TPC-DS: Total hot run time: 168453 ms

run buildall

TPC-H: Total hot run time: 26857 ms
TPC-DS: Total hot run time: 169177 ms

run buildall

TPC-H: Total hot run time: 26348 ms
TPC-DS: Total hot run time: 168285 ms

FE UT Coverage Report: increment line coverage

run buildall

TPC-H: Total hot run time: 26445 ms
TPC-DS: Total hot run time: 168799 ms

BE UT Coverage Report: increment line coverage / increment coverage report

BE Regression && UT Coverage Report: increment line coverage / increment coverage report
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)