branch-4.0: [fix](cloud) checkpoint save cloud tablet stats to image (#60705)#62514
branch-4.0: [fix](cloud) checkpoint save cloud tablet stats to image (#60705)#62514mymeiyi wants to merge 1 commit intoapache:branch-4.0from
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Pull request overview
This PR backports a cloud-mode fix to ensure checkpoint images remain fresh enough to carry up-to-date cloud tablet/replica statistics, and adds logic to post-process checkpoint metadata using the serving FE’s latest stats.
Changes:
- Track latest image file timestamp and trigger checkpoint in cloud mode when the image is older than a configurable threshold.
- Post-process checkpoint metadata in cloud mode to copy tablet/replica stats from the serving Env into the checkpoint Env before saving the image.
- Persist additional cloud/replica stat fields into the image via
@SerializedName, and initialize cloud table stats on FE start when empty.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| fe/fe-core/src/main/java/org/apache/doris/persist/Storage.java | Track latest image “create time” (via file mtime) to support staleness-based checkpoint triggering. |
| fe/fe-core/src/main/java/org/apache/doris/master/Checkpoint.java | Add cloud staleness trigger and post-process cloud metadata to copy stats into the checkpoint image. |
| fe/fe-core/src/main/java/org/apache/doris/cloud/catalog/CloudReplica.java | Persist cloud replica tablet-stat timing fields into the image using short serialized names. |
| fe/fe-core/src/main/java/org/apache/doris/catalog/Replica.java | Persist segmentCount / rowsetCount into the image via @SerializedName. |
| fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java | On startup, initialize table stats from image-loaded tablet stats when current stats list is empty. |
| fe/fe-common/src/main/java/org/apache/doris/common/Config.java | Add cloud checkpoint image staleness threshold config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (Partition partition : partitions) { | ||
| Partition servingPartition = servingOlapTable.getPartition(partition.getId()); | ||
| if (servingPartition == null) { | ||
| if (LOG.isDebugEnabled()) { | ||
| LOG.debug("serving partition is null. tableId: {}, partitionId: {}", table.getId(), | ||
| partition.getId()); | ||
| } | ||
| continue; | ||
| } | ||
| // set tablet stats | ||
| setTabletStats(table.getId(), partition, servingPartition); |
There was a problem hiding this comment.
postProcessCloudMetadata() traverses servingEnv metadata (tables/partitions/indexes/tablets) without taking the usual read locks. Partition’s index maps are plain HashMaps, so concurrent DDL/rollup updates on the serving catalog can lead to inconsistent reads or ConcurrentModificationException during checkpoint. Consider acquiring servingOlapTable.readLock() (and releasing in finally) while you resolve partitions/index/tablet/replica lookups (or otherwise snapshot the needed structures under lock).
| for (Partition partition : partitions) { | |
| Partition servingPartition = servingOlapTable.getPartition(partition.getId()); | |
| if (servingPartition == null) { | |
| if (LOG.isDebugEnabled()) { | |
| LOG.debug("serving partition is null. tableId: {}, partitionId: {}", table.getId(), | |
| partition.getId()); | |
| } | |
| continue; | |
| } | |
| // set tablet stats | |
| setTabletStats(table.getId(), partition, servingPartition); | |
| servingOlapTable.readLock(); | |
| try { | |
| for (Partition partition : partitions) { | |
| Partition servingPartition = servingOlapTable.getPartition(partition.getId()); | |
| if (servingPartition == null) { | |
| if (LOG.isDebugEnabled()) { | |
| LOG.debug("serving partition is null. tableId: {}, partitionId: {}", table.getId(), | |
| partition.getId()); | |
| } | |
| continue; | |
| } | |
| // set tablet stats | |
| setTabletStats(table.getId(), partition, servingPartition); | |
| } | |
| } finally { | |
| servingOlapTable.readUnlock(); |
| CloudReplica cloudReplica = (CloudReplica) replica; | ||
| CloudReplica servingCloudReplica = (CloudReplica) servingReplica; | ||
| cloudReplica.setStatsIntervalIndex(servingCloudReplica.getStatsIntervalIndex()); | ||
| cloudReplica.setLastGetTabletStatsTime(servingCloudReplica.getLastGetTabletStatsTime()); |
There was a problem hiding this comment.
setTabletStats() unconditionally casts both replica and servingReplica to CloudReplica. In cloud mode, the image loader can still deserialize replicas as base Replica (e.g., compatibility subtype LocalReplica or default subtype), so this can throw ClassCastException and fail the whole checkpoint. Please guard with instanceof (and only copy the cloud-specific fields when both sides are CloudReplica), so checkpoint remains best-effort for stats copying.
| CloudReplica cloudReplica = (CloudReplica) replica; | |
| CloudReplica servingCloudReplica = (CloudReplica) servingReplica; | |
| cloudReplica.setStatsIntervalIndex(servingCloudReplica.getStatsIntervalIndex()); | |
| cloudReplica.setLastGetTabletStatsTime(servingCloudReplica.getLastGetTabletStatsTime()); | |
| if (replica instanceof CloudReplica && servingReplica instanceof CloudReplica) { | |
| CloudReplica cloudReplica = (CloudReplica) replica; | |
| CloudReplica servingCloudReplica = (CloudReplica) servingReplica; | |
| cloudReplica.setStatsIntervalIndex(servingCloudReplica.getStatsIntervalIndex()); | |
| cloudReplica.setLastGetTabletStatsTime(servingCloudReplica.getLastGetTabletStatsTime()); | |
| } |
FE UT Coverage ReportIncrement line coverage |
pick #60705