Skip to content

[Enhancement] Add StarOSAgent PACK shard group support and ColocateRangeUtils for range colocate#71221

Open
xiangguangyxg wants to merge 1 commit intoStarRocks:mainfrom
xiangguangyxg:colocate_shard_group
Open

[Enhancement] Add StarOSAgent PACK shard group support and ColocateRangeUtils for range colocate#71221
xiangguangyxg wants to merge 1 commit intoStarRocks:mainfrom
xiangguangyxg:colocate_shard_group

Conversation

@xiangguangyxg
Copy link
Copy Markdown
Contributor

@xiangguangyxg xiangguangyxg commented Apr 2, 2026

Why I'm doing:

Range distribution colocate requires PACK shard groups (for co-location) and the ability
to create shards belonging to multiple shard groups simultaneously (SPREAD + PACK).
The existing StarOSAgent only supports SPREAD placement policy and single group ID.
Additionally, a utility is needed to expand colocate column ranges to full sort key ranges
for tablet range assignment.

What I'm doing:

  1. Add StarOSAgent.createShardGroup(PlacementPolicy) overload to support creating
    shard groups with configurable placement policy (SPREAD or PACK). The existing
    method delegates to the new one with SPREAD as default.

  2. Add StarOSAgent.createShards(List<Long> groupIds) overload to support creating
    shards belonging to multiple shard groups simultaneously. The existing single-groupId
    method delegates to the new one with List.of(groupId).

  3. Add ColocateRangeUtils helper class with expandToFullSortKey() method that expands
    a colocate range (on colocate column prefix) to a full sort key range. The sentinel
    choice (MIN or MAX) depends on bound inclusiveness to preserve semantic equivalence:

    • Inclusive lower / exclusive upper → append MIN
    • Exclusive lower / inclusive upper → append MAX

Fixes #64986

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

@xiangguangyxg xiangguangyxg requested a review from a team as a code owner April 2, 2026 09:58
@github-actions github-actions bot added the 4.1 label Apr 2, 2026
@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6a7ee9a709

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ngeUtils for range colocate

Signed-off-by: xiangguangyxg <[email protected]>
@xiangguangyxg xiangguangyxg force-pushed the colocate_shard_group branch from 6a7ee9a to cc7fc57 Compare April 2, 2026 10:39
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

[FE Incremental Coverage Report]

pass : 23 / 24 (95.83%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/catalog/ColocateRangeUtils.java 17 18 94.44% [26]
🔵 com/starrocks/lake/StarOSAgent.java 6 6 100.00% []

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 2, 2026

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud bot commented Apr 2, 2026

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances FE support for range distribution colocate by enabling StarOS shard group creation with configurable placement policy (SPREAD/PACK), allowing shard creation into multiple shard groups, and adding a utility to expand colocate-prefix ranges into full sort-key ranges for tablet range assignment.

Changes:

  • Add StarOSAgent.createShardGroup(..., PlacementPolicy) overload and keep the existing method as a SPREAD default.
  • Add StarOSAgent.createShards(..., List<Long> groupIds, ...) overload to create shards belonging to multiple shard groups.
  • Introduce ColocateRangeUtils.expandToFullSortKey() plus a dedicated JUnit test suite.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
fe/fe-core/src/main/java/com/starrocks/lake/StarOSAgent.java Adds overloads for shard-group placement policy and multi-group shard creation.
fe/fe-core/src/main/java/com/starrocks/catalog/ColocateRangeUtils.java New helper to expand colocate prefix ranges into full sort-key ranges via MIN/MAX sentinels.
fe/fe-core/src/test/java/com/starrocks/catalog/ColocateRangeUtilsTest.java Adds unit tests covering inclusive/exclusive and bounded/unbounded expansion cases.

Comment on lines +499 to 507
public long createShardGroup(long dbId, long tableId, long partitionId, long indexId,
PlacementPolicy placementPolicy) throws DdlException {
prepare();
List<ShardGroupInfo> shardGroupInfos = null;
try {
List<CreateShardGroupInfo> createShardGroupInfos = new ArrayList<>();
createShardGroupInfos.add(CreateShardGroupInfo.newBuilder()
.setPolicy(PlacementPolicy.SPREAD)
.setPolicy(placementPolicy)
.putLabels("dbId", String.valueOf(dbId))
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createShardGroup(..., PlacementPolicy placementPolicy) passes placementPolicy directly into the proto builder without validation. If a caller accidentally passes null, this will fail with a NullPointerException from generated code rather than a clear DdlException/IllegalArgumentException. Consider adding an explicit null check (e.g., Preconditions/Objects.requireNonNull) and a helpful message before building the request.

Copilot uses AI. Check for mistakes.
Comment on lines 580 to +602
@@ -579,11 +593,13 @@ public List<Long> createShards(int numShards, FilePathInfo pathInfo, FileCacheIn

CreateShardInfo.Builder builder = CreateShardInfo.newBuilder();
builder.setReplicaCount(1)
.addGroupIds(groupId)
.setPathInfo(pathInfo)
.setCacheInfo(cacheInfo)
.putAllShardProperties(properties)
.setScheduleToWorkerGroup(workerGroupId);
for (long groupId : groupIds) {
builder.addGroupIds(groupId);
}
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new createShards(..., List<Long> groupIds, ...) overload iterates groupIds without validating it is non-null/non-empty. An empty list would create shards with no shard-group membership (likely invalid in StarOS) and a null list would NPE. Add argument validation (non-null, non-empty, and optionally reject null/duplicate IDs) before populating CreateShardInfo.

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +66
public static Range<Tuple> expandToFullSortKey(Range<Tuple> colocateRange,
List<Column> sortKeyColumns,
int colocateColumnCount) {
if (colocateRange.isAll()) {
return Range.all();
}
int remainingColumns = sortKeyColumns.size() - colocateColumnCount;
// Inclusive lower → extend with MIN; exclusive lower → extend with MAX
Tuple lowerBound = colocateRange.isMinimum() ? null
: extendTuple(colocateRange.getLowerBound(), sortKeyColumns,
colocateColumnCount, remainingColumns,
colocateRange.isLowerBoundIncluded());
// Exclusive upper → extend with MIN; inclusive upper → extend with MAX
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expandToFullSortKey does not validate colocateColumnCount relative to sortKeyColumns.size(). If colocateColumnCount is negative or greater than the sort-key length, remainingColumns becomes invalid and the method silently returns an incorrectly-shaped range. Add argument checks (0 <= colocateColumnCount <= sortKeyColumns.size()) and consider validating the provided bound tuple lengths match colocateColumnCount when the corresponding side is bounded.

Copilot uses AI. Check for mistakes.
Comment on lines +165 to +167
colocateRange, SORT_KEY_COLUMNS, 3);

// No expansion, tuple size unchanged
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testExpandNoRemainingColumns passes colocateColumnCount = 3 (equal to the full sort-key length) but constructs bounds with makeTuple(...) which only has 1 value. The assertion then expects the expanded tuples to stay size 1, which contradicts the method’s contract of producing a full sort-key range and can mask tuple-length bugs. Consider either (a) building a 3-column tuple in the test and asserting size 3, or (b) changing the test to use colocateColumnCount = 1 if the intent is to verify the "no remaining columns" path for a 1-column sort key.

Suggested change
colocateRange, SORT_KEY_COLUMNS, 3);
// No expansion, tuple size unchanged
colocateRange, SORT_KEY_COLUMNS.subList(0, 1), 1);
// No expansion, tuple size unchanged for a single-column sort key

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New multi-tenant data management

3 participants