Skip to content

[Store] Add UpsertStart/End/Revoke for in-place object updates#1662

Open
00fish0 wants to merge 6 commits intokvcache-ai:mainfrom
00fish0:upsert-for-store
Open

[Store] Add UpsertStart/End/Revoke for in-place object updates#1662
00fish0 wants to merge 6 commits intokvcache-ai:mainfrom
00fish0:upsert-for-store

Conversation

@00fish0
Copy link
Collaborator

@00fish0 00fish0 commented Mar 15, 2026

RFC: #1645

Implement upsert operations in MasterService that allow updating existing keys without the overhead of Remove + Put. Three paths:

  • Case A: Key doesn't exist → equivalent to PutStart
  • Case B: Key exists, same size → in-place update (reuse buffers)
  • Case C: Key exists, different size → release old + allocate new

Safety checks prevent breaking the immutability invariant:

  • Reject if key has active replication/offloading tasks
  • Reject if any replica has non-zero refcnt (OBJECT_REPLICA_BUSY)
  • Preempt in-progress Put/Upsert on the same key

Description

Module

  • Transfer Engine (mooncake-transfer-engine)
  • Mooncake Store (mooncake-store)
  • Mooncake EP (mooncake-ep)
  • Integration (mooncake-integration)
  • P2P Store (mooncake-p2p-store)
  • Python Wheel (mooncake-wheel)
  • PyTorch Backend (mooncake-pg)
  • Mooncake RL (mooncake-rl)
  • CI/CD
  • Docs
  • Other

Type of Change

  • Bug fix
  • New feature
  • Refactor
  • Breaking change
  • Documentation update
  • Other

How Has This Been Tested?

Checklist

  • I have performed a self-review of my own code.
  • I have formatted my own code using ./scripts/code_format.sh before submitting.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

Implement upsert operations in MasterService that allow updating
existing keys without the overhead of Remove + Put. Three paths:
- Case A: Key doesn't exist → equivalent to PutStart
- Case B: Key exists, same size → in-place update (reuse buffers)
- Case C: Key exists, different size → release old + allocate new

Safety checks prevent breaking the immutability invariant:
- Reject if key has active replication/offloading tasks
- Reject if any replica has non-zero refcnt (OBJECT_REPLICA_BUSY)
- Preempt in-progress Put/Upsert on the same key
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the MasterService by introducing a robust "upsert" functionality. This feature allows for more efficient and atomic updates to data objects, eliminating the need for separate delete and create operations. It intelligently handles different update scenarios, such as in-place modifications for same-sized objects or reallocations for size changes, while ensuring data consistency and preventing conflicts with other ongoing operations.

Highlights

  • New Upsert Operations: Introduced UpsertStart, UpsertEnd, and UpsertRevoke operations in MasterService for efficient in-place object updates, along with their batch counterparts.
  • Intelligent Update Logic: Implemented logic to handle three cases for upsert: creating a new key (Case A), performing an in-place update for existing keys of the same size (Case B), and reallocating resources for existing keys with different sizes (Case C).
  • Safety and Preemption: Added safety checks to prevent upsert operations from conflicting with active replication/offloading tasks or busy replicas, and incorporated preemption logic for in-progress Put/Upsert operations on the same key.
  • Metadata Flexibility: Modified ObjectMetadata to allow client_id and put_start_time to be updated, enabling metadata refresh during in-place upserts.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • mooncake-store/include/master_service.h
    • Declared new UpsertStart, UpsertEnd, UpsertRevoke methods and their batch variants.
    • Removed the const qualifier from client_id and put_start_time members within ObjectMetadata.
  • mooncake-store/include/replica.h
    • Introduced an is_busy() method to check if a replica has a non-zero reference count.
    • Added a mark_processing() method to transition a COMPLETE replica to PROCESSING status.
  • mooncake-store/include/types.h
    • Defined a new ErrorCode::OBJECT_REPLICA_BUSY to indicate that an object's replicas have active references.
  • mooncake-store/src/master_service.cpp
    • Implemented the core logic for UpsertStart, handling new keys, same-size updates, and different-size updates.
    • Incorporated safety checks to reject upserts if replication/offloading tasks are active or if replicas are busy.
    • Added preemption logic for conflicting Put/Upsert operations.
    • Implemented UpsertEnd and UpsertRevoke by delegating to their Put counterparts.
    • Implemented BatchUpsertStart, BatchUpsertEnd, and BatchUpsertRevoke.
  • mooncake-store/src/types.cpp
    • Updated the toString mapping for ErrorCode to include OBJECT_REPLICA_BUSY.
  • mooncake-store/tests/master_service_test.cpp
    • Added comprehensive unit tests for UpsertNewKey, UpsertSameSize, UpsertSameSizeRefreshesMetadata, UpsertDifferentSize, UpsertConflictReplicationTask, UpsertPreemptsInProgressPut, UpsertRevoke, UpsertInPlaceThenRevoke, and BatchUpsertStart.
Activity
  • No specific activity (comments, reviews, progress updates) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable Upsert feature, allowing for more efficient in-place updates of objects. The implementation correctly handles the different cases for existing and new keys, and includes important safety checks. The addition of comprehensive unit tests is also a great contribution, ensuring the new functionality is robust.

My main feedback is on code maintainability. I've identified a significant amount of duplicated code for replica allocation within the new UpsertStart function, which is also shared with the existing PutStart function. I've suggested refactoring this into a private helper method to improve code clarity and reduce redundancy.

@00fish0 00fish0 marked this pull request as ready for review March 15, 2026 06:46
Copilot AI review requested due to automatic review settings March 15, 2026 06:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an Upsert API path to Mooncake Store’s master/client stack to support efficient in-place updates (same-size overwrite) or delete-and-reallocate updates (size change), avoiding the overhead of Remove + Put while enforcing safety gates (replication/offload tasks, busy replicas, preemption of in-progress writes).

Changes:

  • Introduces UpsertStart/UpsertEnd/UpsertRevoke plus batch variants across MasterService, RPC wrappers, and MasterClient.
  • Extends client-side support (Client, RealClient, DummyClient) and Python bindings for upsert operations.
  • Adds OBJECT_REPLICA_BUSY error code + string mapping and adds MasterService unit tests for upsert flows.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
mooncake-store/src/master_service.cpp Implements UpsertStart logic (case A/B/C, safety checks, preemption) and batch upsert entry points.
mooncake-store/include/master_service.h Declares upsert APIs; makes ObjectMetadata::client_id / put_start_time mutable for in-place updates.
mooncake-store/src/rpc_service.cpp Exposes upsert RPC handlers in WrappedMasterService and registers them.
mooncake-store/include/rpc_service.h Declares upsert RPC endpoints.
mooncake-store/src/master_client.cpp Adds MasterClient RPC invocations for upsert + batch upsert.
mooncake-store/include/master_client.h Declares MasterClient upsert APIs.
mooncake-store/src/client_service.cpp Adds Client::Upsert and Client::BatchUpsert (loop-based).
mooncake-store/include/client_service.h Declares Client upsert APIs.
mooncake-store/src/real_client.cpp Adds RealClient upsert variants (upsert, upsert_from, batch_upsert_from, upsert_parts) and dummy helpers.
mooncake-store/include/real_client.h Declares RealClient upsert methods + internal helpers.
mooncake-store/src/dummy_client.cpp Adds dummy-client upsert entry points (one method left unimplemented).
mooncake-store/include/dummy_client.h Declares dummy-client upsert methods.
mooncake-store/include/pyclient.h Extends PyClient interface with upsert methods.
mooncake-store/include/types.h Adds ErrorCode::OBJECT_REPLICA_BUSY.
mooncake-store/src/types.cpp Adds OBJECT_REPLICA_BUSY to toString().
mooncake-store/include/replica.h Adds Replica::is_busy() and mark_processing() helpers for upsert safety and in-place transitions.
mooncake-store/tests/master_service_test.cpp Adds unit tests for upsert scenarios (new key, same size, size change, conflicts, preemption, revoke, batch).
mooncake-integration/store/store_py.cpp Adds Python bindings for upsert (raw bytes + tensor helpers).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// Safety check: reject if offloading in progress
if (shard->offloading_tasks.count(key) > 0) {
LOG(INFO) << "key=" << key
<< ", error=object_has_offloading_task";
Comment on lines +1189 to +1200
std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>>
MasterService::BatchUpsertStart(
const UUID& client_id, const std::vector<std::string>& keys,
const std::vector<uint64_t>& slice_lengths,
const ReplicateConfig& config) {
std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>>
results;
results.reserve(keys.size());
for (size_t i = 0; i < keys.size(); ++i) {
results.emplace_back(
UpsertStart(client_id, keys[i], slice_lengths[i], config));
}
if (!client_) {
LOG(ERROR) << "Client is not initialized";
return tl::unexpected(ErrorCode::INVALID_PARAMS);
}
Comment on lines +400 to +402
const ReplicateConfig& config) {
// TODO: implement this function
return -1;
Comment on lines +1709 to +1712
return self.store_->upsert(
key,
std::span<const char>(static_cast<char *>(info.ptr),
static_cast<size_t>(info.size)),
@codecov-commenter
Copy link

codecov-commenter commented Mar 15, 2026

@00fish0 00fish0 requested a review from ShangmingCai as a code owner March 16, 2026 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants