[Store] Add UpsertStart/End/Revoke for in-place object updates#1662
[Store] Add UpsertStart/End/Revoke for in-place object updates#166200fish0 wants to merge 6 commits intokvcache-ai:mainfrom
Conversation
Implement upsert operations in MasterService that allow updating existing keys without the overhead of Remove + Put. Three paths: - Case A: Key doesn't exist → equivalent to PutStart - Case B: Key exists, same size → in-place update (reuse buffers) - Case C: Key exists, different size → release old + allocate new Safety checks prevent breaking the immutability invariant: - Reject if key has active replication/offloading tasks - Reject if any replica has non-zero refcnt (OBJECT_REPLICA_BUSY) - Preempt in-progress Put/Upsert on the same key
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a valuable Upsert feature, allowing for more efficient in-place updates of objects. The implementation correctly handles the different cases for existing and new keys, and includes important safety checks. The addition of comprehensive unit tests is also a great contribution, ensuring the new functionality is robust.
My main feedback is on code maintainability. I've identified a significant amount of duplicated code for replica allocation within the new UpsertStart function, which is also shared with the existing PutStart function. I've suggested refactoring this into a private helper method to improve code clarity and reduce redundancy.
There was a problem hiding this comment.
Pull request overview
Adds an Upsert API path to Mooncake Store’s master/client stack to support efficient in-place updates (same-size overwrite) or delete-and-reallocate updates (size change), avoiding the overhead of Remove + Put while enforcing safety gates (replication/offload tasks, busy replicas, preemption of in-progress writes).
Changes:
- Introduces
UpsertStart/UpsertEnd/UpsertRevokeplus batch variants acrossMasterService, RPC wrappers, andMasterClient. - Extends client-side support (
Client,RealClient,DummyClient) and Python bindings for upsert operations. - Adds
OBJECT_REPLICA_BUSYerror code + string mapping and adds MasterService unit tests for upsert flows.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| mooncake-store/src/master_service.cpp | Implements UpsertStart logic (case A/B/C, safety checks, preemption) and batch upsert entry points. |
| mooncake-store/include/master_service.h | Declares upsert APIs; makes ObjectMetadata::client_id / put_start_time mutable for in-place updates. |
| mooncake-store/src/rpc_service.cpp | Exposes upsert RPC handlers in WrappedMasterService and registers them. |
| mooncake-store/include/rpc_service.h | Declares upsert RPC endpoints. |
| mooncake-store/src/master_client.cpp | Adds MasterClient RPC invocations for upsert + batch upsert. |
| mooncake-store/include/master_client.h | Declares MasterClient upsert APIs. |
| mooncake-store/src/client_service.cpp | Adds Client::Upsert and Client::BatchUpsert (loop-based). |
| mooncake-store/include/client_service.h | Declares Client upsert APIs. |
| mooncake-store/src/real_client.cpp | Adds RealClient upsert variants (upsert, upsert_from, batch_upsert_from, upsert_parts) and dummy helpers. |
| mooncake-store/include/real_client.h | Declares RealClient upsert methods + internal helpers. |
| mooncake-store/src/dummy_client.cpp | Adds dummy-client upsert entry points (one method left unimplemented). |
| mooncake-store/include/dummy_client.h | Declares dummy-client upsert methods. |
| mooncake-store/include/pyclient.h | Extends PyClient interface with upsert methods. |
| mooncake-store/include/types.h | Adds ErrorCode::OBJECT_REPLICA_BUSY. |
| mooncake-store/src/types.cpp | Adds OBJECT_REPLICA_BUSY to toString(). |
| mooncake-store/include/replica.h | Adds Replica::is_busy() and mark_processing() helpers for upsert safety and in-place transitions. |
| mooncake-store/tests/master_service_test.cpp | Adds unit tests for upsert scenarios (new key, same size, size change, conflicts, preemption, revoke, batch). |
| mooncake-integration/store/store_py.cpp | Adds Python bindings for upsert (raw bytes + tensor helpers). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Safety check: reject if offloading in progress | ||
| if (shard->offloading_tasks.count(key) > 0) { | ||
| LOG(INFO) << "key=" << key | ||
| << ", error=object_has_offloading_task"; |
| std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>> | ||
| MasterService::BatchUpsertStart( | ||
| const UUID& client_id, const std::vector<std::string>& keys, | ||
| const std::vector<uint64_t>& slice_lengths, | ||
| const ReplicateConfig& config) { | ||
| std::vector<tl::expected<std::vector<Replica::Descriptor>, ErrorCode>> | ||
| results; | ||
| results.reserve(keys.size()); | ||
| for (size_t i = 0; i < keys.size(); ++i) { | ||
| results.emplace_back( | ||
| UpsertStart(client_id, keys[i], slice_lengths[i], config)); | ||
| } |
| if (!client_) { | ||
| LOG(ERROR) << "Client is not initialized"; | ||
| return tl::unexpected(ErrorCode::INVALID_PARAMS); | ||
| } |
| const ReplicateConfig& config) { | ||
| // TODO: implement this function | ||
| return -1; |
| return self.store_->upsert( | ||
| key, | ||
| std::span<const char>(static_cast<char *>(info.ptr), | ||
| static_cast<size_t>(info.size)), |
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
RFC: #1645
Implement upsert operations in MasterService that allow updating existing keys without the overhead of Remove + Put. Three paths:
Safety checks prevent breaking the immutability invariant:
Description
Module
mooncake-transfer-engine)mooncake-store)mooncake-ep)mooncake-integration)mooncake-p2p-store)mooncake-wheel)mooncake-pg)mooncake-rl)Type of Change
How Has This Been Tested?
Checklist
./scripts/code_format.shbefore submitting.