E2E storage backend test by maheshrbapatu · Pull Request #1660 · kvcache-ai/Mooncake

maheshrbapatu · 2026-03-15T03:14:27Z

Description

Implemented a new storage_backend_e2e_test that validates the storage demotion pipeline using an in-process master (InProcMaster) instead of external master + etcd orchestration.

Key changes:

Added storage_backend_e2e_test target and test registration in CMakeLists.txt.
Added new E2E coverage in storage_backend_e2e_test.cpp for:
offload and readback
cross-client readback
disk-only read after eviction
disk replica survival after writer exit
graceful behavior under quota pressure
Extended client_wrapper.h and client_wrapper.cpp with replica inspection helpers (HasDiskReplica, HasMemoryReplica, etc.) and GetWithExpectedSize to support disk-only assertions.
Stabilized eviction coverage by enforcing valid mount sizing and asserting eviction behavior in a way that matches the implementation (at least one expired seed key evicted, then validated readable from disk).

Module

Type of Change

How Has This Been Tested?

This change itself is about e2e tests for testing storage backend.

Checklist

I have performed a self-review of my own code.
I have formatted my own code using ./scripts/code_format.sh before submitting.
I have updated the documentation.
I have added tests to prove my changes are effective.

gemini-code-assist · 2026-03-15T03:14:43Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of the storage backend by introducing a comprehensive suite of end-to-end tests. These tests validate critical aspects of the storage demotion pipeline, ensuring data integrity and accessibility even when memory replicas are evicted or client processes fail. The new test framework utilizes an in-process master, simplifying test orchestration and improving reliability.

Highlights

New E2E Test Suite: Introduced a new end-to-end test suite, storage_backend_e2e_test, to validate the storage demotion pipeline using an in-process master, simplifying test orchestration and improving reliability.
Expanded Test Coverage: Expanded test coverage to include critical scenarios such as offload and readback, cross-client readback, disk-only reads after eviction, disk replica survival after writer exit, and graceful behavior under quota pressure.
Client Wrapper Enhancements: Enhanced client_wrapper with new helper functions (HasDiskReplica, HasMemoryReplica, GetWithExpectedSize) for comprehensive replica inspection and enabling disk-only assertions.
Eviction Test Stability: Improved eviction test stability by enforcing valid mount sizing and asserting eviction behavior more precisely to match the implementation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

mooncake-store/tests/e2e/CMakeLists.txt
- Added storage_backend_e2e_test as a new executable target.
- Registered storage_backend_e2e_test as a CTest test.
mooncake-store/tests/e2e/client_wrapper.cpp
- Modified Client::Create call to pass std::nullopt for an optional parameter.
- Implemented HasDiskReplica to check for disk-only replicas.
- Implemented HasLocalDiskReplica to check for local disk replicas.
- Implemented HasMemoryReplica to check for memory replicas.
- Implemented GetWithExpectedSize to retrieve data from disk-only replicas.
mooncake-store/tests/e2e/client_wrapper.h
- Declared HasDiskReplica, HasLocalDiskReplica, and HasMemoryReplica methods.
- Declared GetWithExpectedSize method for reading disk-only replicas.
mooncake-store/tests/e2e/storage_backend_e2e_test.cpp
- Added a new E2E test file storage_backend_e2e_test.cpp.
- Defined StorageBackendE2ETest class inheriting from ::testing::Test.
- Included test cases for OffloadAndReadback, CrossClientReadback, DiskOnlyReadAfterEviction, DiskReplicaSurvivesWriterDeath, and QuotaExhaustionGracefulBehavior.

Activity

The pull request was created by maheshrbapatu.
No human activity (comments, reviews) has occurred yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a valuable end-to-end test suite for the storage backend, which is crucial for validating the storage demotion pipeline. The tests cover key scenarios including offload/readback, cross-client access, eviction, and fault tolerance. The implementation is solid, but I have a few suggestions to enhance code quality, correctness, and test robustness. Specifically, there's a bug in the client wrapper creation where a parameter is ignored, a test that could be made stricter, and opportunities to improve code consistency and reduce duplication.

gemini-code-assist · 2026-03-15T03:16:15Z

mooncake-store/tests/e2e/client_wrapper.cpp

+    auto client_opt = Client::Create(hostname, metadata_connstring, protocol,
+                                     std::nullopt, master_server_entry);


The device_name parameter is unused in CreateClientWrapper. It should be passed to Client::Create to properly configure the client with the specified device. The current implementation hardcodes std::nullopt, which ignores the device name provided by the caller in the new test.

Suggested change

auto client_opt = Client::Create(hostname, metadata_connstring, protocol,

std::nullopt, master_server_entry);

auto client_opt = Client::Create(hostname, metadata_connstring, protocol,

device_name, master_server_entry);

gemini-code-assist · 2026-03-15T03:16:15Z

mooncake-store/tests/e2e/storage_backend_e2e_test.cpp

+            auto err = reader->GetWithExpectedSize(keys[i], kValueSize, got);
+            if (err == ErrorCode::OK) {
+                EXPECT_EQ(got, expected_values[i])
+                    << "Data mismatch for " << keys[i];
+            } else {
+                LOG(WARNING)
+                    << "Key " << keys[i] << " not readable after writer "
+                    << "death: " << toString(err);
+            }


The test DiskReplicaSurvivesWriterDeath should fail if reading from the disk replica fails after the writer's death. Currently, it only logs a warning, which could mask a regression. It's better to use ASSERT_EQ to ensure the read operation is successful, making the test more robust.

std::string got; ASSERT_EQ(reader->GetWithExpectedSize(keys[i], kValueSize, got), ErrorCode::OK) << "Failed to read key " << keys[i] << " after writer death"; EXPECT_EQ(got, expected_values[i]) << "Data mismatch for " << keys[i];

gemini-code-assist · 2026-03-15T03:16:15Z

mooncake-store/tests/e2e/client_wrapper.cpp

+bool ClientTestWrapper::HasDiskReplica(const std::string& key) {
+    auto query_result = client_->Query(key);
+    if (!query_result.has_value()) return false;
+    for (const auto& replica : query_result.value().replicas) {
+        if (replica.is_disk_replica()) return true;
+    }
+    return false;
+}
+
+bool ClientTestWrapper::HasLocalDiskReplica(const std::string& key) {
+    auto query_result = client_->Query(key);
+    if (!query_result.has_value()) return false;
+    for (const auto& replica : query_result.value().replicas) {
+        if (replica.is_local_disk_replica()) return true;
+    }
+    return false;
+}
+
+bool ClientTestWrapper::HasMemoryReplica(const std::string& key) {
+    auto query_result = client_->Query(key);
+    if (!query_result.has_value()) return false;
+    for (const auto& replica : query_result.value().replicas) {
+        if (replica.is_memory_replica()) return true;
+    }
+    return false;
+}


These three functions (HasDiskReplica, HasLocalDiskReplica, HasMemoryReplica) have very similar implementations. You can reduce code duplication and improve readability by using std::any_of. This would make the code more concise and maintainable. Note that you might need to include <algorithm>.

bool ClientTestWrapper::HasDiskReplica(const std::string& key) { auto query_result = client_->Query(key); if (!query_result.has_value()) return false; const auto& replicas = query_result.value().replicas; return std::any_of(replicas.begin(), replicas.end(), [](const auto& r) { return r.is_disk_replica(); }); } bool ClientTestWrapper::HasLocalDiskReplica(const std::string& key) { auto query_result = client_->Query(key); if (!query_result.has_value()) return false; const auto& replicas = query_result.value().replicas; return std::any_of(replicas.begin(), replicas.end(), [](const auto& r) { return r.is_local_disk_replica(); }); } bool ClientTestWrapper::HasMemoryReplica(const std::string& key) { auto query_result = client_->Query(key); if (!query_result.has_value()) return false; const auto& replicas = query_result.value().replicas; return std::any_of(replicas.begin(), replicas.end(), [](const auto& r) { return r.is_memory_replica(); }); }

gemini-code-assist · 2026-03-15T03:16:15Z

mooncake-store/tests/e2e/storage_backend_e2e_test.cpp

+    }
+
+    // Wait for client TTL expiry so master cleans up the writer.
+    sleep(DEFAULT_CLIENT_LIVE_TTL_SEC + 5);


For consistency with the rest of the file, which uses std::this_thread::sleep_for, consider replacing the C-style sleep() call with its modern C++ equivalent. This improves code consistency and type safety.

Suggested change

sleep(DEFAULT_CLIENT_LIVE_TTL_SEC + 5);

std::this_thread::sleep_for(std::chrono::seconds(DEFAULT_CLIENT_LIVE_TTL_SEC + 5));

codecov-commenter · 2026-03-15T03:49:52Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 91.73913% with 19 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...ncake-store/tests/e2e/storage_backend_e2e_test.cpp	94.00%	12 Missing ⚠️
mooncake-store/tests/e2e/client_wrapper.cpp	76.66%	7 Missing ⚠️

📢 Thoughts on this report? Let us know!

ykwd · 2026-03-16T03:41:18Z

Thanks a lot. This is a very thorough test! We will run it locally to conduct more detailed testing of the storage backend.

e2e storage backend test

a237fea

maheshrbapatu requested review from XucSh, YiXR, stmatengss and ykwd as code owners March 15, 2026 03:14

github-actions bot added run-ci Store labels Mar 15, 2026

maheshrbapatu requested a review from zhangzuo21 March 15, 2026 03:15

gemini-code-assist bot reviewed Mar 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E2E storage backend test#1660

E2E storage backend test#1660
maheshrbapatu wants to merge 1 commit intokvcache-ai:mainfrom
maheshrbapatu:e2e_storage_backend

maheshrbapatu commented Mar 15, 2026

Uh oh!

gemini-code-assist bot commented Mar 15, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Mar 15, 2026

Uh oh!

gemini-code-assist bot Mar 15, 2026

Uh oh!

gemini-code-assist bot Mar 15, 2026

Uh oh!

gemini-code-assist bot Mar 15, 2026

Uh oh!

codecov-commenter commented Mar 15, 2026

Uh oh!

ykwd commented Mar 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		auto client_opt = Client::Create(hostname, metadata_connstring, protocol,
		std::nullopt, master_server_entry);

	sleep(DEFAULT_CLIENT_LIVE_TTL_SEC + 5);
	std::this_thread::sleep_for(std::chrono::seconds(DEFAULT_CLIENT_LIVE_TTL_SEC + 5));

Conversation

maheshrbapatu commented Mar 15, 2026

Description

Module

Type of Change

How Has This Been Tested?

Checklist

Uh oh!

gemini-code-assist bot commented Mar 15, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 15, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Mar 15, 2026

Codecov Report

Uh oh!

ykwd commented Mar 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants