CurvineIO · lzjqsdd · Apr 9, 2026 · Apr 9, 2026
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/connection-overhead.png b/blog/2026-04-06-curvine-metadata-benchmark/connection-overhead.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/index.md b/blog/2026-04-06-curvine-metadata-benchmark/index.md
@@ -0,0 +1,105 @@
+# Curvine Benchmark: 300 Million Files in Just 38 GB of Memory
+
+*Translated from the original Chinese article published on April 6, 2026.*
+
+In distributed file systems, metadata memory efficiency, concurrent request handling, and small-file throughput are core indicators of overall system capability. Curvine recently completed a high-intensity metadata benchmark. The results show that Curvine's metadata memory efficiency ranks among the best in open source, with core capabilities comparable to commercial distributed storage products.
+
+### 🔥 Key Takeaways
+
+- **Efficient memory usage**: With **800,000 directories** and **300 million files**, and one block written per file, Curvine used just **38 GB** of memory. That is roughly on par with the metadata-memory capability described for the commercial edition of JuiceFS in reference [1].
+- **Low latency under massive concurrency**: With **100,000 clients** running operations in a loop, throughput held steady at **53,000 ops/s**. Average command latency stayed **below 2 ms**, and **P99 latency stayed below 9 ms**.
+- **High small-file throughput**: Under heavy concurrent small-file writes, Curvine sustained **12 million small files per hour**, with an average write time of **0.3 ms per file**.
+
+## 📝 Test Setup
+
+- **Curvine cluster**: one Master and one Worker
+- **Benchmark machine**: Alibaba Cloud `ecs.i5.8xlarge`, 32 cores, 256 GB RAM
+- **Clients**: 100,000 FUSE clients
+- **Operations**: repeated high-frequency commands such as `mkdir`, `touch`, file writes, and `ls`
+
+## 📊 Core Benchmark Results
+
+### 🧠 Memory Efficiency: A New Open-Source High-Water Mark
+
+- Managed scale: **800,000 directories + 300 million files**
+- Per-file data written: **1 block**
+- Total memory usage: **38 GB**
+- Comparison point: roughly on par with the metadata-memory capability described for the commercial edition of JuiceFS [1]
+
+![Memory efficiency benchmark](./memory-efficiency.png)
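+
+As a rough sanity check, the figures above can be turned into an average memory cost per namespace entry. This is a sketch under two assumptions that the post does not state: that "38 GB" is measured in binary units (GiB), and that the whole figure is attributed to files and directories. The 512 GB and 10 billion values are the roadmap target listed under Future Directions.
+
+```python
+# Back-of-the-envelope estimate; the unit interpretation is an assumption.
+GIB = 1024 ** 3
+
+
+def bytes_per_entry(memory_gib: float, entries: int) -> float:
+    """Average Master memory per namespace entry, in bytes."""
+    return memory_gib * GIB / entries
+
+
+files = 300_000_000
+dirs = 800_000
+
+# Benchmark result: 38 GB for 300 million files plus 800,000 directories.
+current = bytes_per_entry(38, files + dirs)
+
+# Roadmap target: 10 billion files on a 512 GB machine.
+target = bytes_per_entry(512, 10_000_000_000)
+
+print(f"benchmark:      {current:.0f} bytes per entry")
+print(f"roadmap target: {target:.0f} bytes per file")
+print(f"reduction needed: {current / target:.1f}x")
+```
+
+Under these assumptions the benchmark works out to roughly 136 bytes per entry, and the roadmap target to roughly 55 bytes per file, so reaching it would take about a 2.5x reduction in per-file memory.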
+
+### ⏱️ High Concurrency, Low Latency at 100,000 Clients
+
+- Concurrent clients: **100,000 FUSE clients**
+- Stable throughput: **53,000 ops/s**
+- Average latency: **no more than 2 ms**
+- P99 latency: **no more than 9 ms**
+
+![QPS under concurrency](./qps.png)
+
+![Latency under concurrency](./latency.png)
+
+Connection overhead was also low: **100,000 live connections consumed only 1.1 GB**, or about **11.5 KB per connection**.
+
+![Connection overhead](./connection-overhead.png)
+
+Once the benchmark stopped, Master memory dropped immediately from **39.1 GB** back to **38 GB**.
+
+![Master memory after benchmark stop](./master-memory-recovery.png)
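+
+The two memory figures above can be cross-checked with simple arithmetic. The sketch below assumes binary units (GiB and KiB), which the post does not state explicitly; that reading is the one that reproduces the 11.5 KB figure.
+
+```python
+# Cross-check of the connection-overhead numbers (units are an assumption).
+GIB = 1024 ** 3
+KIB = 1024
+
+connections = 100_000
+
+# 1.1 GB of connection memory, read as binary units.
+per_conn_kib = 1.1 * GIB / connections / KIB
+
+# The same figure read as decimal units, for comparison.
+per_conn_kb_decimal = 1.1e9 / connections / 1000
+
+# Memory released when the benchmark stopped: 39.1 GB down to 38 GB.
+released_gb = 39.1 - 38.0
+
+print(f"binary units:  {per_conn_kib:.1f} KiB per connection")
+print(f"decimal units: {per_conn_kb_decimal:.1f} KB per connection")
+print(f"released after stop: {released_gb:.1f} GB")
+```
+
+The memory released after the benchmark stopped equals the 1.1 GB attributed to connections, which is consistent with the Master's steady-state 38 GB being metadata rather than per-connection state.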
+
+### 🚀 Small-File Throughput: Built for Scale
+
+- Files written per hour: **12 million small files**
+- Average write time per file: **0.3 ms**
+- Throughput remained saturated even under high concurrency
+
+At **15:00**, Curvine had written **287 million files**:
+
+![Small-file count at 15:00](./small-file-count-1500.png)
+
+At **16:00**, the total had reached **299 million files**:
+
+![Small-file count at 16:00](./small-file-count-1600.png)
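+
+The hourly rate and the per-file time both follow from the two snapshots above. A minimal check:
+
+```python
+# Derive the throughput figures from the 15:00 and 16:00 file counts.
+files_at_15 = 287_000_000
+files_at_16 = 299_000_000
+
+written_in_hour = files_at_16 - files_at_15      # files written in one hour
+files_per_second = written_in_hour / 3600        # aggregate write rate
+ms_per_file = 3600 * 1000 / written_in_hour      # wall-clock time per file
+
+print(f"{written_in_hour:,} files per hour")
+print(f"{files_per_second:,.0f} files per second")
+print(f"{ms_per_file:.1f} ms per file")
+```
+
+Computed this way, 0.3 ms is the inverse of aggregate throughput across all clients. It should probably not be read as the latency a single client observes for one write.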
+
+## 🏗️ Metadata Architecture
+
+Curvine's metadata subsystem stands out for memory efficiency at scale and for high-concurrency performance, and it compares favorably with other open-source systems. These results come from a deliberately designed metadata architecture.
+
+![Curvine metadata architecture overview](./metadata-architecture.png)
+
+### 💡 Design Principles
+
+1. A single Master should support very large namespaces and massive numbers of small files.
+2. The system should provide high concurrency and low latency for frequent operations such as create, delete, and update.
+3. External dependencies should be minimized to reduce operational complexity while keeping the system stable.
+
+Based on those goals, Curvine combines an **in-memory directory tree**, **standalone RocksDB**, and a **Raft-based consistency mechanism**. This three-layer design balances performance, scale, and stability.
+
+| Layer | Core Responsibility | Why It Exists |
+| --- | --- | --- |
+| In-memory directory tree | Stores directory structure metadata such as directory names and parent-child relationships; handles path resolution, directory listing, and other high-frequency namespace operations | Keeps the hottest namespace operations in memory so directory lookups and path matching stay in the microsecond range; stores only lightweight directory structure to maximize scale |
+| Metadata RocksDB (`inode` engine) | Persists complete file and directory metadata, including file size, permissions, `mtime`, block locations, and full directory relationships | Uses column families to separate different metadata types, improving read/write efficiency and making frequent metadata updates easier to manage |
+| Raft log RocksDB | Persists the log of all metadata mutations, including create, delete, and update operations, in order for node-to-node synchronization | Separates log storage from metadata storage so replication, compaction, cleanup, and recovery do not interfere with metadata reads and writes |
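+
+To make the division of labor in the table concrete, here is a toy model of the three layers. It is only an illustration, not Curvine's implementation: `ToyMaster`, `DirNode`, and every method name are hypothetical, plain Python containers stand in for the two RocksDB instances, and the "log first, then store" ordering is an assumption based on how Raft-style systems usually apply mutations.
+
+```python
+from dataclasses import dataclass, field
+
+
+@dataclass
+class DirNode:
+    """In-memory layer: only directory names and parent-child links."""
+    name: str
+    children: dict = field(default_factory=dict)
+
+
+class ToyMaster:
+    def __init__(self):
+        self.root = DirNode("/")
+        self.inodes = {}    # stand-in for the metadata RocksDB
+        self.raft_log = []  # stand-in for the Raft log RocksDB
+
+    def _resolve(self, path):
+        """Walk the in-memory tree; no store lookup is needed."""
+        node = self.root
+        for part in [p for p in path.split("/") if p]:
+            node = node.children[part]  # KeyError if a directory is missing
+        return node
+
+    def _apply(self, op, path, attrs):
+        # Assumed ordering: append to the ordered log, then persist the record.
+        self.raft_log.append((len(self.raft_log) + 1, op, path))
+        self.inodes[path] = attrs
+
+    def mkdir(self, path):
+        parent, _, name = path.rstrip("/").rpartition("/")
+        parent_node = self._resolve(parent)
+        self._apply("mkdir", path, {"type": "dir", "mode": 0o755})
+        parent_node.children[name] = DirNode(name)
+
+    def create(self, path, size=0):
+        parent, _, _ = path.rpartition("/")
+        self._resolve(parent)  # the parent directory must already exist
+        self._apply("create", path, {"type": "file", "size": size, "mode": 0o644})
+
+    def list_dirs(self, path):
+        """List subdirectories straight from memory."""
+        return sorted(self._resolve(path).children)
+
+
+master = ToyMaster()
+master.mkdir("/data")
+master.mkdir("/data/logs")
+master.create("/data/logs/a.txt", size=10)
+
+print(master.list_dirs("/data"))  # ['logs']
+print(len(master.raft_log))       # 3
+```
+
+In this sketch, files never enter the in-memory tree, so memory grows with the number of directories while full file records live in the store. How Curvine actually serves file listings and block lookups is not described in this post.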
+
+### 🛡️ FsMode: Working with UFS for Safe Durability
+
+Curvine also supports **FsMode**, which synchronizes metadata and file data to the underlying unified file system (UFS). This creates a dual safety model of **local storage plus disk-backed fallback**, protecting against data loss without sacrificing runtime performance.
+
+## 🚀 Future Directions
+
+Curvine's metadata system will keep pushing forward in three areas:
+
+1. **10 billion files on a single node**: continue deepening single-node capability until a standard **512 GB** memory machine can manage metadata for **10 billion files**.
+2. **Federation**: scale metadata across a cluster with an HDFS Federation-like model [2] that partitions the namespace by directory and can grow beyond **100 billion files**. Federation is especially well suited to centralized metadata operations such as `mv` and `ls`, but it requires directory planning up front.
+3. **Pluggable metadata management**: abstract the metadata interface and support pluggable metadata backends for better flexibility and adaptability.
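+
+To illustrate what "partitions by directory" could look like for the Federation item above, here is a minimal mount-table sketch in the style of HDFS federation. Curvine's design for this is not described in the post, so `MountTable`, the mount points, and the master names are all hypothetical.
+
+```python
+class MountTable:
+    """Maps directory prefixes to metadata masters (hypothetical sketch)."""
+
+    def __init__(self, mounts):
+        # Check longer prefixes first so the most specific mount wins.
+        self.mounts = sorted(mounts.items(), key=lambda kv: len(kv[0]), reverse=True)
+
+    def route(self, path):
+        for prefix, master in self.mounts:
+            if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
+                return master
+        raise KeyError(f"no mount covers {path}")
+
+    def is_local_rename(self, src, dst):
+        """A rename stays on one master only if both paths share a partition."""
+        return self.route(src) == self.route(dst)
+
+
+table = MountTable({"/": "master-0", "/logs": "master-1", "/models": "master-2"})
+
+print(table.route("/logs/2026/app.log"))                  # master-1
+print(table.route("/tmp/scratch"))                        # master-0
+print(table.is_local_rename("/logs/a", "/logs/b"))        # True
+print(table.is_local_rename("/logs/a", "/models/a"))      # False
+```
+
+In a scheme like this, `ls` and `mv` inside one partition touch a single master, while a rename across partitions does not. That is one way to see why the directory layout has to be planned before data is loaded.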
+
+## 📚 References
+
+1. https://mp.weixin.qq.com/s/zbBUQ4P53PPWQjOHQmw8uw
+2. https://hadoop.apache.org/docs/r3.4.0/hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html
+
+### 👇 Follow Us
+
+We regularly share hands-on work on distributed storage, metadata optimization, and high-concurrency benchmarking.
+
+GitHub: https://github.com/CurvineIO/curvine
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/latency.png b/blog/2026-04-06-curvine-metadata-benchmark/latency.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/master-memory-recovery.png b/blog/2026-04-06-curvine-metadata-benchmark/master-memory-recovery.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/memory-efficiency.png b/blog/2026-04-06-curvine-metadata-benchmark/memory-efficiency.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/metadata-architecture.png b/blog/2026-04-06-curvine-metadata-benchmark/metadata-architecture.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/qps.png b/blog/2026-04-06-curvine-metadata-benchmark/qps.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1500.png b/blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1500.png
diff --git a/blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1600.png b/blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1600.png
diff --git a/...ugin-content-blog/2026-04-06-curvine-metadata-benchmark/connection-overhead.png b/...ugin-content-blog/2026-04-06-curvine-metadata-benchmark/connection-overhead.png
diff --git a/...n/docusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/index.md b/...n/docusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/index.md
@@ -0,0 +1,103 @@
+# Curvine 压测：3 亿文件仅占 38G 内存，开源项目天花板
+
+在分布式文件系统领域，元数据的内存效率、并发处理能力、小文件吞吐性能，一直是衡量产品核心能力的关键指标。近期，Curvine 完成了一组高规格元数据压测，结果显示：Curvine 的元数据内存效率达到了开源项目中的顶尖水平，核心能力可与商业版分布式存储产品相当。
+
+### 🔥 开篇结论
+
+- **内存高效利用**：在 **80 万目录**、**3 亿文件**、每个文件写入一个 block 的条件下，Curvine 仅占用 **38G** 内存，与参考材料 [1] 中 JuiceFS 商业版的元数据能力大致相当。
+- **高并发低延迟**：在 **10 万客户端** 循环操作的压力下，QPS 稳定在 **5.3 万每秒**，命令操作**平均时延低于 2ms**，**P99 时延低于 9ms**。
+- **小文件高吞吐**：高并发写入大量小文件时，Curvine 可实现**每小时写入 1200 万小文件**，平均写入一个小文件仅需 **0.3ms**。
+
+## 📝 测试条件
+
+- **Curvine 集群**：一台 Master，一台 Worker
+- **测试机型**：阿里云 `ecs.i5.8xlarge`，32 核，256G 内存
+- **客户端**：10 万个 FUSE 客户端
+- **操作**：客户端循环执行 `mkdir`、`touch`、写文件、`ls` 等高频命令
+
+## 📊 核心压测数据
+
+### 🧠 内存效率：开源第一梯队
+
+- 管理规模：**80 万目录 + 3 亿文件**
+- 单文件写入：**1 个 block**
+- 内存占用：**仅 38G**
+- 对标结论：与 JuiceFS 商业版的元数据内存能力相当
+
+![内存效率压测](./memory-efficiency.png)
+
+### ⏱️ 高并发低延迟：10 万客户端快跑稳跑
+
+- 并发客户端：**10 万 FUSE 客户端**
+- 稳定吞吐：**5.3 万次/秒**
+- 平均时延：**不超过 2ms**
+- P99 时延：**不超过 9ms**
+
+![高并发 QPS](./qps.png)
+
+![高并发时延](./latency.png)
+
+连接开销同样很低：**10 万连接仅消耗 1.1G 内存**，平均每个连接约 **11.5KB**。
+
+![连接开销](./connection-overhead.png)
+
+压测停止后，Master 内存会立刻从 **39.1G** 回落到 **38G**。
+
+![压测停止后的 Master 内存](./master-memory-recovery.png)
+
+### 🚀 小文件高吞吐：海量场景无压力
+
+- 每小时写入：**1200 万小文件**
+- 单文件平均写入时延：**0.3ms**
+- 高并发下吞吐持续打满
+
+**15:00** 时，Curvine 已写入 **2.87 亿文件**：
+
+![15 点文件总量](./small-file-count-1500.png)
+
+**16:00** 时，文件总量达到 **2.99 亿**：
+
+![16 点文件总量](./small-file-count-1600.png)
+
+## 🏗️ 元数据架构
+
+Curvine 的元数据能力不仅在大规模内存效率和高并发性能上表现突出，与其他开源产品相比也具备明显优势。其背后是一套经过精心设计的元数据架构。
+
+![Curvine 元数据架构概览](./metadata-architecture.png)
+
+### 💡 设计理念
+
+1. 单 Master 支撑大规模文件与海量小文件。
+2. 以高并发、低延迟应对频繁的创建、删除、修改等高频元数据操作。
+3. 尽量减少对外部组件的依赖，降低运维复杂度，同时保证系统稳定性。
+
+基于这些目标，Curvine 选择了 **内存目录树 + 单机 RocksDB + Raft 一致性机制** 的三层组合，在性能、规模和稳定性之间取得平衡。
+
+| 层次 | 核心职责 | 设计动机 |
+| --- | --- | --- |
+| 内存目录树 | 存储目录结构信息，包括目录名、父子关系，并处理路径解析、目录列举等高频操作 | 将高频命名空间操作放在内存中，把目录查询和路径匹配延迟控制在微秒级；只维护轻量目录结构，最大化可支撑规模 |
+| 元数据 RocksDB（`inode` 引擎） | 持久化文件和目录的完整元数据，包括文件大小、权限、`mtime`、block 位置以及完整目录关系 | 通过列族机制拆分不同类型的元数据，提升读写效率，并更好地适配频繁的元数据更新 |
+| Raft 日志 RocksDB | 持久化所有元数据修改日志，包括创建、删除、更新等操作，并按顺序用于多节点同步 | 将日志存储与元数据存储完全隔离，避免互相干扰，同时便于同步、压缩、清理和故障恢复 |
+
+### 🛡️ FsMode：与 UFS 协同，保障数据兜底安全
+
+Curvine 支持 **FsMode**，会将元数据和文件数据同步到底层统一文件系统（UFS），形成**本地存储 + 磁盘兜底**的双重保障，在不影响系统性能的前提下避免数据丢失。
+
+## 🚀 未来演进方向
+
+Curvine 的元数据能力还会继续向前推进，重点包括三个方向：
+
+1. **单机百亿**：继续深挖单机能力，让普通 **512G 内存** 机器也能支撑 **百亿级文件元数据**。
+2. **联邦 Federation**：增强元数据集群扩展性，采用类似 HDFS Federation 的模式，通过目录拆分支撑 **千亿级以上** 的规模。该模式对 `mv`、`ls` 等集中式元数据操作尤其友好，但需要在初始化时规划目录结构。
+3. **插件式元数据管理**：抽象元数据接口，支持插件化元数据后端，进一步提升灵活性和适配能力。
+
+## 📚 参考材料
+
+1. https://mp.weixin.qq.com/s/zbBUQ4P53PPWQjOHQmw8uw
+2. https://hadoop.apache.org/docs/r3.4.0/hadoop-project-dist/hadoop-hdfs-rbf/HDFSRouterFederation.html
+
+### 👇 关注我们
+
+我们会持续分享分布式存储、元数据优化和高并发压测等实战内容。
+
+GitHub：https://github.com/CurvineIO/curvine
diff --git a/...ocusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/latency.png b/...ocusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/latency.png
diff --git a/...n-content-blog/2026-04-06-curvine-metadata-benchmark/master-memory-recovery.png b/...n-content-blog/2026-04-06-curvine-metadata-benchmark/master-memory-recovery.png
diff --git a/...plugin-content-blog/2026-04-06-curvine-metadata-benchmark/memory-efficiency.png b/...plugin-content-blog/2026-04-06-curvine-metadata-benchmark/memory-efficiency.png
diff --git a/...in-content-blog/2026-04-06-curvine-metadata-benchmark/metadata-architecture.png b/...in-content-blog/2026-04-06-curvine-metadata-benchmark/metadata-architecture.png
diff --git a/...cn/docusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/qps.png b/...cn/docusaurus-plugin-content-blog/2026-04-06-curvine-metadata-benchmark/qps.png
diff --git a/...in-content-blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1500.png b/...in-content-blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1500.png
diff --git a/...in-content-blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1600.png b/...in-content-blog/2026-04-06-curvine-metadata-benchmark/small-file-count-1600.png