
Persistent file handles in VFS write path #5764

Open
kounelisagis wants to merge 5 commits into main from agis/buffered-io

Conversation

@kounelisagis
Member

Previously, every VFS::write() to a local file opened and closed the file descriptor (POSIX) or HANDLE (Windows) on each call. For write patterns with many small sequential writes (e.g., tile-at-a-time ingestion), this per-call syscall overhead adds up significantly.

This change introduces persistent file handle caching at the local filesystem layer. The fd/HANDLE is opened on the first write(), kept alive across subsequent writes to the same path, and only fsync()'d and closed on flush(). The destructor cleans up any handles that were not explicitly flushed.
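The caching described above might look roughly like the following sketch on the POSIX side (names such as `PosixWriteCache` and `open_files_` are illustrative, not necessarily the exact code in this patch; error handling and locking are elided):

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <map>
#include <string>
#include <unistd.h>

// Illustrative sketch of the cached-handle write path (POSIX side).
// A real implementation would also need error handling and a mutex.
class PosixWriteCache {
 public:
  ~PosixWriteCache() {
    // Clean up any handles that were not explicitly flushed.
    for (auto& [path, of] : open_files_)
      ::close(of.fd);
  }

  // Open the fd on the first write; reuse it for subsequent writes.
  void write(const std::string& path, const void* buf, uint64_t len) {
    auto it = open_files_.find(path);
    if (it == open_files_.end()) {
      int fd = ::open(path.c_str(), O_WRONLY | O_CREAT, 0644);
      assert(fd >= 0);
      // Resume at the current end of the file.
      uint64_t offset = static_cast<uint64_t>(::lseek(fd, 0, SEEK_END));
      it = open_files_.emplace(path, OpenFile{fd, offset}).first;
    }
    ssize_t n = ::pwrite(it->second.fd, buf, len, it->second.offset);
    assert(n == static_cast<ssize_t>(len));
    it->second.offset += len;
  }

  // fsync and close only on flush().
  void flush(const std::string& path) {
    auto it = open_files_.find(path);
    if (it == open_files_.end())
      return;
    ::fsync(it->second.fd);
    ::close(it->second.fd);
    open_files_.erase(it);
  }

 private:
  struct OpenFile {
    int fd;
    uint64_t offset;
  };
  std::map<std::string, OpenFile> open_files_;
};
```

A caller would issue several `write()` calls to the same path and a single `flush()` at the end, which is when durability is requested and the handle is released.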

Standalone microbenchmark (macOS, Apple Silicon) comparing per-write open/close versus cached file handle:

| Workload | Before | After | Speedup |
| --- | --- | --- | --- |
| 1,000 × 4 KB writes | 36.7 ms | 17.2 ms | 2.1× |
| 5,000 × 4 KB writes | 224 ms | 94.6 ms | 2.4× |
| 10,000 × 512 B writes | 217 ms | 66.5 ms | 3.3× |

Measured with 10 trials (3 warmup) writing sequentially to a single file using pwrite(), comparing per-write open()/close() against a single cached fd, without fsync between writes (only at final flush).

The performance improvement increases as write size decreases (i.e., when the workload is more syscall-dominated), which matches the typical compressed-tile scenario.
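A minimal harness in the spirit of the methodology described above (a sketch, not the author's actual benchmark code) would time N sequential `pwrite()`s with and without a cached fd:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <vector>

// Sketch of the described microbenchmark: n_writes sequential pwrite()s,
// either reopening the file per write or keeping one cached fd, with
// fsync only at the final flush. Returns elapsed wall time in ms.
static double run_trial(const std::string& path, int n_writes,
                        size_t write_size, bool cached) {
  std::vector<char> buf(write_size, 'x');
  ::unlink(path.c_str());
  auto start = std::chrono::steady_clock::now();
  int fd = -1;
  uint64_t offset = 0;
  for (int i = 0; i < n_writes; i++) {
    if (fd < 0)
      fd = ::open(path.c_str(), O_WRONLY | O_CREAT, 0644);
    assert(fd >= 0);
    ::pwrite(fd, buf.data(), write_size, offset);
    offset += write_size;
    if (!cached) {  // per-write open()/close() (the "before" case)
      ::close(fd);
      fd = -1;
    }
  }
  if (fd >= 0) {  // final flush only
    ::fsync(fd);
    ::close(fd);
  }
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Per the description, one would run 3 warmup trials followed by 10 measured trials per configuration, e.g. `run_trial("/tmp/bench.bin", 1000, 4096, false)` versus `run_trial("/tmp/bench.bin", 1000, 4096, true)`.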

Ref: #5732 (comment)
Closes CORE-497


TYPE: IMPROVEMENT
DESC: Persistent file handles in VFS write path.

@kounelisagis kounelisagis requested a review from rroelke February 24, 2026 11:08
@teo-tsirpanis
Member

Can you share your benchmark code? How many threads does it use to write files?

@teo-tsirpanis teo-tsirpanis left a comment

If we want to fix redundant file opens, we should more comprehensively refactor the VFS write API into something like this, which is much more performant and maintainable:

VFS vfs;
auto writer = vfs.create_writer(uri);
writer.write(buf1, len1);
writer.write(buf2, len2);
writer.write(buf3, len3);
writer.flush();

The existence of global¹ state in the VFS (and the class design that forces it) is an identified problem in this part of the library; doubling down on it is not a move in the right direction and will lead to bugs, performance issues², and more technical debt down the road.

Footnotes

  1. Global insofar as it is shared across operations in each VFS instance.

  2. I am not sure that performance would improve in real-world multi-threaded scenarios, due to the increased synchronization.
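The writer-object API proposed above could be sketched as follows (hypothetical names; `LocalFileWriter` is not an existing class in the codebase). Each writer owns its own descriptor, so no state is shared across operations in the VFS itself:

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical sketch of the reviewer's proposed writer-object API.
// The writer uniquely owns its fd; the VFS holds no cross-call state.
class LocalFileWriter {
 public:
  explicit LocalFileWriter(const std::string& path)
      : fd_(::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644))
      , offset_(0) {
    assert(fd_ >= 0);
  }

  // Non-copyable: two writers must never race to close the same fd.
  LocalFileWriter(const LocalFileWriter&) = delete;
  LocalFileWriter& operator=(const LocalFileWriter&) = delete;

  ~LocalFileWriter() {
    if (fd_ >= 0)
      ::close(fd_);
  }

  // Append-style sequential write at the tracked offset.
  void write(const void* buf, uint64_t len) {
    ssize_t n = ::pwrite(fd_, buf, len, offset_);
    assert(n == static_cast<ssize_t>(len));
    offset_ += len;
  }

  // Request durability without closing; the destructor releases the fd.
  void flush() {
    ::fsync(fd_);
  }

 private:
  int fd_;
  uint64_t offset_;
};
```

A `vfs.create_writer(uri)` factory would then return such an object, matching the `writer.write(...)` / `writer.flush()` usage shown in the comment.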

void Win::remove_dir(const URI& uri) const {
  auto path = uri.to_path();
  if (is_dir(uri)) {
    evict_cached_handles(path);

This looks like a bad idea, and subverts the operating system's file locking semantics. What would happen if another thread is writing to a file inside the directory at the same time?

Comment on lines +312 to +316
/** State for a file descriptor kept open between write() and flush(). */
struct OpenFile {
int fd;
uint64_t offset;
};

The struct should have a destructor (and be un-copiable and immovable).
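One way the struct could satisfy this review comment (a sketch, not the patch itself): give `OpenFile` a destructor that closes the descriptor and delete the copy and move operations, so two instances can never race to close the same fd.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Sketch of an OpenFile that owns its descriptor, per the review comment.
struct OpenFile {
  int fd;
  uint64_t offset;

  OpenFile(int fd, uint64_t offset)
      : fd(fd)
      , offset(offset) {
  }

  // Un-copiable and immovable: unique ownership of the fd.
  OpenFile(const OpenFile&) = delete;
  OpenFile& operator=(const OpenFile&) = delete;
  OpenFile(OpenFile&&) = delete;
  OpenFile& operator=(OpenFile&&) = delete;

  ~OpenFile() {
    if (fd >= 0)
      ::close(fd);
  }
};
```

Since the type is immovable, a `std::map<std::string, OpenFile>` would need to construct entries in place, e.g. via `try_emplace(path, fd, offset)`.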


The newly added logic should be moved to local.cc as much as currently possible, to reduce duplication across POSIX and Windows.

for (auto& [path, of] : open_files_) {
  ::close(of.fd);
}
open_files_.clear();

Suggested change
open_files_.clear();

No need to clear the map; it's going to be freed very soon either way.
