Reduce RAM consumption when downloading/uploading a file to file_size*2 #28

Closed
vetal4444 wants to merge 1 commit into Intrinsec:main from vetal4444:save_memory

@vetal4444

The current implementation consumes about 3× the file size when uploading to S3 (tested with a 1 GB file, memory usage was ~3.2 GB). Downloading is even worse: s3proxy is consistently killed by the OS OOM killer when trying to download a 1 GB file from S3. At that moment the server had 5–6 GB of free RAM, and s3proxy consumed all of it.

With the changes from this PR, s3proxy consumes about 2× the file size for both uploads and downloads.

@slig2008

Just in case somebody is interested: I have created another fork here and included this PR in it.

ynsta added a commit that referenced this pull request May 19, 2026
Multi-GB GET/PUT held plaintext and ciphertext simultaneously and
left the process RSS near peak long after each request, which tripped
OOM kills on hosts with tight memory budgets. Drop the slice
references between phases (post-decrypt, post-write, post-encrypt,
post-upload) and call runtime/debug.FreeOSMemory past a 100 MiB
threshold so freed pages actually return to the OS.

Also stop forwarding an empty x-amz-server-side-encryption header on
zero-byte PutObject responses; some upstreams (e.g. Hetzner Object
Storage) return an empty algorithm there and strict SDK clients
choke on the empty value.

Refs #28
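
The release step described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code that landed: the helper name is borrowed from the thread, but its signature, the boolean return value, and the choice of `cap` over `len` for the size check are assumptions made for the example.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// releaseThreshold mirrors the 100 MiB figure quoted in the commit message.
const releaseThreshold = 100 << 20

// releaseLargeBuffer drops the caller's reference to *buf so the garbage
// collector can reclaim it. For buffers at or above threshold it also calls
// debug.FreeOSMemory so the freed pages are handed back to the OS promptly.
// The signature is hypothetical; it reports whether FreeOSMemory was called.
func releaseLargeBuffer(buf *[]byte, threshold int) bool {
	if buf == nil || *buf == nil {
		return false
	}
	large := cap(*buf) >= threshold
	*buf = nil // drop the reference between phases
	if large {
		debug.FreeOSMemory()
	}
	return large
}

func main() {
	plaintext := make([]byte, 1<<20) // 1 MiB: below the real threshold
	fmt.Println(releaseLargeBuffer(&plaintext, releaseThreshold), plaintext == nil)
}
```

`debug.FreeOSMemory` forces a garbage collection before returning memory to the OS, so it is not cheap; gating it behind a size threshold keeps small requests from paying that cost.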
@ynsta

ynsta commented May 20, 2026

Contributor

Thanks @vetal4444 — the RAM-reduction work landed in #41 (released as part of v1.5.0+ / current main). Verified each technique against current upstream:

| Change from this PR | Where it lives on main today |
| --- | --- |
| Content-Length preallocation + io.ReadFull for body reads | s3proxy/internal/router/body.go (readBody helper) and s3proxy/internal/s3/s3.go:60-72 (middleware) |
| Skip cloning successful response bodies in the capture middleware (status < 400) | s3proxy/internal/s3/s3.go:83-86 |
| Nil the slice + debug.FreeOSMemory() when buffer ≥ 100 MiB | s3proxy/internal/router/object.go:32-46 (releaseLargeBuffer) wired into 5 hot sites (request body, plaintext, ciphertext, etc.) |

Future RAM work. Even with these patches the proxy still buffers entire bodies in RAM — peak is roughly 2× object_size × concurrent_requests. The real ceiling is removed by switching to a streaming AEAD construction, processing 64 KiB – 1 MiB chunks directly between client and upstream. That work is tracked as PR E in #43. It is a breaking on-disk format change (streaming AEAD ciphertext layout is not interchangeable with the current one-shot AES-GCM-SIV), so it needs a migration design before any code lands.

Closing this PR as integrated. The signal in your benchmark is what motivated the rewrite — really appreciated.

@ynsta ynsta closed this May 20, 2026