Add support for memory pages compression #2895

Open
rst0git wants to merge 17 commits into checkpoint-restore:criu-dev from rst0git:2026-02-17-pages-compression

@rst0git rst0git commented Feb 18, 2026

This pull request extends CRIU with support for LZ4 compression of
memory pages during dump, pre-dump, and restore. Memory pages are compressed individually and their compressed sizes are stored in the pagemap entry. During restore, the corresponding file offsets in the pages file are computed by adding the compressed sizes. This approach preserves support for optimizations such as iterative checkpointing, lazy pages, page server migration, and image streaming.
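
The offset computation described above can be sketched as a running sum over the per-page compressed sizes. This is only an illustration, not CRIU's code: `page_offsets` and `base` are hypothetical names, and the sizes are made-up values.

```python
def page_offsets(compressed_sizes, base=0):
    """Return the file offset of each page's payload in the pages file.

    Hypothetical helper: `base` is where this pagemap entry's data starts;
    every page begins where the previous one ended.
    """
    offsets = []
    offset = base
    for size in compressed_sizes:
        offsets.append(offset)
        offset += size  # an entry of size 0 occupies no bytes in the file
    return offsets


sizes = [120, 0, 4096, 310]  # made-up per-page compressed sizes
offsets = page_offsets(sizes)
print(offsets)  # prints [0, 120, 120, 4216]
```

Because only sizes are stored, locating page `i` needs the sizes of all earlier pages in the entry, but no separate offset table.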

Before compression, zero-filled pages are detected and skipped entirely, while memory pages with low compression ratio are stored raw to avoid unnecessary decompression overhead on restore. When compression is used with the page server, pages are compressed before being sent over the network to reduce the amount of data transferred during live migration.

On restore, a helper daemon handles decompression since the PIE restorer cannot link against external libraries like LZ4. In the future, this daemon can be extended to support decryption of memory pages as well.

This pull request also includes a compress/decompress extension of the CRIT tool for converting checkpoint images offline, a benchmark tool for measuring compression performance and storage impact, and ZDTM tests covering various memory page content and mapping combinations.

@mihalicyn mihalicyn self-requested a review February 19, 2026 12:48
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch from 5c86e95 to 50b748b Compare March 15, 2026 02:14
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch 6 times, most recently from ee2618a to 7ac3c61 Compare March 15, 2026 08:51
@rst0git rst0git marked this pull request as ready for review March 15, 2026 08:52
Copilot AI review requested due to automatic review settings March 15, 2026 08:52

Copilot AI left a comment

Pull request overview

This PR adds optional LZ4-based compression for memory pages images (pages.img) in CRIU, recording per-page compressed sizes in the pagemap image so restore can locate and decompress pages correctly (including for streaming, pre-dump chains, and page-server flows).

Changes:

  • Introduces --compress/RPC support and persists the setting in inventory images.
  • Extends pagemap images with compressed_size[] and total_compressed_size, and updates dump/restore page I/O paths (including a helper daemon for PIE restore).
  • Updates ZDTM and CI scripts to exercise compressed dumps/restores.

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| criu/compression.c | New LZ4 compression/decompression + helper daemon for restore-time decompression. |
| criu/include/compression.h | Public compression API and page-size bound macro. |
| criu/include/cr_options.h | Adds pages_compression option. |
| criu/config.c | Adds -c/--compress option parsing. |
| criu/cr-service.c | Wires RPC option to enable compression. |
| criu/crtools.c | Adds CLI help text for --compress (under CONFIG_LZ4). |
| criu/page-xfer.c / criu/include/page-xfer.h | Implements compressed write path (local + page-server receive side buffering). |
| criu/pagemap.c / criu/include/pagemap.h | Implements compressed read paths (local + streaming) and carries compressed metadata into restorer args. |
| criu/mem.c | Starts helper daemon and passes pipe fds to restorer. |
| criu/pie/restorer.c / criu/include/restorer.h | Adds compressed restore path via pipe protocol to helper daemon. |
| criu/cr-restore.c | Fixes up restorer pointers for compressed_size arrays. |
| criu/image.c | Persists compression setting in inventory.img and enables it on restore when present. |
| images/pagemap.proto | Adds compressed_size[] and total_compressed_size. |
| images/inventory.proto | Adds pages_compression to inventory entry. |
| images/rpc.proto | Adds RPC compress boolean option. |
| criu/unittest/unit.c / criu/Makefile* | Adds unit test coverage and build integration for compression module. |
| test/zdtm.py / scripts/ci/run-ci-tests.sh | Adds --compress wiring and CI test runs. |
| Makefile.config / dependency scripts | Adds LZ4 feature detection and distro package dependencies. |
| Documentation/criu.txt | Documents --compress. |
| contrib/criu-compression-benchmark.py | Adds benchmarking script for compression impact. |

@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch 2 times, most recently from 3ad2776 to da288db Compare March 17, 2026 11:15
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch from da288db to 8b1ef89 Compare April 1, 2026 15:08

codecov-commenter commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 35.79545% with 791 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.59%. Comparing base (b0b89a9) to head (4889093).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| criu/pagemap.c | 17.11% | 397 Missing ⚠️ |
| criu/compression.c | 11.29% | 267 Missing ⚠️ |
| criu/page-xfer.c | 81.96% | 44 Missing ⚠️ |
| criu/mem.c | 0.00% | 28 Missing ⚠️ |
| criu/config.c | 47.72% | 23 Missing ⚠️ |
| criu/cr-service.c | 9.52% | 19 Missing ⚠️ |
| criu/cr-restore.c | 0.00% | 6 Missing ⚠️ |
| criu/cr-dump.c | 42.85% | 4 Missing ⚠️ |
| criu/image.c | 83.33% | 3 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##           criu-dev    #2895      +/-   ##
============================================
- Coverage     57.26%   56.59%   -0.67%     
============================================
  Files           154      156       +2     
  Lines         40444    41666    +1222     
  Branches       8866     9146     +280     
============================================
+ Hits          23161    23582     +421     
- Misses        17019    17820     +801     
  Partials        264      264              

@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch from 8b1ef89 to 6682d0a Compare April 14, 2026 15:16
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch 2 times, most recently from 6a5785d to ab5bd78 Compare May 1, 2026 13:01
Comment thread criu/page-xfer.c
total_blocks = nr_pages;
}

if ((uint64_t)total_blocks * sizeof(uint32_t) > SIZE_MAX / 2) {
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch from ab5bd78 to 7dee7b2 Compare May 1, 2026 13:49
    try:
        with open(p, "rb") as f:
            chunks.append(f.read())
    except OSError:
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch 3 times, most recently from 728fcd2 to 450cc01 Compare May 1, 2026 15:47
rst0git added 17 commits May 13, 2026 09:39
Add build system plumbing for LZ4 compression. When liblz4 is found
via pkg-config, CONFIG_LZ4 is defined and the library is linked.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add the protobuf fields used to encode memory page compression both
in images and on the wire.

- (inventory) uint32 compress: compression mode for the dump,
  encoded with enum compress_mode values: 0 = off, 1 = per-page,
  2 = region. Lets the restore side detect and reproduce the
  compression encoding automatically.

- (pagemap) repeated uint32 compressed_size: per-block compressed
  size array. Each value is the number of bytes the compressed
  block occupies in the pages image. In per-page mode each block
  is one page; in region mode each block covers up to region_pages
  consecutive pages. Sentinel values: 0 = all-zero block (no
  payload is stored), block bytes = stored raw (no decompression
  needed), anything else = LZ4-compressed block of that size.

- (pagemap) uint64 total_compressed_size: sum of compressed_size[].
  Used to size the read in one pread(); uint64 is needed because a
  single pagemap entry can cover millions of pages and the sum can
  exceed 4 GiB.

- (pagemap) uint32 region_pages: number of pages per compressed
  block in region mode. Absent or 0 means per-page compression.

- (rpc) uint32 compress: same encoding as the inventory field.
- (rpc) uint32 compress_acceleration: LZ4 acceleration value.
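
The sentinel encoding of compressed_size[] described above can be sketched as follows. This is an illustration under assumptions: `classify_blocks` is a hypothetical helper, a 4 KiB page size is assumed, and the last block of an entry is assumed to be the only one that may cover fewer than region_pages pages.

```python
PAGE_SIZE = 4096  # assumed page size


def classify_blocks(compressed_size, nr_pages, region_pages=0):
    """Return (kind, pages_in_block, stored_bytes) for each block of an entry.

    region_pages == 0 means per-page mode (one page per block).
    """
    pages_per_block = region_pages or 1
    remaining = nr_pages
    blocks = []
    for size in compressed_size:
        n = min(pages_per_block, remaining)  # the last block may be shorter
        block_bytes = n * PAGE_SIZE
        if size == 0:
            kind = "zero"  # all-zero block, no payload stored
        elif size == block_bytes:
            kind = "raw"   # stored verbatim, no decompression needed
        else:
            kind = "lz4"   # LZ4-compressed payload of `size` bytes
        blocks.append((kind, n, size))
        remaining -= n
    return blocks


per_page = classify_blocks([0, 4096, 900], nr_pages=3)
region = classify_blocks([16384, 100], nr_pages=5, region_pages=4)
total_compressed_size = sum(size for _, _, size in per_page)
```

The sum also shows why total_compressed_size is 64-bit: 2**20 raw pages of 4096 bytes already add up to 2**32 bytes, one more than a uint32 can hold.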

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add compression.h with the public helpers used by the dump and
restore paths:

- compress_data() / decompress_data(): per-page LZ4 round-trip.
- compress_region() / decompress_region(): multi-page region LZ4
  round-trip with built-in zero-region detection (returns 0) and a
  store-raw fallback (returns block_bytes) when the region does
  not compress below REGION_COMPRESSION_THRESHOLD.
- page_is_all_zero(): fast zero-page detection using unsigned long
  comparison, mirroring is_folio_zero_filled() in the kernel.
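
A word-at-a-time zero check along the lines of page_is_all_zero() could look like the sketch below. It is a Python analogue for illustration only, not the C helper from this commit; `is_all_zero` is a hypothetical name and the buffer length is assumed to be a multiple of 8 bytes.

```python
PAGE_SIZE = 4096  # assumed page size


def is_all_zero(page):
    """Return True if every byte of `page` is zero.

    The buffer is viewed as 64-bit words so each comparison covers
    8 bytes; any() stops at the first non-zero word.
    """
    words = memoryview(page).cast("Q")  # needs len(page) % 8 == 0
    return not any(words)


zero_page = bytes(PAGE_SIZE)
dirty_page = bytearray(PAGE_SIZE)
dirty_page[-1] = 1  # a single non-zero byte at the very end
```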

The header also exports:

- enum compress_mode { COMPRESS_OFF, COMPRESS_PER_PAGE,
  COMPRESS_REGION }.
- PAGE_COMPRESSED_SIZE_BOUND, REGION_COMPRESSED_SIZE_BOUND(n_pages)
  -- LZ4 worst-case output size for one page or for a region of
  n_pages pages.
- PAGE_COMPRESSION_THRESHOLD,
  REGION_COMPRESSION_THRESHOLD(region_bytes) -- store-raw thresholds.
- LZ4_DEFAULT_ACCELERATION, LZ4_MAX_ACCELERATION.
- MAX_REGION_PAGES (1024), DEFAULT_REGION_PAGES (64).
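
For orientation, the worst-case sizes can be estimated with liblz4's documented bound, LZ4_COMPRESSBOUND(n) = n + n/255 + 16. Whether PAGE_COMPRESSED_SIZE_BOUND and REGION_COMPRESSED_SIZE_BOUND expand to exactly this formula is an assumption; the helper names and the 4 KiB page size below are illustrative.

```python
PAGE_SIZE = 4096           # assumed page size
MAX_REGION_PAGES = 1024    # from the commit message
DEFAULT_REGION_PAGES = 64  # from the commit message


def lz4_compress_bound(n):
    """Worst-case LZ4 output size for n input bytes (LZ4_COMPRESSBOUND)."""
    return n + n // 255 + 16


def region_bound(n_pages):
    """Assumed worst-case output size for a region of n_pages pages."""
    return lz4_compress_bound(n_pages * PAGE_SIZE)


page_bound = lz4_compress_bound(PAGE_SIZE)
default_region_bound = region_bound(DEFAULT_REGION_PAGES)
print(page_bound, default_region_bound)  # prints 4128 263188
```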

Stubs are provided for builds without CONFIG_LZ4.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add CLI options to enable memory page compression and the
corresponding feature check used to gate the LZ4 build.

  -c, --compress              enable per-page LZ4 compression
  --compress-region SIZE      enable region LZ4 compression with
                              the given region size; SIZE accepts
                              K/M/G suffixes (e.g. 256K, 1M)
  --compress-acceleration N   LZ4 acceleration; implies --compress
                              if no other mode is set
  criu check --feature compress

The selected mode is stored in opts.compress_mode (enum compress_mode
value) and persisted in the inventory image so that the restore
side detects the encoding automatically. When CRIU is built without
CONFIG_LZ4, the option is rejected early in check_options() with a
clear error message. --compress-region is also rejected when used
with --page-server or --stream, because those wire formats are
per-page only.
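
Parsing the SIZE argument of --compress-region might look like the sketch below. It is not the code in criu/config.c: `parse_region_size` is a hypothetical name, and the 4 KiB page size, the page-alignment requirement, and the check against MAX_REGION_PAGES are assumptions made for illustration.

```python
PAGE_SIZE = 4096         # assumed page size
MAX_REGION_PAGES = 1024  # from the compression.h commit message


def parse_region_size(text):
    """Convert a size such as '256K' or '1M' into a page count."""
    units = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
    text = text.strip().upper()
    multiplier = 1
    if text and text[-1] in units:
        multiplier = units[text[-1]]
        text = text[:-1]
    if not text.isdecimal():
        raise ValueError("region size must be a number with optional K/M/G")
    size = int(text) * multiplier
    if size == 0 or size % PAGE_SIZE:
        raise ValueError("region size must be a non-zero multiple of the page size")
    pages = size // PAGE_SIZE
    if pages > MAX_REGION_PAGES:
        raise ValueError("region size exceeds MAX_REGION_PAGES")
    return pages


print(parse_region_size("256K"), parse_region_size("1M"))  # prints 64 256
```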

The RPC interface accepts the same options via the compress,
compress_acceleration and compress_region_size fields.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add the local-image and page-server write paths for memory page
compression.

write_pagemap_loc_compressed() and write_pages_loc_compressed()
buffer per-block compressed sizes into pending_pe and flush a
PagemapEntry once all blocks of an iovec have been compressed. The
loop body is parameterised on pending_pe.region_pages: when 0,
each page is compressed independently; when non-zero, pages are
accumulated into regions of region_pages and compressed as a
single LZ4 block. Zero pages and zero regions are stored with
compressed_size=0 (no image payload); blocks that do not compress
below the 7/8 store-raw threshold are written verbatim.
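
The three outcomes for a block (zero, compressed, raw) can be sketched as below. To stay within the Python standard library, zlib stands in for LZ4 here; that substitution, the helper name `encode_block`, and the exact comparison used for the 7/8 threshold are assumptions, not the code in page-xfer.c.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed page size


def encode_block(block):
    """Return (compressed_size, payload) for one page or region."""
    if not any(block):
        return 0, b""                  # zero block: nothing is written
    packed = zlib.compress(block, 1)   # stand-in for the LZ4 call
    if len(packed) * 8 >= len(block) * 7:
        return len(block), block       # not below 7/8: store verbatim
    return len(packed), packed


zero_size, _ = encode_block(bytes(PAGE_SIZE))
raw_size, _ = encode_block(os.urandom(PAGE_SIZE))        # incompressible
small_size, _ = encode_block(b"ab" * (PAGE_SIZE // 2))   # repeating pattern
```

A reader can tell the cases apart from the size alone: 0 means zero block, the full block size means raw, anything else is a compressed payload.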

For the page server, add PS_IOV_ADD_F_COMPRESSED and
write_pages_to_server_compressed(): pages are compressed before
being sent over the network and the receiver writes the
compressed bytes to the local image without re-compressing.

write_fd_full() handles short writes on the pages image.
close_page_xfer() frees pending_pe.compressed_size on error paths;
it is initialised to NULL so the unused-branch close is a no-op.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add comments to the page_read function pointers and data fields.
No functional changes.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
The PIE restorer cannot link against LZ4, so a helper daemon
process handles decompression. The daemon is forked in
prepare_vma_ios() and communicates with the restorer over a pair
of pipes.

Wire protocol header (struct pipe_hdr):

  pid_t    remote_pid;
  off_t    offs;                /* file offset in pages.img    */
  uint64_t total_compressed_size;
  int      n_pages;             /* total pages in request      */
  int      nr_iovs;             /* number of destination iovecs */
  int      n_blocks;            /* count of compressed_size[]  */
  uint32_t region_pages;        /* 0 = per-page, >0 = region   */

After the header come compressed_size[n_blocks]; in region mode the
daemon then reads block_pages[n_blocks] (uint16 per block) giving
each block's actual page count (the last block of an entry may be
shorter than region_pages). The remote-destination iovs[nr_iovs]
follow last.
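
The message layout above (header, compressed_size[], optional block_pages[], then iovs[]) can be illustrated with Python's struct module. This sketch uses a packed little-endian approximation of struct pipe_hdr, while the real C struct has native alignment and padding; 64-bit iovec fields and the `pack_request`/`unpack_request` helpers are assumptions.

```python
import struct

# pid_t, off_t, uint64_t, int, int, int, uint32_t (packed, no padding)
HDR = struct.Struct("<iqQiiiI")
IOVEC = struct.Struct("<QQ")  # iov_base, iov_len as 64-bit values


def pack_request(pid, offs, compressed_size, iovs, n_pages,
                 region_pages=0, block_pages=None):
    n_blocks = len(compressed_size)
    msg = HDR.pack(pid, offs, sum(compressed_size), n_pages,
                   len(iovs), n_blocks, region_pages)
    msg += struct.pack("<%dI" % n_blocks, *compressed_size)
    if region_pages:  # per-block page counts exist only in region mode
        msg += struct.pack("<%dH" % n_blocks, *block_pages)
    for base, length in iovs:
        msg += IOVEC.pack(base, length)
    return msg


def unpack_request(msg):
    (pid, offs, total, n_pages,
     nr_iovs, n_blocks, region_pages) = HDR.unpack_from(msg, 0)
    pos = HDR.size
    sizes = list(struct.unpack_from("<%dI" % n_blocks, msg, pos))
    pos += 4 * n_blocks
    block_pages = None
    if region_pages:
        block_pages = list(struct.unpack_from("<%dH" % n_blocks, msg, pos))
        pos += 2 * n_blocks
    iovs = [IOVEC.unpack_from(msg, pos + i * IOVEC.size)
            for i in range(nr_iovs)]
    return {"pid": pid, "offs": offs, "total": total, "n_pages": n_pages,
            "region_pages": region_pages, "sizes": sizes,
            "block_pages": block_pages, "iovs": iovs}


request = pack_request(1234, 8192, [100, 0, 4096],
                       [(0x7F0000000000, 9 * 4096)], n_pages=9,
                       region_pages=4, block_pages=[4, 4, 1])
decoded = unpack_request(request)
```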

The daemon reads compressed data with a single pread() per request,
decompresses block-by-block (one page in per-page mode, up to
region_pages pages in region mode), and writes the result into the
target process via process_vm_writev(). Zero pages are not written
at all; the target process VMAs are MAP_ANONYMOUS, so unwritten
pages remain on the kernel zero page and do not consume physical
memory.

The decompression buffer is mmap(MAP_ANONYMOUS) with MADV_HUGEPAGE
to enable the fast GUP path in process_vm_writev() and to reduce
TLB misses. MADV_DONTNEED re-zeros the buffer between requests.
posix_fadvise(FADV_DONTNEED) is called after each batch read to
release page cache for already-read compressed data.

Per-block compressed sizes (and per-block page counts in region
mode) are validated against the corresponding bounds before use to
prevent out-of-bounds reads from corrupted images. Negative
n_pages/nr_iovs/n_blocks values are rejected. The
process_vm_writev() iovec count is capped at IOV_MAX per call.

Pipe I/O uses pipe_write_full()/pipe_read_full() in the PIE restorer
and read_full() in the daemon to handle short reads and writes on
pipe buffer boundaries.
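
The short-read and short-write handling mentioned above follows a common loop pattern. The sketch below shows it in Python on an ordinary pipe; it is an analogue of the idea, not the C helpers in this commit, and treating EOF before the requested length as an error is an assumption.

```python
import os


def read_full(fd, n):
    """Read exactly n bytes from fd, looping over short reads."""
    buf = bytearray()
    while len(buf) < n:
        chunk = os.read(fd, n - len(buf))
        if not chunk:  # EOF before the full message arrived
            raise EOFError("expected %d bytes, got %d" % (n, len(buf)))
        buf += chunk
    return bytes(buf)


def write_full(fd, data):
    """Write all of data to fd, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


rfd, wfd = os.pipe()
write_full(wfd, b"header")
write_full(wfd, b"payload")
message = read_full(rfd, 13)  # may take more than one os.read() call
os.close(wfd)
os.close(rfd)
```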

The daemon PID is stored in decompress_daemon_pid in
task_restore_args instead of appending to the helpers array, which
would corrupt the array built by collect_helper_pids(). The
restorer waits for the daemon explicitly after closing the pipes.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add maybe_read_page_local_compressed() and
maybe_read_page_img_streamer_compressed() for restoring compressed
pages from local images and streaming pipes respectively.

Both readers fall back to the uncompressed path when a pagemap
entry has no compressed_size array, which happens with shared
memory pagemaps or entries from uncompressed parent images. They
also dispatch on pe->region_pages: per-page mode uses
read_compressed_pages(), which decompresses page-by-page directly
into the destination buffer; region mode uses
read_compressed_pages_region(), which decompresses an entire block
(up to region_pages pages) into a heap scratch buffer and copies
the requested page slice into the destination iovec, supporting
partial-region reads via an in-block cursor (region_block_offset).

skip_pagemap_pages() advances pi_off by summing per-block
compressed sizes; in region mode it walks block-by-block and
keeps region_block_offset consistent so partial-region skips
remain correct.
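
One way to picture the region-mode skip is a cursor made of a file offset, a block index, and a page offset inside the current block. The sketch below is a guess at the bookkeeping, not the code in pagemap.c: the `skip_pages` helper, the dictionary layout, and the rule that the offset advances only once a block is fully consumed are all assumptions.

```python
def skip_pages(state, n, compressed_size, block_pages):
    """Advance the cursor by n pages.

    state["pi_off"]    file offset of the current block's payload
    state["block"]     index of the current block
    state["block_off"] pages already consumed inside that block
    """
    while n > 0:
        if state["block"] >= len(block_pages):
            raise ValueError("skip past the end of the pagemap entry")
        pages_in_block = block_pages[state["block"]]
        step = min(n, pages_in_block - state["block_off"])
        state["block_off"] += step
        n -= step
        if state["block_off"] == pages_in_block:
            # Block fully consumed: move past its payload in the file.
            state["pi_off"] += compressed_size[state["block"]]
            state["block"] += 1
            state["block_off"] = 0
    return state


sizes = [500, 0, 300]  # made-up compressed sizes, one per block
pages = [4, 4, 2]      # pages per block; the last block is shorter
cursor = {"pi_off": 0, "block": 0, "block_off": 0}
skip_pages(cursor, 3, sizes, pages)  # stops inside block 0
skip_pages(cursor, 2, sizes, pages)  # crosses into block 1
```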

Per-block compressed sizes are validated against
PAGE_COMPRESSED_SIZE_BOUND or REGION_COMPRESSED_SIZE_BOUND(n_pages)
as appropriate. Zero blocks (compressed_size=0) are restored with
memset. The pread() calls loop to handle short reads.

The PR_ASYNC flag is supported. Compressed reads are enqueued via
pagemap_enqueue_iovec(); coalescing requires matching region_pages
between piovs. process_async_reads() reads all compressed data in
one pread() call and decompresses block-by-block into the
destination iovecs, with a direct-into-iovec fast path in region
mode when a block fits inside a single destination slot.

posix_fadvise(FADV_SEQUENTIAL) is applied to the pages image fd to
hint the kernel for aggressive readahead.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
process_async_reads() allocates a single buffer for all compressed
data in a piov batch. When pages coalesce into one giant piov (common
with large GPU checkpoints), the buffer can exceed host memory.

For example, a checkpoint of LLaMA 3.1-8B running on an A100-SXM4-80GB
holds 77 GB of memory and produces ~72 GiB of compressed data. Without
this patch, restore would need 72 GiB for the decompression buffer plus
77 GiB of premapped pages, 149 GiB in total. This can exceed host memory
and result in OOM during restore.

Cap compressed piov batches at 1 GiB of compressed data during
coalescing in pagemap_enqueue_iovec(). Larger checkpoints split into
multiple batches, each allocating a bounded decompression buffer.
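
The effect of the cap can be sketched as a greedy split of consecutive entries into batches whose compressed bytes stay within the limit. This is an illustration only: `split_batches` is a hypothetical helper, and how pagemap_enqueue_iovec() treats a single entry larger than the cap is not modelled (here it simply gets a batch of its own).

```python
GIB = 1 << 30
MAX_BATCH_BYTES = 1 * GIB  # cap taken from the commit message


def split_batches(entry_sizes, limit=MAX_BATCH_BYTES):
    """Group consecutive compressed sizes into batches of at most `limit` bytes."""
    batches, current, current_bytes = [], [], 0
    for size in entry_sizes:
        if current and current_bytes + size > limit:
            batches.append(current)  # stop coalescing, start a new batch
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


half = GIB // 2
batches = split_batches([half] * 5)
peak_buffer = max(sum(batch) for batch in batches)
```

With the cap, the peak buffer for compressed data is bounded by the limit instead of growing with the checkpoint size.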

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Wire up the memory page compression options through the zdtm test
framework for both CLI and RPC modes:

  -c, --compress
  --compress-region SIZE        (K/M/G suffix accepted)
  --compress-acceleration N

The page-count validation auto-detects compression from the test
descriptor opts, so the flags work whether they come from the CLI
or from a .desc file.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add zdtm tests that verify memory page content after
checkpoint/restore with compression, in both per-page and region
modes:

compress_pages00 / compress_pages_region00: single process with
zero-filled pages, compressible pattern pages, and incompressible
random pages. Exercises all three compression outcomes (zero-skip,
LZ4 compressed, raw fallback).

compress_pages01 / compress_pages_region01: parent/child process
tree with copy-on-write pages. Parent fills 64 pages, child
modifies 16 of them. After restore, both parent and child verify
their respective views byte-by-byte.

compress_pages02 / compress_pages_region02: eight different
mapping types in a parent/child tree -- MAP_PRIVATE anonymous
(data and zeros), MAP_SHARED anonymous, private and shared
file-backed, memfd shared, read-only (PROT_READ after mprotect),
and PROT_NONE guard page adjacent to a data page.

The compress_pages_region* siblings share C source with the
per-page tests (via symlinks) and differ only in their .desc opts
string. All tests use the compress feature check to auto-skip when
CRIU is built without LZ4. The .desc files set --compress (-c) or
--compress-region=256K so compression is always active and the
tests run with --pre, --page-server, --lazy-pages, --stream, etc.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add compression test coverage to the CI script:
- iterative checkpointing with compression
- iterative + dedup, iterative + page-server
- compress_pages tests in basic, iterative, page-server, dedup, and
  lazy-pages modes
- streaming tests with compress_pages
- mixed-compression parent chain test

Add test/others/compress-mixed/ which tests mixed-compression parent
chains: two uncompressed pre-dumps followed by a compressed final dump,
then restore. This exercises the per-entry fallback in the compressed
reader when parent pagemap entries have no compressed_size array.

Add shellcheck coverage for test/others/compress-mixed/.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add a Python benchmark that measures the storage and performance
impact of memory page compression across the configured modes and
data patterns.

Layout:
  contrib/compression-benchmark/main.py     -- driver and reporter
  contrib/compression-benchmark/workload.py -- pattern generators;
                                               also runs as the
                                               long-lived workload
                                               process under criu.

main.py imports workload.fill_pattern() so the SHA-256 the driver
expects after restore is computed from the same code that wrote
the bytes inside the workload, avoiding any drift between the two
sides of the integrity check.

Sweeps compression mode (none / per-page / region) and, for region
mode, region size (default 64 K, 256 K, 1 M). Workload patterns:
zero (highly compressible), mixed (50% zero / 25% repeating /
25% random), random (incompressible), text (JSON-shaped), elf
(concatenated system binaries). Reports compression ratio, dump
and restore latency (median with interquartile range), throughput,
and CRIU stats counters; validates memory integrity via SHA-256
across each restore.

Usage:
  sudo python3 contrib/compression-benchmark/main.py
  sudo python3 contrib/compression-benchmark/main.py \
       -p mixed text elf --modes none per-page region \
       --region-sizes 65536 262144 1048576 --json out.json

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add offline tools to convert checkpoint images between compressed and
uncompressed formats:

  crit compress <dir>     -- compress memory pages with LZ4
  crit decompress <dir>   -- decompress memory pages

By default, original files are backed up as .bak. Use --in-place to
skip backups. The --acceleration flag controls the LZ4 speed/ratio
trade-off.

Requires the Python lz4 package (optional dependency, added to all
package manager dependency lists). When lz4 is not installed, other
crit commands work normally and the compress/decompress commands
print install instructions.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Add five tests covering compress and decompress round-trips using
compress_pages02 which exercises all eight mapping types (anonymous,
zeros, shared, file-backed, memfd, read-only, guard pages).

- compressed dump, decompress with crit, restore and verify
- uncompressed dump, compress with crit, restore and verify
- compress already compressed, decompress already decompressed
- compress, decompress, compress, verify pages are identical
- decompress, compress, decompress, verify pages are identical

Each restore runs the test process which verifies all memory regions
byte-by-byte. The round-trip tests also compare md5 checksums of the
raw pages data across cycles.

When lz4 or CRIU compression support is not available, the tests are
skipped gracefully.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
Test compress_data() / decompress_data() with zero-filled,
repeating pattern, pseudo-random, and single-byte pages across
three LZ4 acceleration levels.

Test compress_region() / decompress_region() with the same
patterns at region sizes {16, 64, 256} pages and acceleration
levels {1, 4, 32}, including an "all zeros except one non-zero
page" case to exercise the zero pre-pass fast path and per-page
zero detection inside the decompression result.

Also test page_is_all_zero() edge cases.

Signed-off-by: Radostin Stoyanov <rstoyanov@fedoraproject.org>
@rst0git rst0git force-pushed the 2026-02-17-pages-compression branch from 450cc01 to 4889093 Compare May 13, 2026 11:28