bench: add criterion benchmark suite and CI workflow#10444
Conversation
Add criterion benchmarks for Sprout JoinSplit Groth16 proof verification in zebra-consensus. Measures single and unbatched verification at batch sizes 2–64, plus input preparation costs (primary_inputs computation and item creation). Uses cycled items from mainnet test blocks since verification cost is constant per proof regardless of content.
Add criterion benchmarks for Orchard Halo2 proof verification in zebra-consensus. Extracts real Orchard bundles from NU5 mainnet test blocks and measures single and unbatched verification at batch sizes 2-32. Only exercises verify_single() since Item fields and the batch trait are private.
Add criterion benchmarks for Sapling shielded data verification in zebra-consensus. Extracts real Sapling bundles from mainnet test blocks (28 items) and measures both unbatched (one-item batch per bundle) and true batch verification at sizes 2-64. Batch verification shows ~5x speedup at 64 bundles, validating the batching architecture.
Add criterion benchmarks for transaction deserialization and serialization across all five Zcash transaction versions (V1-V5). Extracts real transactions from mainnet test blocks at the appropriate network upgrade heights. V5 deserialization is notably slower than V1-V4 due to consensus branch ID validation and Orchard field parsing.
Adds a workflow_dispatch workflow that runs the full benchmark suite using cargo-criterion and stores results via github-action-benchmark on the gh-pages branch for historical tracking. Features:
- Selective benchmarks via comma-separated input (or 'all')
- Configurable regression alert threshold
- Step summary table visible in the Actions UI
- Converts cargo-criterion JSON to customSmallerIsBetter format
When the "C-benchmark" label is added to a PR, runs all benchmarks on both the base and PR branches, then posts a critcmp comparison table as a PR comment. Updates the existing comment on re-runs.
- Pin github-action-benchmark to SHA - Move GitHub context expressions to env blocks to prevent injection - Remove unused env.BENCH_COMMAND - Apply cargo fmt to benchmark files
Remove diagnostic eprintln! calls from benchmark files (flagged by clippy print_stderr lint) and remove the resulting unused total_actions variable in halo2.rs. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Follow-up: Additional Benchmark Candidates
Analysis of the sync hot path identified several operations that are not yet benchmarked. These are candidates for follow-up PRs, prioritized by expected impact on sync time.
High Priority
Medium Priority
Easiest Next Steps
Equihash and sighash are the most straightforward to add — they are pure computation with no database or FFI setup required, and can reuse the existing test vector pattern from the current benchmarks. Script validation and UTXO lookups are the biggest real-world sync bottlenecks but require more setup (FFI + previous outputs for scripts, populated RocksDB for UTXOs).
Motivation
Zebra had minimal benchmarking coverage (only block serialization and RedPallas signatures). This makes it difficult to evaluate PRs for performance regressions or improvements — for example, PR #10436 removes groth16 abstractions but had no way to verify it was performance-neutral.
Solution
Adds a comprehensive benchmark suite covering Zebra's most expensive code paths, plus CI automation for tracking and comparison.
New benchmarks
- `zebra-consensus/benches/groth16.rs` — Sprout JoinSplit proof verification, single and unbatched at sizes 2-64, plus input preparation cost
- `zebra-consensus/benches/halo2.rs` — Orchard proof verification, single and unbatched at sizes 2-32
- `zebra-consensus/benches/sapling.rs` — both unbatched and true batch verification at sizes 2-64 using `sapling_crypto::BatchValidator`
- `zebra-chain/benches/transaction.rs` — per-version (V1-V5) deserialization and serialization from real mainnet blocks

CI workflow (`.github/workflows/benchmarks.yml`)
- `workflow_dispatch`: runs all benchmarks, stores results on `gh-pages` via `github-action-benchmark`, and generates a summary table in the Actions UI
- Adding the `C-benchmark` label to a PR runs benchmarks on both the base and PR branches and posts a `critcmp` comparison table as a PR comment

Benchmark dashboard
Adds a step to `book.yml` that copies benchmark data from `gh-pages` into the docs output, making the historical trend chart available at zebra.zfnd.org/dev/bench/.

Test data
All benchmarks use real transactions from the hardcoded mainnet blocks in `zebra-test`. Items are cycled (repeated) to fill larger batch sizes — this is valid because cryptographic verification cost is constant per proof regardless of specific bytes. This limitation is documented in each benchmark file.

Alternatives considered
Hosted services such as Bencher or Codspeed could replace `github-action-benchmark` and `critcmp` with a more sophisticated solution. We chose `github-action-benchmark` + `critcmp` as a lightweight starting point with no external dependencies. Migration to Bencher or Codspeed can be evaluated as the suite matures and if CI runner variance becomes a problem.

Test plan
- Ran all benchmarks locally (`cargo bench`, `cargo criterion --message-format=json`)
- Ran the workflow on the `benches` branch (successful run)
- Verified results are stored on the `gh-pages` branch
- Used `critcmp` to evaluate PR "cleanup(zebra-consensus): remove remaining groth16 abstractions" #10436 — found zero regressions and an 8.65x V5 deserialization improvement

AI disclosure
Used Claude Code for benchmark implementation, CI workflow development, and testing.