
core/vm: port GEVM performance optimisations#2143

Open
kamuikatsurgi wants to merge 28 commits into develop from kamui/port-gevm-perf-optimisations

Conversation

@kamuikatsurgi
Member

@kamuikatsurgi kamuikatsurgi commented Mar 17, 2026

Description

  • Fixed-size [1024]uint256.Int array stack replacing slice-based stack, eliminating append/slice-header overhead on push/pop
  • Switch-dispatch fast path (runSwitch) with inlined hot opcodes and gas accumulation, gated behind --switch-dispatch flag
  • Code generator (gen_dispatch/) to produce interpreter_dispatch.go
  • Wire EnableSwitchDispatch through CLI, ethconfig, and blockchain to both block import and block production paths

@codecov

codecov bot commented Mar 17, 2026

Codecov Report

❌ Patch coverage is 66.16400% with 359 lines in your changes missing coverage. Please review.
✅ Project coverage is 52.04%. Comparing base (256bc94) to head (4df1ff3).
⚠️ Report is 2 commits behind head on develop.

Files with missing lines Patch % Lines
core/vm/gen_dispatch/main.go 0.00% 277 Missing ⚠️
core/vm/interpreter_dispatch.go 89.08% 58 Missing and 21 partials ⚠️
miner/worker.go 75.00% 2 Missing ⚠️
core/vm/stack.go 96.55% 1 Missing ⚠️

❌ Your patch check has failed because the patch coverage (66.16%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2143      +/-   ##
===========================================
+ Coverage    51.51%   52.04%   +0.52%     
===========================================
  Files          882      884       +2     
  Lines       154081   155161    +1080     
===========================================
+ Hits         79377    80750    +1373     
+ Misses       69526    69174     -352     
- Partials      5178     5237      +59     
Files with missing lines Coverage Δ
core/vm/interpreter.go 59.76% <100.00%> (+4.82%) ⬆️
eth/backend.go 52.27% <100.00%> (+0.06%) ⬆️
eth/ethconfig/config.go 78.94% <ø> (ø)
internal/cli/server/config.go 65.03% <100.00%> (+0.05%) ⬆️
internal/cli/server/flags.go 100.00% <100.00%> (ø)
core/vm/stack.go 96.61% <96.55%> (+55.23%) ⬆️
miner/worker.go 72.49% <75.00%> (+3.60%) ⬆️
core/vm/interpreter_dispatch.go 89.08% <89.08%> (ø)
core/vm/gen_dispatch/main.go 0.00% <0.00%> (ø)

... and 34 files with indirect coverage changes



@0xPolygon 0xPolygon deleted a comment from claude bot Mar 17, 2026
@0xPolygon 0xPolygon deleted a comment from claude bot Mar 17, 2026
@claude

claude bot commented Mar 17, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@adamdossa
Contributor

@claude review

@claude

claude bot commented Mar 18, 2026

Claude finished @adamdossa's task in 3m 13s


Code Review: GEVM Performance Optimisations

  • Read and analyze core VM changes (stack, interpreter, dispatch)
  • Read and analyze code generator
  • Read and analyze config/wiring changes
  • Read and analyze tests
  • Post review findings

Overall this is a well-structured PR. The fixed-size stack, switch dispatch, and code generator are cleanly implemented. Operand ordering for all inlined opcodes (SUB, DIV, SDIV, ADDMOD, MULMOD, JUMPI, etc.) was verified against the original instructions.go implementations and is correct. The gas accumulator pattern with flushing at control-flow points is sound — unflushed gasAccum on error is safe because non-REVERT errors consume all gas in EVM.Call.

Below are the findings, ordered by severity:


1. Behavior change in miner/worker.go — block production now inherits full VM config (Medium)

miner/worker.go:2369-2373 — The new vmConfig() method copies the entire blockchain VM config:

func (w *worker) vmConfig() vm.Config {
    cfg := *w.chain.GetVMConfig()
    return cfg
}

Previously, block production used vm.Config{} (all fields zero/false). Now it inherits all fields: EnablePreimageRecording, StatelessSelfValidation, EnableWitnessStats, and crucially Tracer. This is likely more correct (the miner should respect the configured settings), but it's a semantic change beyond just wiring EnableSwitchDispatch.

Tracer safety: The !debug guard at interpreter.go:185 prevents runSwitch from being entered when a Tracer is set, so there's no correctness issue. But if --vmtrace is enabled, block production will now also trace, which could slow down block production. Consider whether this is the intended behavior, or if vmConfig() should explicitly clear the Tracer:

func (w *worker) vmConfig() vm.Config {
    cfg := *w.chain.GetVMConfig()
    cfg.Tracer = nil // block production should not trace
    return cfg
}



2. Memory footprint increase from fixed-size stack (Low — Informational)

core/vm/stack.go:40-43 — Each Stack is now ~32KB ([1024]uint256.Int = 1024 × 32 bytes + 8 bytes for top), allocated upfront regardless of actual usage. The old slice-based stack started small and grew dynamically.

With sync.Pool reuse this is acceptable in steady-state. Worst case with maximum EVM call depth (1024 nested calls), this is ~32MB of stack allocations. This is fine, but worth noting for operators monitoring memory usage.
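The figures above follow from simple arithmetic (a back-of-the-envelope sketch; `uint256.Int` is four `uint64` limbs, i.e. 32 bytes):

```go
package main

import "fmt"

func main() {
	const wordSize = 32                // uint256.Int: 4 × 8-byte limbs
	const stackBytes = 1024 * wordSize // fixed data array per Stack
	const maxDepth = 1024              // EVM call-depth limit

	fmt.Println(stackBytes)                        // 32768: ~32KB per stack
	fmt.Println(stackBytes * maxDepth / (1 << 20)) // 32: ~32MB at max call depth
}
```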


3. returnStack doesn't zero the data array (Low — By Design)

core/vm/stack.go:49-52 — Only top is reset, leaving stale data in the array:

func returnStack(s *Stack) {
    s.top = 0
    stackPool.Put(s)
}

This is safe because: (a) Data() returns st.data[:st.top] so stale elements are never exposed, and (b) push writes before incrementing top. Zeroing 32KB on every return would negate the performance gains. No action needed — just confirming this is intentional.
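The two safety properties can be demonstrated with a self-contained sketch (shrunken array size for illustration; `Data`, `push`, and `stackPool` echo the names above but the bodies are simplified):

```go
package main

import (
	"fmt"
	"sync"
)

// Stack mirrors the PR's fixed-array stack, scaled down for illustration.
type Stack struct {
	data [4]int
	top  int
}

func (s *Stack) push(v int) { s.data[s.top] = v; s.top++ }

// Data exposes only the live prefix of the array, never stale slots.
func (s *Stack) Data() []int { return s.data[:s.top] }

var stackPool = sync.Pool{New: func() any { return new(Stack) }}

func main() {
	s := stackPool.Get().(*Stack)
	s.push(1)
	s.push(2)
	s.top = 0 // returnStack: only the index is reset; 1 and 2 stay in data
	stackPool.Put(s)

	s2 := stackPool.Get().(*Stack) // may or may not be the recycled object
	fmt.Println(len(s2.Data()))    // 0: stale elements are not visible
	s2.push(9)                     // push overwrites the slot before exposing it
	fmt.Println(s2.Data())         // [9]
}
```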


4. Gas accumulator overflow analysis (Low — Verified Safe)

gasAccum is uint64. Between flushes (JUMP/JUMPI/JUMPDEST), the maximum per-opcode gas is GasSlowStep = 10. Since contract.Gas is bounded by the block gas limit (well under uint64 max), and gas is deducted from contract.Gas at flush points, gasAccum cannot overflow before the contract runs out of gas. No issue.
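A minimal sketch of the accumulate-then-flush pattern analyzed above (the `flush` helper and the per-opcode costs here are illustrative, not the PR's actual code; the real interpreter flushes at JUMP/JUMPI/JUMPDEST and on exit):

```go
package main

import (
	"errors"
	"fmt"
)

var errOutOfGas = errors.New("out of gas")

// flush deducts the accumulated gas in one step instead of per opcode.
func flush(contractGas, gasAccum *uint64) error {
	if *gasAccum > *contractGas {
		return errOutOfGas
	}
	*contractGas -= *gasAccum
	*gasAccum = 0
	return nil
}

func main() {
	contractGas := uint64(100)
	var gasAccum uint64

	// Straight-line opcodes only bump the accumulator; with at most 10 gas
	// per opcode (per the analysis above), gasAccum cannot approach uint64
	// overflow before contractGas is exhausted at the next flush point.
	gasAccum += 3 // e.g. ADD
	gasAccum += 5 // e.g. MUL
	gasAccum += 8 // a control-flow point triggers the flush

	if err := flush(&contractGas, &gasAccum); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(contractGas, gasAccum) // 84 0
}
```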


5. Code generator and generated code consistency (Nit)

The interpreter_dispatch.go has the correct // Code generated by gen_dispatch; DO NOT EDIT. header and //go:generate go run ./gen_dispatch/ directive in interpreter.go:19. Consider adding a CI check (or Makefile target) that re-runs go generate ./core/vm/ and verifies no diff, to catch cases where the generator is modified without regenerating.


6. SonarQube findings (Nit)

SonarQube flags 5% duplication (threshold 3%) and 1 security hotspot. The duplication is from the repetitive DUP1-16 and SWAP1-16 cases in the generated code — this is inherent to the switch-dispatch design and is the point (avoiding indirect calls). The security hotspot likely refers to the os.WriteFile in the generator, which only runs at code-gen time, not at runtime.


Summary

The core changes are sound. The operand semantics match the original instructions.go implementations, gas accounting is correct, and the switch dispatch is properly gated behind both the --switch-dispatch flag and the isShanghai && !isEIP4762 && !debug conditions. The main item worth discussing is finding #1 (whether vmConfig() should clear the Tracer for block production).

Comment on lines +2371 to 2377
	cfg := *w.chain.GetVMConfig()
	return cfg
}

// copyReceipts makes a deep copy of the given receipts.
func copyReceipts(receipts []*types.Receipt) []*types.Receipt {
	result := make([]*types.Receipt, len(receipts))


🔴 Bug: vmConfig() in miner/worker.go no longer clears the Tracer field when copying the blockchain's VM config. Commit a022b70 originally added cfg.Tracer = nil (with comment "tracing disabled, suitable for block production"), but commit 00f3fae ("chore: nits") accidentally removed it. When --vmtrace is configured, the live tracer leaks into block production, which sets debug=true and bypasses the new switch-dispatch fast path, negating the performance optimization this PR introduces.

Extended reasoning...

What the bug is

The vmConfig() method in miner/worker.go (lines 2371-2377) copies the blockchain's entire vm.Config struct but does not clear the Tracer field. This method is called in makeEnv, prepareWork, generateWork, and prefetchFromPool — all block production code paths.

How the Tracer gets set

When the --vmtrace CLI flag is configured, eth/backend.go (line 313) explicitly sets options.VmConfig.Tracer = t on the blockchain-level VM config:

if config.VMTrace != "" {
    t, err := tracers.LiveDirectory.New(config.VMTrace, traceConfig)
    // ...
    options.VmConfig.Tracer = t
}

This means w.chain.GetVMConfig().Tracer is non-nil whenever --vmtrace is active.

Git history confirms this is a regression

Commit a022b70 ("fix: use blockchain's vm.Config instead of nil") introduced the correct implementation:

// vmConfig returns blockchain's vm config with tracing disabled, suitable for block production.
func (w *worker) vmConfig() vm.Config {
    cfg := *w.chain.GetVMConfig()
    cfg.Tracer = nil
    return cfg
}

The very next commit 00f3fae ("chore: nits") removed the cfg.Tracer = nil line and simplified the comment, losing the intentional safety guard.

Impact on the switch-dispatch fast path

In core/vm/interpreter.go, the new switch-dispatch optimization is gated on !debug, where debug is evm.Config.Tracer != nil. If the tracer leaks through vmConfig(), then debug=true for all block production, and the fast path is never taken — silently negating the core performance improvement of this PR for any node running with --vmtrace.

Precedent in the codebase

The codebase already follows this pattern in core/blockchain.go (lines 779-780):

// Disable tracing for prefetcher executions.
vmCfg := bc.cfg.VmConfig
vmCfg.Tracer = nil

This is the exact same defensive clearing that should be restored in vmConfig().

Step-by-step proof

  1. User starts a node with --vmtrace (e.g., for live transaction logging).
  2. eth/backend.go:313 sets options.VmConfig.Tracer = t (a live tracer).
  3. Blockchain is initialized with this config, so blockchain.GetVMConfig().Tracer != nil.
  4. Miner's vmConfig() copies the config via *w.chain.GetVMConfig() — the Tracer field carries over.
  5. In interpreter.go, debug := evm.Config.Tracer != nil evaluates to true.
  6. The condition !debug fails, so the switch-dispatch fast path is bypassed.
  7. Block production falls back to the slower jump-table interpreter, losing the optimization.
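Steps 5-6 come down to a nil check on the tracer; a simplified standalone sketch (the `Config` fields echo the PR's `vm.Config`, but the `run` function and its return strings are placeholders for illustration):

```go
package main

import "fmt"

// Tracer stands in for the EVM's tracing hooks interface.
type Tracer interface{ OnOpcode(pc uint64) }

// Config mirrors the two vm.Config fields relevant to the dispatch choice.
type Config struct {
	Tracer               Tracer
	EnableSwitchDispatch bool
}

// run picks the interpreter path the way the PR's gating is described:
// the fast path requires the flag AND no tracer.
func run(cfg Config) string {
	debug := cfg.Tracer != nil
	if cfg.EnableSwitchDispatch && !debug {
		return "switch-dispatch fast path"
	}
	return "jump-table interpreter"
}

type liveTracer struct{}

func (liveTracer) OnOpcode(uint64) {}

func main() {
	fmt.Println(run(Config{EnableSwitchDispatch: true}))
	// A tracer carried over from --vmtrace forces the slow path:
	fmt.Println(run(Config{EnableSwitchDispatch: true, Tracer: liveTracer{}}))
}
```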

Fix

Restore the line removed in 00f3fae:

func (w *worker) vmConfig() vm.Config {
    cfg := *w.chain.GetVMConfig()
    cfg.Tracer = nil
    return cfg
}

Member Author


This is not a bug; if we set --vmtrace, we'd want to skip the switch-dispatch fast path. This is a design choice.

Member Author


We want the block producers to inherit the full vm.config, and decide which path to take.

@claude

claude bot commented Mar 18, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@sonarqubecloud

Quality Gate failed

Failed conditions
1 Security Hotspot
3.4% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Comment on lines +318 to +322

case shapePushVal:
	e.p("if stack.top >= 1024 {\n")
	e.p("return nil, &ErrStackOverflow{stackLen: stack.top, limit: 1024}\n")
	e.p("}\n")


🟡 Nit: Inlined opcodes (PUSH, DUP, PC, MSIZE, PUSH0) report ErrStackOverflow{limit: 1024} while the default fallback case reports ErrStackOverflow{limit: operation.maxStack} where maxStack is 1023 for these net-push-1 operations. The overflow check itself is correct in both paths (stack.top >= 1024 == sLen > 1023), so this is purely a cosmetic inconsistency in the error message. Consider using 1024 - 1 or a named constant to keep the limit field consistent with the jumpTable fallback.

Extended reasoning...

What the bug is

The code generator (gen_dispatch/main.go) emits ErrStackOverflow{stackLen: stack.top, limit: 1024} for all inlined opcodes that use the shapePushVal shape (lines 318-322), emitDup (line 344), emitPushGeneric (line 375), and emitPush (line 393). However, the default fallback case in the generated interpreter_dispatch.go (around line 749) uses ErrStackOverflow{stackLen: sLen, limit: operation.maxStack}, where operation.maxStack is computed as 1024 + pop - push.

For all affected net-push-1 opcodes (DUP1-16, PUSH0-32, PC, MSIZE), maxStack = 1024 + pop - push. For example, DUP1 pops 1 and pushes 2, so maxStack = 1024 + 1 - 2 = 1023. PUSH1 pops 0 and pushes 1, so maxStack = 1024 + 0 - 1 = 1023. The inlined code reports limit: 1024 while the default path reports limit: 1023 for the same operation.

Why the overflow check is still correct

The inlined path checks stack.top >= 1024 and the default path checks sLen > operation.maxStack (i.e., sLen > 1023). Since sLen is stack.top (an integer), stack.top >= 1024 is mathematically equivalent to stack.top > 1023. Both paths correctly prevent the stack from growing beyond 1024 items. The actual overflow detection is identical.

Step-by-step proof of the inconsistency

  1. Assume the stack has exactly 1024 items (stack.top = 1024).
  2. Inlined DUP1 path: checks stack.top >= 1024 → true → returns ErrStackOverflow{stackLen: 1024, limit: 1024}.
  3. Default fallback for DUP1: computes sLen = stack.len() = 1024, checks sLen > operation.maxStack, i.e. 1024 > 1023 → true → returns ErrStackOverflow{stackLen: 1024, limit: 1023}.
  4. Same trigger condition, same stackLen, but limit differs: 1024 vs 1023.
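The equivalence of the two checks, and the purely cosmetic difference in the reported limit, fits in a few lines (limits hard-coded per the analysis above):

```go
package main

import "fmt"

func main() {
	const capacity = 1024    // absolute stack capacity used by the inlined check
	maxStack := capacity - 1 // operation.maxStack for a net-push-1 opcode

	top := 1024                    // stack already full
	inlined := top >= capacity     // inlined fast-path check
	fallback := top > maxStack     // jump-table fallback check

	fmt.Println(inlined == fallback) // true: identical trigger condition
	fmt.Println(capacity, maxStack)  // 1024 1023: only the reported limit differs
}
```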

Impact assessment

This has zero functional impact:

  • Stack overflow errors consume all remaining gas regardless of the limit field value.
  • The differential test suite (dispatch_test.go) uses classifyErr which only checks error type via errors.As, not field values.
  • No downstream code in the codebase inspects the limit field for decision-making.
  • The Error() string representation would show (1024) vs (1023) but this string is never parsed.

Suggested fix

In gen_dispatch/main.go, the shapePushVal emitter (line 320), emitDup (line 344), emitPushGeneric (line 375), and emitPush (line 393) could use 1024 - 1 (i.e., 1023) for the limit field to match the operation.maxStack value that the default path would report. Alternatively, a comment explaining the intentional difference (reporting absolute capacity vs operation-specific limit) would suffice.

@claude

claude bot commented Mar 19, 2026

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@adamdossa adamdossa requested review from cffls and marcello33 March 23, 2026 10:13
Contributor

@cffls cffls left a comment


Looks good overall. The patch coverage is 66%. Any ideas if we can bump it up?

@@ -0,0 +1,594 @@
// Code generator for interpreter_dispatch.go — the EVM switch dispatch.
// Adapted from GEVM's gen/main.go (github.com/Giulio2002/gevm).
Contributor


What's the SOP to update this file since it is taken from another repo? If there is an EVM update in the future, what should we do with it?

VMTraceJsonConfig string `hcl:"vmtrace.jsonconfig,optional" toml:"vmtrace.jsonconfig,optional"`

// Use switch-based fast path EVM interpreter
EnableSwitchDispatch bool `hcl:"switch-dispatch,optional" toml:"switch-dispatch,optional"`
Contributor


Maybe add evm to this flag, e.g. evm-switch-dispatch? It would allow users to understand the flag without looking up the description/doc.

Collaborator

@marcello33 marcello33 left a comment


LGTM, 2 minor comments

StatelessSelfValidation bool

// Use switch-based fast path EVM interpreter
EnableSwitchDispatch bool
Collaborator


Do we need to add this in the generated TOML codec for the marshal/unmarshal helpers? I mean in eth/ethconfig/gen_config.go?

type Stack struct {
	data []uint256.Int
	top  int
	data [1024]uint256.Int
Collaborator


Did we quantify the memory tradeoff here? I believe this change removes the append overhead, but also each stack could use more memory even for small contracts. Did we evaluate this during testing? Or maybe worth adding a benchmark test?
