Commit 44cb7be
Inline quintic extension (#197)
* perf(quintic): force-inline quintic_mul + packed Mul impls
LLVM was not inlining quintic_mul despite #[inline], which is only a hint: the
monomorphized body is large enough that LLVM's inlining cost heuristic declined
it. Each call site paid ~5 cycles of function-call overhead. With quintic_mul
called millions of times per proof, this added up to ~2.4% of total runtime.
Zen 4 (c7a.2xlarge): -2.38% on xmss_leaf_1400sigs, p=0.0, revert-A/B confirmed.
* perf(quintic): force-inline quintic_square, quintic_mul_packed, MulAssign
Extends the previous commit's inlining pattern to additional multiplication-
related functions: quintic_square, all platform-specific quintic_mul_packed
variants (AVX-512/AVX2/NEON/fallback), and MulAssign<Self>/MulAssign<QEF>.
Testing established the I-cache budget boundary for forced inlining on Zen 4:
these 9 functions are the optimal set. Inlining more (e.g. Add/Sub/Neg) causes
regression from expanded code size.
Zen 4 (c7a.2xlarge): additional -1.25% on xmss_leaf_1400sigs, p=0.0,
revert-A/B confirmed. Combined with previous commit: ~-3.6% total.
1 parent e5cd331 commit 44cb7be
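The pattern the commit message describes can be sketched as below. This is an illustrative stand-in, not the crate's code: the `Quintic` type, the scalar `u64` representation, and the reduction rule X^5 = 1 - X^2 are all assumptions made for the example, and the real `quintic_mul` is generic and has packed variants. It only shows where `#[inline(always)]`, Rust's force-inline attribute, would replace the `#[inline]` hint.

```rust
use std::ops::{Mul, MulAssign};

/// KoalaBear prime, 2^31 - 2^24 + 1.
const P: u64 = 2_130_706_433;

/// Hypothetical stand-in for a quintic extension element:
/// five base-field coefficients, lowest degree first.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Quintic(pub [u64; 5]);

/// Schoolbook product, reduced with the ASSUMED rule X^5 = 1 - X^2.
/// `#[inline(always)]` tells LLVM to inline regardless of its cost
/// heuristic; plain `#[inline]` is only a hint.
#[inline(always)]
pub fn quintic_mul(a: &[u64; 5], b: &[u64; 5]) -> [u64; 5] {
    // The unreduced product has degree at most 8.
    let mut t = [0u64; 9];
    for i in 0..5 {
        for j in 0..5 {
            t[i + j] = (t[i + j] + a[i] * b[j]) % P;
        }
    }
    // Fold degrees 8 down to 5: X^k = X^(k-5) - X^(k-3).
    for k in (5..9).rev() {
        let c = t[k];
        t[k] = 0;
        t[k - 5] = (t[k - 5] + c) % P;
        t[k - 3] = (t[k - 3] + P - c) % P;
    }
    [t[0], t[1], t[2], t[3], t[4]]
}

impl Mul for Quintic {
    type Output = Self;

    // Force-inline the operator wrapper too, so no call remains
    // between the caller and the multiplication body.
    #[inline(always)]
    fn mul(self, rhs: Self) -> Self {
        Quintic(quintic_mul(&self.0, &rhs.0))
    }
}

impl MulAssign for Quintic {
    #[inline(always)]
    fn mul_assign(&mut self, rhs: Self) {
        *self = *self * rhs;
    }
}
```

Forcing the attribute on every arithmetic helper is not free: each call site gets its own copy of the body, which is the code-size growth the second commit message reports as a regression once Add/Sub/Neg are included.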
3 files changed: +10 -10 lines changed
crates/backend/koala-bear/src/quintic_extension
Lines changed: 2 additions & 2 deletions
Changed lines: 530 and 549 (one line replaced at each).
Lines changed: 4 additions & 4 deletions
Changed lines: 464, 479, 519 and 530 (one line replaced at each).
Lines changed: 4 additions & 4 deletions
Changed lines: 11, 19, 76 and 160 (one line replaced at each).