
Commit a231c5b

committed
1 parent 1a5f227 commit a231c5b

60 files changed: 3790 additions, 0 deletions

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+<h1>Buddy-Benchmark results</h1><ul>
+
+</ul>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-layer-ffn-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-layer-ffn-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_LAYER_FFN/Scalar</td><td>0.0654</td><td>0.0654</td><td>10,762</td></tr>
+<tr><td style='text-align:left'>DL_LAYER_FFN/Auto_Vectorization</td><td>0.0271</td><td>0.0271</td><td>25,673</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:26:49+00:00
+Running ./dl-layer-ffn-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.04, 1.19, 1.31
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+--------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+--------------------------------------------------------------------------
+DL_LAYER_FFN/Scalar 0.065 ms 0.065 ms 10762
+DL_LAYER_FFN/Auto_Vectorization 0.027 ms 0.027 ms 25673
+-----------------------------------------------------------
+Correctness Verification: PASS
+-----------------------------------------------------------
+</pre></details>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-layer-rmsnorm-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-layer-rmsnorm-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_LAYER_RMSNORM/Scalar</td><td>0.00196</td><td>0.00196</td><td>356,202</td></tr>
+<tr><td style='text-align:left'>DL_LAYER_RMSNORM/Auto_Vectorization</td><td>0.000915</td><td>0.000915</td><td>751,546</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:26:53+00:00
+Running ./dl-layer-rmsnorm-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.03, 1.19, 1.30
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+------------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+------------------------------------------------------------------------------
+DL_LAYER_RMSNORM/Scalar 0.002 ms 0.002 ms 356202
+DL_LAYER_RMSNORM/Auto_Vectorization 0.001 ms 0.001 ms 751546
+-----------------------------------------------------------
+Correctness Verification: PASS
+-----------------------------------------------------------
+</pre></details>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-layer-selfattention-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-layer-selfattention-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_LAYER_ATTENTION/Scalar</td><td>4.69</td><td>4.69</td><td>149</td></tr>
+<tr><td style='text-align:left'>DL_LAYER_ATTENTION/Auto_Vectorization</td><td>1.57</td><td>1.57</td><td>446</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:26:51+00:00
+Running ./dl-layer-selfattention-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.04, 1.19, 1.31
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+--------------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+--------------------------------------------------------------------------------
+DL_LAYER_ATTENTION/Scalar 4.69 ms 4.69 ms 149
+DL_LAYER_ATTENTION/Auto_Vectorization 1.57 ms 1.57 ms 446
+-----------------------------------------------------------
+Correctness Verification: PASS
+-----------------------------------------------------------
+</pre></details>
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-model-lenet-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-model-lenet-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_MODEL_LENET/Auto_Vectorization</td><td>0.165</td><td>0.165</td><td>4,304</td></tr>
+<tr><td style='text-align:left'>DL_MODEL_LENET/Buddy_Vectorization</td><td>0.137</td><td>0.137</td><td>5,022</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:22:52+00:00
+Running ./dl-model-lenet-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.40, 1.39, 1.40
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+-----------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+-----------------------------------------------------------------------------
+DL_MODEL_LENET/Auto_Vectorization 0.165 ms 0.165 ms 4304
+DL_MODEL_LENET/Buddy_Vectorization 0.137 ms 0.137 ms 5022
+-----------------------------------------------------------
+Correctness Verification:
+Transform case: FAIL
+-----------------------------------------------------------
+</pre></details>
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-model-mobilenetv3-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-model-mobilenetv3-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>BM_MobileNet_V3/BM_MobileNet_V3_scalar</td><td>37.1</td><td>37.1</td><td>19</td></tr>
+<tr><td style='text-align:left'>BM_MobileNet_V3/BM_MobileNet_V3_conv_opt</td><td>33</td><td>33</td><td>21</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:22:49+00:00
+Running ./dl-model-mobilenetv3-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.40, 1.39, 1.40
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+-----------------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+-----------------------------------------------------------------------------------
+BM_MobileNet_V3/BM_MobileNet_V3_scalar 37.1 ms 37.1 ms 19
+BM_MobileNet_V3/BM_MobileNet_V3_conv_opt 33.0 ms 33.0 ms 21
+-----------------------------------------------------------
+Correctness Verification:
+Transform case: PASS
+-----------------------------------------------------------
+</pre></details>
Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-model-resnet18-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-model-resnet18-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_MODEL_Resnet18/Auto_Vectorization</td><td>731</td><td>723</td><td>1</td></tr>
+<tr><td style='text-align:left'>DL_MODEL_Resnet18/Buddy_Vectorization</td><td>729</td><td>722</td><td>1</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:26:46+00:00
+Running ./dl-model-resnet18-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.04, 1.19, 1.31
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+--------------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+--------------------------------------------------------------------------------
+DL_MODEL_Resnet18/Auto_Vectorization 731 ms 723 ms 1
+DL_MODEL_Resnet18/Buddy_Vectorization 729 ms 722 ms 1
+-----------------------------------------------------------
+Correctness Verification: PASS
+-----------------------------------------------------------
+</pre></details>
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-model-tinyllama-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-model-tinyllama-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_MODEL_TINYLLAMA/scalar</td><td>1.39e+05</td><td>1.39e+05</td><td>1</td></tr>
+<tr><td style='text-align:left'>DL_MODEL_TINYLLAMA/matmul_opt</td><td>1e+04</td><td>1e+04</td><td>1</td></tr>
+<tr><td style='text-align:left'>DL_MODEL_TINYLLAMA/matmul_opt_omp</td><td>7.84e+03</td><td>7.2e+03</td><td>1</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:17:33+00:00
+Running ./dl-model-tinyllama-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.70, 1.92, 1.54
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+----------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+----------------------------------------------------------------------------
+DL_MODEL_TINYLLAMA/scalar 139185 ms 139179 ms 1
+DL_MODEL_TINYLLAMA/matmul_opt 10038 ms 10038 ms 1
+DL_MODEL_TINYLLAMA/matmul_opt_omp 7836 ms 7201 ms 1
+---------- Verification ----------
+matmul_opt PASS
+matmul_opt_omp PASS
+</pre></details>
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-model-whisper-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-model-whisper-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>DL_MODEL_Whisper/Auto_Vectorization</td><td>8e+04</td><td>8e+04</td><td>1</td></tr>
+<tr><td style='text-align:left'>DL_MODEL_Whisper/Buddy_Vectorization</td><td>3.67e+04</td><td>3.67e+04</td><td>1</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:22:54+00:00
+Running ./dl-model-whisper-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.45, 1.40, 1.40
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+-------------------------------------------------------------------------------
+Benchmark Time CPU Iterations
+-------------------------------------------------------------------------------
+DL_MODEL_Whisper/Auto_Vectorization 79983 ms 79980 ms 1
+DL_MODEL_Whisper/Buddy_Vectorization 36713 ms 36700 ms 1
+-----------------------------------------------------------
+Correctness Verification for Output1: PASS
+Correctness Verification for Output2: FAIL
+-----------------------------------------------------------
+</pre></details>
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+
+<style>
+body{font-family:system-ui,Arial,sans-serif;margin:2rem;max-width:80ch}
+table{border-collapse:collapse;margin:1rem 0}
+th,td{border:1px solid #bbb;padding:.3rem .6rem;text-align:right}
+th{text-align:center;background:#f0f0f0}
+tr:nth-child(even){background:#fafafa}
+details{border:1px solid #ccc;border-radius:.4rem;padding:.6rem}
+summary{font-weight:600;cursor:pointer}
+.err{border:2px solid #c00;background:#fee;padding:1rem;border-radius:.5rem}
+</style>
+
+<h2>deeplearning/dl-op-linalg-arithaddf-benchmark.json</h2><p><em>2025-07-27 17:54:34 UTC</em></p>
+<h3>dl-op-linalg-arithaddf-benchmark.json</h3>
+<table><tr><th>Name</th><th>Time&nbsp;(ms)</th><th>CPU&nbsp;(ms)</th><th>Iterations</th></tr>
+<tr><td style='text-align:left'>BM_ADDF_SCALAR</td><td>0.0295</td><td>0.0295</td><td>23,451</td></tr>
+<tr><td style='text-align:left'>BM_ADDF_AutoVectorization</td><td>0.004</td><td>0.004</td><td>174,931</td></tr></table>
+<details><summary>Console output</summary>
+<pre>2025-07-27T14:27:23+00:00
+Running ./dl-op-linalg-arithaddf-benchmark
+Run on (24 X 5100 MHz CPU s)
+CPU Caches:
+L1 Data 48 KiB (x12)
+L1 Instruction 32 KiB (x12)
+L2 Unified 1280 KiB (x12)
+L3 Unified 30720 KiB (x1)
+Load Average: 1.07, 1.18, 1.30
+***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+--------------------------------------------------------------------
+Benchmark Time CPU Iterations
+--------------------------------------------------------------------
+BM_ADDF_SCALAR 0.030 ms 0.030 ms 23451
+BM_ADDF_AutoVectorization 0.004 ms 0.004 ms 174931
+-----------------------------------------------------------
+Correctness Verification:
+Transform case: PASS
+-----------------------------------------------------------
+</pre></details>
