portable_simd version of Avg (4bpp) #641
AI disclosure: I wrote an original sliding-window |
There's another 4bpp case for the first row, where Lines 612 to 624 in f33b850 |
I hadn't tried it - wrote some quick code for it but it seems that the |
ac6b46c to 096960b
Also, if any contributors have access to some Intel hardware, could they give this |
Rebaselining to rustc/cargo 1.93.0-nightly (2a7c49606 2025-11-25):
Overall, I'd say it's still probably worth it for aarch64 systems; the A520 gain is particularly nice to have for low-end devices.
096960b to d221c8f
Again, Cortex-A520 seems to be the big winner here, going from 434 MiB/s to about 740 MiB/s (about 70% faster); the X4 benefits less (about 13%).
d221c8f to 94039f0
Implements an RGBA version of the Avg filter with portable_simd intrinsics.
Marked as draft until #632 is completed.
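For anyone following along, here is a minimal scalar sketch of what Avg reconstruction computes at 4 bytes per pixel, following the PNG specification's definition (each byte gets the floor of the average of the byte one pixel to the left and the byte directly above added back, modulo 256). This is not the code in this PR: the function name `unfilter_avg_4bpp` is hypothetical, and it uses no portable_simd so that it builds on stable Rust. It is only meant as a reference that a vectorized version would need to match byte for byte.

```rust
/// Reverses the PNG Avg filter in place for 4 bytes per pixel (e.g. RGBA8).
///
/// `prev` is the already reconstructed previous row. For the first row of an
/// image there is no previous row, so the caller passes all zeros.
/// NOTE: hypothetical helper for illustration, not taken from the PR.
fn unfilter_avg_4bpp(prev: &[u8], cur: &mut [u8]) {
    const BPP: usize = 4;
    assert_eq!(prev.len(), cur.len());
    assert_eq!(cur.len() % BPP, 0);

    // Reconstructed pixel to the left; zero for the first pixel in the row.
    let mut left = [0u8; BPP];

    for (cur_px, prev_px) in cur.chunks_exact_mut(BPP).zip(prev.chunks_exact(BPP)) {
        for i in 0..BPP {
            // Widen to u16 so the sum of two bytes cannot overflow before halving.
            let avg = ((left[i] as u16 + prev_px[i] as u16) / 2) as u8;
            // The filter is defined modulo 256, hence the wrapping add.
            cur_px[i] = cur_px[i].wrapping_add(avg);
        }
        // Each pixel depends on the one just reconstructed, which is the
        // serial dependency a SIMD version has to work around.
        left.copy_from_slice(cur_px);
    }
}
```

Calling it with a zeroed `prev` slice covers the first-row case, where the average reduces to half of the left pixel.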