Hello,
I can see that the recent commit adding V80 support introduces a 'block' HBM mode, which allows each HBM channel to be addressed directly. I believe this is the extension described in the Coyote v2 paper: 'For applications that require the full HBM bandwidth, it is possible to bypass the MMU and directly expose certain HBM channels.'
As of now, this feature does not appear to be supported on the u55c (or on any UltraScale+ platform), judging from this driver message:

    dbg_info("memory block specified, but UltraScale+ devices do not support block memory; ignoring...\n");
Are there any plans to bring block memory support (explicit per-channel addressing) to the u55c?
Meanwhile, Figure 7a of the paper shows read throughput saturating at around 60 GB/s. Would it be plausible to attempt a design with DEST_BITS 5 and N_AXI_CARD 31 to get closer to the documented u55c HBM2 bandwidth of 460 GB/s?
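For context, here is a rough back-of-envelope sketch of the numbers behind this question. It assumes the u55c exposes 32 HBM pseudo-channels that share the 460 GB/s peak evenly, and that DEST_BITS selects among 2^DEST_BITS destinations; both are my assumptions rather than anything confirmed from the Coyote source. The function names are hypothetical, and the result is only an idealized upper bound, not an expected measurement.

```python
# Back-of-envelope HBM bandwidth estimate (idealized upper bound only).
# Constants come from the question or are labeled assumptions; none are measured.

PEAK_HBM_GBPS = 460.0   # documented u55c HBM2 peak bandwidth (from the question)
PSEUDO_CHANNELS = 32    # ASSUMPTION: 32 HBM pseudo-channels sharing the peak evenly
OBSERVED_GBPS = 60.0    # approximate saturation point read off Figure 7a


def per_channel_peak(peak_gbps: float = PEAK_HBM_GBPS,
                     channels: int = PSEUDO_CHANNELS) -> float:
    """Ideal bandwidth of a single pseudo-channel, in GB/s."""
    return peak_gbps / channels


def ideal_aggregate(n_channels: int,
                    peak_gbps: float = PEAK_HBM_GBPS,
                    channels: int = PSEUDO_CHANNELS) -> float:
    """Ideal aggregate bandwidth when n_channels are driven in parallel."""
    if not 0 <= n_channels <= channels:
        raise ValueError("n_channels must be between 0 and the channel count")
    return n_channels * per_channel_peak(peak_gbps, channels)


def max_destinations(dest_bits: int) -> int:
    """ASSUMPTION: DEST_BITS encodes a destination index, so 2**bits targets."""
    return 1 << dest_bits


if __name__ == "__main__":
    print(f"per-channel peak:       {per_channel_peak():.3f} GB/s")
    print(f"31 channels, ideal:     {ideal_aggregate(31):.3f} GB/s")
    print(f"destinations at 5 bits: {max_destinations(5)}")
    # How many channels' worth of ideal bandwidth the observed 60 GB/s represents.
    print(f"60 GB/s in channels:    {OBSERVED_GBPS / per_channel_peak():.1f}")
```

Under these assumptions, 31 channels would top out at roughly 446 GB/s in the ideal case, while the observed 60 GB/s corresponds to only about four channels' worth of bandwidth. Real designs will land below the ideal figure because of arbitration, crossbar traffic, and access-pattern effects.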
Thank you!
(Coyote/driver/src/vfpga/vfpga_gup.c, line 401 in cdf54c8)