Use external hip and hipcub headers when HOOMD_GPU_PLATFORM=CUDA #2178
Builds fail with 7.1 on CUDA 12.
Also document Pixi installation instructions.
|
@mphoward, what do you think of using an external HIP library even when Ideally we could avoid this by using In any case, I won't merge this until |
|
Hmm, it isn't ideal to require CUDA users to download and compile HIP, but I agree this fix is substantially less work than getting Do you have a rough sense of when |
I do not. When working on this branch, I was surprised to find that I am concerned that HIP/CUDA interoperation is no longer actively supported by AMD. They seem to measure success by "can it run PyTorch?" and have even made the effort to create a custom build system that builds ROCm, HIP, and PyTorch: https://github.com/ROCm/TheRock |
I agree, this is all quite concerning. It might also explain why they have more closely matched the HIP kernel launch / device code interface with CUDA in recent major releases (it is a replacement, not a compatibility layer). Lack of proper CUDA support would be a good reason to put in the effort to update and use hipper as our own compatibility layer. I can prioritize working on that, but the soonest I can have a look would be in ~2 weeks (after the semester ends). It will require breaking changes to hipper first, and I am fine with jumping to whatever minimum versions of CUDA and ROCm HOOMD needs, to keep the work as limited as possible. |
|
We'll give AMD some time and see if they add CUDA 13 support. If you do plan an eventual hipper refactor, the minimum CUDA I need is 12.8. NCSA Delta has just updated to that version. When I submitted this PR, the system was still on CUDA 12.4. |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
|
This pull request has been automatically closed because it has not had recent activity. |
|
No new commits on |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
Description
find_package(hip) to find external headers.
Motivation and context
CUDA 13 contains many breaking changes and the vendored headers do not support it.
By using external hip libraries, HOOMD-blue will gain support for new versions of CUDA as soon as upstream adds support (at this time the latest release of hipcub does not support CUDA 13).
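As a rough illustration of the approach described above, a CMake project can locate an externally installed hip package in config mode. This is a minimal sketch under stated assumptions, not HOOMD-blue's actual build code: the project name hip_header_check is hypothetical, hip's install prefix is assumed to be on CMAKE_PREFIX_PATH, and any platform selection hip needs for NVIDIA GPUs is assumed to be configured already. Locating hipcub is a separate step and is not shown.

```cmake
cmake_minimum_required(VERSION 3.21)

# Hypothetical standalone project used only to check that hip can be found.
project(hip_header_check LANGUAGES CXX)

# Assumption: hip was installed to a prefix listed in CMAKE_PREFIX_PATH,
# e.g. configure with -DCMAKE_PREFIX_PATH=/path/to/hip/prefix.
find_package(hip REQUIRED CONFIG)

# hip_VERSION and hip_DIR are set by find_package itself in config mode.
message(STATUS "Found hip ${hip_VERSION} (config in ${hip_DIR})")
```

If the configure step stops at find_package, the external headers are not visible to CMake and the prefix path is the first thing to check.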
conda-forge does not support HIP_PLATFORM=nvidia (conda-forge/hip-feedstock#9) in hip-devel and lacks a hipcub package entirely. Therefore, users that build HOOMD from source for NVIDIA GPUs will need to install hip and hipcub headers:
How has this been tested?
HOOMD-blue compiles and passes tests with CUDA 12.9, rocm-systems:hip-version_7.2.53220, and rocm-libraries:rocm-7.1.0 locally. CI checks have been updated accordingly. Patches to hip and hipcub fix build errors with CUDA 12.5–12.8.
Checklist:
sphinx-doc/credits.rst) in the pull request source branch.
CHANGELOG.rst following the established format.