Commit cb4a14f
[rocm-libraries] ROCm/rocm-libraries#5156 (commit 195bdc2)
[rocrand] Fix benchmark_rocrand_device_api launch parameters
(#5156)
## Motivation
When running `benchmark_rocrand_device_api` on certain GPU
architectures, kernel launches may fail with `hipErrorLaunchFailure`
because the launch parameters exceed the kernel's launch bounds. This PR
fixes the issue.
## Technical Details
The code that determines an optimal block size for launching the
benchmark kernels did not honor the kernel's `maxThreadsPerBlock`
attribute. As a result, the computed thread count could exceed
`maxThreadsPerBlock` on certain architectures.
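
The idea can be sketched as a simple clamp. This is a minimal illustration, not the code changed in this PR: `clamp_block_size` is a hypothetical helper, and it assumes the suggested block size comes from an occupancy query (for example `hipOccupancyMaxPotentialBlockSize`) while the limit comes from `hipFuncAttributes::maxThreadsPerBlock` as filled in by `hipFuncGetAttributes`. The sketch is plain C++, so it builds without a HIP toolchain.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not taken from the rocRAND sources).
// suggested:             block size proposed by an occupancy heuristic
//                        (assumed to come from a call such as
//                        hipOccupancyMaxPotentialBlockSize).
// max_threads_per_block: the kernel's own limit (assumed to come from
//                        hipFuncAttributes::maxThreadsPerBlock).
inline int clamp_block_size(int suggested, int max_threads_per_block)
{
    assert(suggested > 0 && max_threads_per_block > 0);
    // Never launch more threads per block than the kernel allows.
    return std::min(suggested, max_threads_per_block);
}
```

With a clamp like this, a suggestion of 1024 threads for a kernel limited to 256 threads per block would be reduced to 256 before launch, while a suggestion already within the limit is left unchanged.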
## Test Plan
Build and run `benchmark_rocrand_device_api` on gfx1200 to confirm that
the launch params are now within launch bounds.
## Test Result
The test passes.
## Submission Checklist
- [ ] Look over the contributing guidelines at
https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
Co-authored-by: Song <hsong@ctr2-alola-ctrl-01.amd.com>

1 parent f0c83d6 commit cb4a14f
1 file changed: 3 additions, 3 deletions (original lines 93, 94, and 96 removed; new lines 93 to 95 added).