Add native CUDA kernel for hydraulic erosion (#961) #967

Merged
brendancol merged 3 commits into master from issue-961 on Mar 4, 2026

@brendancol
Contributor

Closes #961

Summary

  • Adds _erode_gpu_kernel (@cuda.jit), one CUDA thread per particle, with cuda.atomic.add for heightmap conflicts
  • Replaces the CPU-fallback path for CuPy and Dask+CuPy arrays with on-device execution
  • Refactors erode() to use ArrayTypeFunctionMapping dispatch instead of manual type checks
  • Adds test_erosion.py (15 tests across numpy, dask+numpy, cupy, dask+cupy)
  • Adds user guide notebook 21_Hydraulic_Erosion.ipynb
  • Updates README feature matrix from fallback to native for GPU columns
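To illustrate why the per-particle kernel needs atomic adds, here is a minimal CPU-only sketch in NumPy. It is not the PR's kernel: `deposit_atomic`, `deposit_racy`, and the sample data are hypothetical names made up for this example. `np.add.at` stands in for `cuda.atomic.add`, and plain fancy-index `+=` stands in for unsynchronized writes from several threads that land on the same heightmap cell.

```python
import numpy as np


def deposit_atomic(heightmap, rows, cols, amounts):
    """Accumulate every particle's deposit, even when cells repeat.

    np.add.at is unbuffered, so repeated (row, col) pairs all contribute.
    This plays the role that cuda.atomic.add plays on the GPU.
    """
    out = heightmap.copy()
    np.add.at(out, (rows, cols), amounts)
    return out


def deposit_racy(heightmap, rows, cols, amounts):
    """Buffered update: repeated cells keep only one of the deposits.

    This mimics the lost updates an unsynchronized GPU write would cause.
    """
    out = heightmap.copy()
    out[rows, cols] += amounts
    return out


# Hypothetical data: four particles, three of which hit cell (0, 1).
heightmap = np.zeros((2, 2))
rows = np.array([0, 0, 1, 0])
cols = np.array([1, 1, 0, 1])
amounts = np.array([0.5, 0.25, 1.0, 0.25])

safe = deposit_atomic(heightmap, rows, cols, amounts)
racy = deposit_racy(heightmap, rows, cols, amounts)
```

In this sketch `safe` conserves the total deposited amount, while `racy` drops the deposits that collided at the shared cell.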

Notes

  • GPU results are non-deterministic because cuda.atomic.add ordering depends on thread scheduling. GPU tests use correlation checks instead of exact comparison.
  • The GPU kernel uses float64 arithmetic to match the CPU path.
  • Erosion is a global operation (particles traverse the full grid), so dask arrays are still materialized before processing. They just run on-device now instead of round-tripping through NumPy.
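The correlation-based comparison mentioned above could look roughly like the sketch below. Everything here is an assumption for illustration: the `correlation` helper, the 0.99 threshold, and the synthetic `gpu_like` array (a CPU array plus tiny noise) are not taken from test_erosion.py. The first lines show the underlying cause: floating-point addition is not associative, so the order in which atomic adds land can change the result.

```python
import numpy as np

# Floating-point addition is not associative, so summation order matters.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
order_matters = left != right


def correlation(a, b):
    """Pearson correlation between two same-shaped arrays."""
    return float(np.corrcoef(np.ravel(a), np.ravel(b))[0, 1])


# Synthetic stand-ins for a CPU result and a slightly different GPU result.
rng = np.random.default_rng(0)
cpu_like = rng.random((32, 32))
gpu_like = cpu_like + 1e-6 * rng.standard_normal((32, 32))

r = correlation(cpu_like, gpu_like)

# An exact comparison would fail here, but a correlation check passes.
# The 0.99 threshold is an assumed value, not the one used in the PR.
exact_match = np.array_equal(cpu_like, gpu_like)
passes_correlation_check = r > 0.99
```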

Test plan

  • 15 new tests in test_erosion.py, all passing
  • Verify GPU tests pass on CI with CUDA
  • Run user guide notebook end-to-end to check plots render

Replaces the CPU-fallback path for CuPy and Dask+CuPy arrays with a
real GPU kernel. Each particle maps to one CUDA thread; conflicts at
shared heightmap cells are resolved with cuda.atomic.add.

Refactors erode() to use ArrayTypeFunctionMapping dispatch instead of
manual type checks.
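As a rough picture of what type-based dispatch replaces, the sketch below maps an array's backend to a function instead of chaining isinstance checks. `BackendDispatch` and `erode_numpy` are hypothetical stand-ins written for this example, not the library's ArrayTypeFunctionMapping, and the sketch does not distinguish a Dask array backed by CuPy from one backed by NumPy.

```python
import numpy as np


class BackendDispatch:
    """Hypothetical dispatcher: picks a function from the array's module."""

    def __init__(self, **funcs):
        # Keys are top-level module names such as "numpy", "cupy", "dask".
        self._funcs = funcs

    def __call__(self, data):
        backend = type(data).__module__.split(".")[0]
        try:
            return self._funcs[backend]
        except KeyError:
            raise TypeError(f"unsupported array type: {type(data)!r}") from None


def erode_numpy(data):
    """Placeholder for a CPU implementation; returns a copy unchanged."""
    return data.copy()


dispatch = BackendDispatch(numpy=erode_numpy)

# The caller asks the dispatcher for a function, then calls it.
chosen = dispatch(np.zeros((2, 2)))
result = chosen(np.ones((2, 2)))
```

Adding a backend then means registering one more function rather than editing a branch inside erode().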

Also adds test_erosion.py with 15 tests covering numpy, dask+numpy,
cupy, and dask+cupy backends.

github-actions bot added the performance label (PR touches performance-sensitive code) on Mar 4, 2026
brendancol merged commit f08cb0f into master on Mar 4, 2026
11 checks passed