Implement option to transpose neighbor lists of PrecomputedNeighborhoodSearch #137
Pull request overview
This PR adds an option to transpose the neighbor list data structure in `PrecomputedNeighborhoodSearch` so that GPU threads access memory in a coalesced pattern. With `transpose_backend = true`, the first neighbors of all points are stored contiguously in memory, instead of all neighbors of each point being stored contiguously. This can provide up to a 3x speedup on GPUs.
- Added `transpose_backend` parameter to `PrecomputedNeighborhoodSearch` and related functions
- Implemented transposed memory layout using `PermutedDimsArray` wrapper in `DynamicVectorOfVectors`
- Added comprehensive tests to verify both regular and transposed backend behavior
- Updated documentation to explain when to use transposed backend (GPU vs CPU optimization)
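The two memory layouts described above can be illustrated with a minimal sketch in plain Julia. This is not the code from this PR: the names `regular`, `storage`, and `transposed` and the sizes are hypothetical, and only `PermutedDimsArray` from Base is used. It shows how a wrapper can keep the `[neighbor, point]` indexing unchanged while the underlying memory stores the first neighbors of all points next to each other.

```julia
# Hypothetical sketch, NOT the PointNeighbors.jl implementation.
max_neighbors = 4   # assumed upper bound on neighbors per point
n_points = 3

# Regular layout: Julia arrays are column-major, so column `point`
# holds all neighbors of that point contiguously in memory.
regular = zeros(Int, max_neighbors, n_points)

# Transposed layout: allocate the storage with swapped dimensions and wrap it
# so that it is still indexed as [neighbor, point].
storage = zeros(Int, n_points, max_neighbors)
transposed = PermutedDimsArray(storage, (2, 1))

# Fill both containers with the same logical content.
# The value 10 * point + k encodes "k-th neighbor of `point`".
for point in 1:n_points, k in 1:max_neighbors
    regular[k, point] = 10 * point + k
    transposed[k, point] = 10 * point + k
end

# Both containers look identical when indexed ...
same_view = regular == transposed

# ... but the order in memory differs.
regular_memory = vec(regular)     # all neighbors of point 1, then point 2, ...
transposed_memory = vec(storage)  # first neighbors of all points, then second, ...

println(regular_memory)
println(transposed_memory)
```

In the transposed layout, consecutive threads that each process one point read consecutive memory locations when they all load their `k`-th neighbor, which is the access pattern GPUs can coalesce. On a CPU, where one thread typically loops over all neighbors of one point, the regular layout keeps that loop contiguous.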
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `test/unittest.jl` | Added inclusion of new test file for precomputed neighborhood search |
| `test/nhs_precomputed.jl` | New test file verifying both regular and transposed backend memory layouts |
| `src/vector_of_vectors.jl` | Extended `DynamicVectorOfVectors` constructor to support transpose option and added helper functions to check backend type |
| `src/nhs_precomputed.jl` | Added `transpose_backend` parameter with detailed documentation about CPU vs GPU optimization trade-offs |
| `src/nhs_grid.jl` | Added documentation note about `search_radius` type determining distance computation type |
| `src/nhs_trivial.jl` | Added documentation note about `search_radius` type determining distance computation type |
| `src/cell_lists/cell_lists.jl` | Updated `construct_backend` functions to accept and propagate `transpose_backend` parameter with validation |
| `src/PointNeighbors.jl` | Exported `DynamicVectorOfVectors` for public API access |
Codecov Report
❌ Patch coverage is

Coverage diff:

| | main | #137 | +/- |
|---|---|---|---|
| Coverage | 84.75% | 84.82% | +0.07% |
| Files | 15 | 15 | |
| Lines | 715 | 725 | +10 |
| Hits | 606 | 615 | +9 |
| Misses | 109 | 110 | +1 |
This PR implements the "transposed NL" (neighbor list) optimization mentioned in trixi-framework/TrixiParticles.jl#968. See that PR for benchmark results.