Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1044 +/- ##
==========================================
- Coverage 89.63% 88.73% -0.91%
==========================================
Files 121 122 +1
Lines 8763 8852 +89
==========================================
Hits 7855 7855
- Misses 908 997 +89
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| # Apply to all data | ||
| active_particle .= active_particle_sorted[perm2] | ||
| system_coords .= system_coords[:, combined_perm] | ||
| system_velocity .= system_velocity[:, combined_perm] | ||
| system_pressure .= system_pressure[combined_perm] | ||
| system_density .= system_density[combined_perm] |
There was a problem hiding this comment.
This doesn't make sense. It's system-specific (e.g. TLSPH doesn't have pressure). You need to pass the combined_perm to sort_system!.
There was a problem hiding this comment.
Okay, I just assumed that only fluid systems are sorted.
It doesn't make much sense to sort a TLSPH, right?
There was a problem hiding this comment.
Yes, but we might want to sort DEM (not a priority though IMO, so we could skip this if it makes things easier). But what about IISPH? I'll have to check again if we need to sort anything else there.
If this actually works for all fluid systems, I'd skip DEM.
| @@ -0,0 +1,165 @@ | |||
| struct SortingCallback{I} | |||
There was a problem hiding this comment.
| struct SortingCallback{I} | |
| # These are the systems that require sorting. | |
| # Boundary particles always stay fixed relative to each other, TLSPH computes in the initial configuration. | |
| const SortingSystem = Union{AbstractFluidSystem, DEMSystem} | |
| struct SortingCallback{I} |
There was a problem hiding this comment.
I don't like the name SortingSystem, but I don't have a better idea.
There was a problem hiding this comment.
hm how about DynamicNeighborhoodSystem? Or a name that indicates, that the trajectory of single particles change a lot during simulation.
| end | ||
|
|
||
| # TODO: Sort also masses and particle spacings for variable smoothing lengths. | ||
| function sort_particles!(system::AbstractFluidSystem, v, u, nhs, |
There was a problem hiding this comment.
| function sort_particles!(system::AbstractFluidSystem, v, u, nhs, | |
| function sort_particles!(system::SortingSystem, v, u, nhs, |
| cell_ids[particle] = PointNeighbors.cell_index(cell_list, | ||
| PointNeighbors.cell_coords(point_coords, | ||
| nhs)) |
There was a problem hiding this comment.
| cell_ids[particle] = PointNeighbors.cell_index(cell_list, | |
| PointNeighbors.cell_coords(point_coords, | |
| nhs)) | |
| cell_coords[particle] = PointNeighbors.cell_coords(point_coords, nhs) |
This works for all cell lists, and sorting by cell index doesn't make sense for anything other than the cell-based GridNeighborhoodSearch, so I think this is fine and we don't need any new API.
| # TODO: Sort also masses and particle spacings for variable smoothing lengths. | ||
| function sort_particles!(system::AbstractFluidSystem, v, u, nhs, | ||
| cell_list::FullGridCellList, semi) | ||
| cell_ids = zero(allocate(semi.parallelization_backend, Int, nparticles(system))) |
There was a problem hiding this comment.
| cell_ids = zero(allocate(semi.parallelization_backend, Int, nparticles(system))) | |
| cell_coords = allocate(semi.parallelization_backend, SVector{ndims(system), Int}, nparticles(system)) |
| nhs)) | ||
| end | ||
|
|
||
| perm = sortperm(transfer2cpu(cell_ids)) |
There was a problem hiding this comment.
| perm = sortperm(transfer2cpu(cell_ids)) | |
| perm = sortperm(transfer2cpu(cell_coords)) |
| nhs = get_neighborhood_search(system, semi) | ||
| cell_list = nhs.cell_list |
There was a problem hiding this comment.
| nhs = get_neighborhood_search(system, semi) | |
| cell_list = nhs.cell_list | |
| nhs = get_neighborhood_search(system, semi) | |
| if !(nhs isa GridNeighborhoodSearch) | |
| throw(ArgumentError("`SortingCallback` can only be used with a `GridNeighborhoodSearch`") | |
| end |
Or do we need this for the trivial NHS as well?

Background
I ran a simulation with approximately 1.8 million fluid particles.
After restarting the simulation at t = 5.0 s, I observed a severe performance regression, despite no code changes between the initial run and the restart.
For the same number of iterations, the wall time increased from ~4.4 h to ~20 h.

At first glance, this was surprising because:
To isolate the issue, I ran the simulation for 4 iterations starting from:
The performance regression:
t = 0.0s
t = 5.0
Benchmarking a single
interact!call showed a slowdown from roughly 10x withThis strongly suggested that the issue was not allocation-related, but rather due to memory access patterns.
After an extended debugging session, @efaulhaber suggested that the slowdown might be caused by cache misses.
Initially, we were skeptical, but to test this hypothesis I created this PR.
The figures below show the particle indexing at t=0 (left) and t=5 (right).
Results after sorting by cell ID for t=5
t=0 sorted
Note that there are different number of active particles and the location is also not exactly identical.
The following plot shows a 24-hour simulation run comparing the sorted (orange, sorted every 1000 step) and unsorted (blue) particle layouts. Due to buffered stdout, the unsorted run did not flush all output before the job was terminated, so the plotted data for this case is incomplete. The figure therefore illustrates only a trend. More detailed benchmarks should still be carried out.