Hi,
I've parallelized most of the loops used in the algorithm using rayon here: AdrianEddy@cda5e3e
Ideally this could be merged, but in my opinion it should be an optional feature. That makes it a bit harder to keep the code clean for both the single-threaded path and the multi-threaded one, especially with the generics and the Send + Sync bounds.
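To illustrate the feature-flag idea, here is a minimal sketch of how the two paths could sit side by side. It is not the code from the linked commit: the function `sum_of_squares` and the feature name `rayon` are made up for the example, and it assumes rayon is declared as an optional dependency in Cargo.toml.

```rust
// Hypothetical sketch: `sum_of_squares` and the `rayon` feature name are
// illustrative only and are not taken from the linked commit.
// Assumed Cargo.toml entry:
//   [dependencies]
//   rayon = { version = "1", optional = true }

#[cfg(feature = "rayon")]
use rayon::prelude::*;

/// Multi-threaded path, compiled only when the `rayon` feature is enabled.
/// `T: Sync` is required because the slice is shared across worker threads.
#[cfg(feature = "rayon")]
pub fn sum_of_squares<T>(items: &[T]) -> f64
where
    T: Copy + Into<f64> + Sync,
{
    items
        .par_iter()
        .map(|&x| {
            let v: f64 = x.into();
            v * v
        })
        .sum()
}

/// Single-threaded path, compiled when the feature is disabled.
/// Same name and return type, but without the thread-safety bound.
#[cfg(not(feature = "rayon"))]
pub fn sum_of_squares<T>(items: &[T]) -> f64
where
    T: Copy + Into<f64>,
{
    items
        .iter()
        .map(|&x| {
            let v: f64 = x.into();
            v * v
        })
        .sum()
}
```

Callers use `sum_of_squares(&data)` either way. The catch is that the bounds differ between the two builds, so every generic function on the call chain needs the same duplication (or a shared helper trait), which is where the code gets messy.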
Unfortunately, I don't have the capacity right now to work on a proper PR with the feature flag handled, and I also haven't run extensive tests to verify everything, so I'm just leaving this here to let people know that a faster version is possible.
Please feel free to take that code and implement it in this repo in whatever form you see fit.
Thanks for the great implementation!