I am seeing a significant difference in runtime performance between two branches: feature/example3 and feature/numba. It only occurs for the OpenCL code. For example:
- Example 3 on the feature/numba branch runs at approximately 1450 steps/s on my local machine
- Example 3 on the feature/example3 branch runs at approximately 1200 steps/s on my local machine
I see similar drops in performance for all examples. I used git diff to compare the two branches and found nothing obvious that would account for such a large difference. It would be useful if someone else could run these examples on both branches and report whether they see the same drop.
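To make numbers from different machines comparable, here is a minimal timing sketch that anyone could adapt. It is not the project's own benchmark: `measure_steps_per_second`, `relative_drop`, and `dummy_step` are hypothetical names, and `dummy_step` is only a stand-in for one simulation step of an example. For the OpenCL code, the step being timed should block until the device has finished its work (with PyOpenCL, assuming that is the binding in use, by calling `finish()` on the command queue), otherwise only the enqueue time is measured.

```python
import time


def measure_steps_per_second(step, n_steps=1000, warmup=100):
    """Return the average number of calls to `step` completed per second.

    `step` is a hypothetical zero-argument callable that advances the
    simulation by one step and blocks until that step is complete.
    """
    # Warm-up calls keep one-off costs (kernel compilation, JIT, caches)
    # out of the timed region.
    for _ in range(warmup):
        step()

    start = time.perf_counter()
    for _ in range(n_steps):
        step()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed


def relative_drop(baseline, other):
    """Fractional slowdown of `other` relative to `baseline` (both in steps/s)."""
    return (baseline - other) / baseline


if __name__ == "__main__":
    # Placeholder workload; replace with one step of the example under test.
    def dummy_step():
        sum(range(1000))

    rate = measure_steps_per_second(dummy_step)
    print(f"{rate:.0f} steps/s")

    # The figures reported above: 1450 steps/s vs 1200 steps/s.
    print(f"drop: {relative_drop(1450.0, 1200.0):.1%}")  # drop: 17.2%
```

Running the same script on both branches, a few times each, would show whether the roughly 17% gap is reproducible or within run-to-run noise.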