Improving Trip Segmentation by reducing DB calls #956
humbleOldSage wants to merge 2 commits into e-mission:master
The changes below, which led to these performance upgrades, are investigated in e-mission/e-mission-docs#1041. They are:
1. DB calls for the transition and motion dataframes are moved upstream from the `is_tracking_restarted_in_range` and `get_ongoing_motion_in_range` functions in `restart_checking.py` to `trip_segmentation.py`. The old code made these DB calls on every iteration; the improved code makes them once.
2. All the other changes in `trip_segmentation.py` and `dwell_segmentation_dist_filter.py` are just to support the change in point 1.
3. In `dwell_segmentation_time_filter.py`, other than the changes to support point 1, there is an additional improvement: the calculations for `last10PointsDistances` and `last5MinsPoints` are vectorised. For this, `calDistance` in `common.py` now supports numpy arrays.
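To illustrate the idea in point 1, here is a minimal sketch of moving a per-iteration query upstream. It is not the code in this PR: `CountingStore`, `find_in_range`, and the two `count_restarts_*` helpers are hypothetical names, and an in-memory sorted list stands in for the real timeseries DB. It only shows that the same answer can be computed with one query instead of one query per iteration.

```python
from bisect import bisect_left, bisect_right


class CountingStore:
    """Hypothetical stand-in for the DB; counts how often it is queried."""

    def __init__(self, transition_ts):
        self._ts = sorted(transition_ts)
        self.query_count = 0

    def find_in_range(self, start_ts, end_ts):
        # Each call models one round trip to the database.
        self.query_count += 1
        lo = bisect_left(self._ts, start_ts)
        hi = bisect_right(self._ts, end_ts)
        return self._ts[lo:hi]


def count_restarts_old(store, point_ts):
    """Old pattern: query the store once for every pair of consecutive points."""
    restarts = 0
    for start_ts, end_ts in zip(point_ts, point_ts[1:]):
        if store.find_in_range(start_ts, end_ts):
            restarts += 1
    return restarts


def count_restarts_new(store, point_ts):
    """New pattern: query once for the whole range, then filter in memory."""
    transitions = store.find_in_range(point_ts[0], point_ts[-1])  # sorted list
    restarts = 0
    for start_ts, end_ts in zip(point_ts, point_ts[1:]):
        # First prefetched transition at or after start_ts, if any.
        lo = bisect_left(transitions, start_ts)
        if lo < len(transitions) and transitions[lo] <= end_ts:
            restarts += 1
    return restarts
```

With six points, the old pattern issues five queries and the new pattern issues one, while both report the same restarts.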
I don't see any indication of the performance improvement related to this change in the issue. I would anticipate the change to be minimal, since [...]. Also, it would be helpful if you linked to the specific comment in the issue that documented the improvement, and even duplicated the numbers here to make them easier to look up...
It's here: e-mission/e-mission-docs#1041 (comment).
I have now added a separate comment for this.
Yes, it's just for 10 rows. But since this part falls inside the loop, it's the overall loop time that improves, as mentioned.
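For reference, the vectorised distance calculation can be sketched roughly as below. This is an illustration under assumptions, not the actual `calDistance`: the function names, the haversine formula, and the 6371000 m earth radius are all assumed here. The point is that the last few points are compared against the current point in one numpy call instead of a Python-level loop.

```python
import numpy as np

EARTH_RADIUS_M = 6371000  # assumed mean earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres; accepts scalars or numpy arrays.

    Hypothetical helper: inputs are in degrees and are broadcast against
    each other, so an array of points can be compared with a single point.
    """
    lon1, lat1, lon2, lat2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lon1, lat1, lon2, lat2)
    )
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    # Clip guards against tiny floating point overshoot outside [0, 1].
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def distances_to_current(last_lons, last_lats, curr_lon, curr_lat):
    """Distances from each of the last N points to the current point, in one call."""
    return haversine_m(last_lons, last_lats, curr_lon, curr_lat)
```

For only 10 rows the per-call saving is small, but the call sits inside the segmentation loop, so it is repeated for every point.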
Sure.
Overall, the current AndorraWrapper runtime is ~1.6s, the iOS runtime is ~0.4s, and the CombinedWrapper runtime is ~2.1s (as mentioned here: e-mission/e-mission-docs#1041 (comment)).
@humbleOldSage there is a comment in the issue that indicates that there were some failures, but I don't see any updates on the fix: e-mission/e-mission-docs#1041 (comment). Also, what are you testing this on? Why do we have the magic number of 327? Please also separate the vectorization from the code that actually reduces DB calls, so we can evaluate each of them independently.
This reverts commit d95b2a0.
Done, in a separate PR #958. Closing this one since no updates are expected here.