Reduce Timing Measurement Scope to Bottom 80% of Tracked Functions #996

TeachMeTW · 2024-11-23T18:03:33Z

Summary

Reduces the scope of timing measurements by removing instrumentation from the bottom 80% of tracked functions, i.e. those that contribute least to overall execution time. Timing is retained only for the more significant contributors.

Changes

Functions removed from timing measurement (bottom 80%):
- TRIP_SEGMENTATION/create_dist_filter
- TRIP_SEGMENTATION/create_time_filter
- TRIP_SEGMENTATION/get_data_df
- TRIP_SEGMENTATION/get_filters_in_df
- TRIP_SEGMENTATION/get_time_range_for_segmentation
- TRIP_SEGMENTATION/get_time_series
- TRIP_SEGMENTATION/handle_out_of_order_points
- TRIP_SEGMENTATION/segment_into_trips_dist/check_transitions_post_loop
- TRIP_SEGMENTATION/segment_into_trips_dist/continue_just_ended
- TRIP_SEGMENTATION/segment_into_trips_dist/get_transition_df
- TRIP_SEGMENTATION/segment_into_trips_dist/mark_valid
- TRIP_SEGMENTATION/segment_into_trips_dist/post_loop
- TRIP_SEGMENTATION/segment_into_trips_dist/set_new_trip_start_point
Retained function timings for:
- ACCURACY_FILTERING
- CLEAN_RESAMPLING
- CREATE_COMPOSITE_OBJECTS
- CREATE_CONFIRMED_OBJECTS
- EXPECTATION_POPULATION
- JUMP_SMOOTHING
- LABEL_INFERENCE
- MODE_INFERENCE
- STORE_USER_STATS
- USER_INPUT_MATCH_INCOMING
- TRIP_SEGMENTATION/segment_into_trips_dist/get_filtered_points_df
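
For readers who want a picture of what scoping the measurements can look like, here is a minimal sketch. It is not the code changed in this PR: the `timed` helper, the `RETAINED` set, and the in-memory `readings` list are hypothetical names chosen for illustration, and only two of the retained names above are listed to keep it short.

```python
import time
from contextlib import contextmanager

# Hypothetical allowlist: a subset of the retained names listed above.
RETAINED = {
    "ACCURACY_FILTERING",
    "TRIP_SEGMENTATION/segment_into_trips_dist/get_filtered_points_df",
}

# Hypothetical in-memory sink of (name, elapsed_seconds) tuples.
readings = []


@contextmanager
def timed(name, retained=RETAINED, sink=readings):
    """Record elapsed time for `name` only if it is in the retained set."""
    if name not in retained:
        # Not tracked: run the block without measuring or storing anything.
        yield
        return
    start = time.perf_counter()
    try:
        yield
    finally:
        sink.append((name, time.perf_counter() - start))


# A removed name produces no reading; a retained name produces one.
with timed("TRIP_SEGMENTATION/get_time_series"):
    sum(range(1000))
with timed("ACCURACY_FILTERING"):
    sum(range(1000))
```

With this shape, dropping a function from measurement only means removing its name from the set, which keeps the instrumented code paths otherwise unchanged.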

Context

So far, only the dist_filter function has been triggered by the staging dataset. I'll test locally to determine whether the time_filter function can be triggered in additional scenarios.

The focus of this PR is exclusively on functions in the bottom 80% of tracked execution times. An exploration of the top 20% will follow in a subsequent PR.
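
As a rough illustration of the 80/20 split, the sketch below ranks functions by reading and separates the top 20% from the rest. It assumes the split is by rank (count of functions) rather than by cumulative time, which this PR does not specify. The `split_by_rank` helper and the `stage_*` names and values are placeholders, not measured data.

```python
def split_by_rank(readings, top_percent=20):
    """Split names into (top, bottom) by descending reading.

    The top group size is rounded up so at least one entry is kept
    whenever there is any data.
    """
    ranked = sorted(readings.items(), key=lambda kv: kv[1], reverse=True)
    # Integer ceiling of len(ranked) * top_percent / 100.
    n_top = (len(ranked) * top_percent + 99) // 100
    top = [name for name, _ in ranked[:n_top]]
    bottom = [name for name, _ in ranked[n_top:]]
    return top, bottom


# Placeholder readings, for illustration only.
sample = {
    "stage_a": 50.0,
    "stage_b": 20.0,
    "stage_c": 5.0,
    "stage_d": 1.0,
    "stage_e": 0.1,
}
top, bottom = split_by_rank(sample)
```

If the intent is instead "functions that together account for the bottom 80% of total time", the cut would need to be made on a cumulative sum, and the resulting groups could differ noticeably.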

Testing Plan

Conduct local testing to confirm:
- Functionality is unaffected by the removal of timing measurements.
- Triggers for time_filter and other functions behave as expected in local and staging environments.

Next Steps

Monitor performance in staging with reduced scope.
Prepare a follow-up PR to address timing measurements for the top 20% of tracked functions.

- Removed timing for less significant functions in TRIP_SEGMENTATION pipeline
- Focused retained measurements on key contributors
- Prepare for local testing to validate `create_time_filter` triggering
- Will explore top 20% timing optimizations in a follow-up commit

- Removed tracking for additional functions identified as low-impact during iOS and Android local testing.
- Retained tracking for key contributors to overall execution time.
- Suggested broader retesting on staging, production, or different datasets to validate changes.

TeachMeTW · 2024-11-23T18:36:15Z

Follow-up: Refine Timing Measurement for Additional Functions

Summary

Performed additional local testing with both iOS and Android users to refine timing measurement further. Identified and removed more unneeded tracking functions. Suggest retesting on staging, production, or a different dataset to validate the changes more broadly.

Changes

Functions removed from timing measurement:
- TRIP_SEGMENTATION/segment_into_trips_dist/continue_just_ended
- TRIP_SEGMENTATION/segment_into_trips_dist/get_last_trip_end_point
- TRIP_SEGMENTATION/segment_into_trips_dist/handle_trip_end
- TRIP_SEGMENTATION/segment_into_trips_time/filter_bogus_points
- TRIP_SEGMENTATION/segment_into_trips_time/get_filtered_points_pre_ts_diff_df
- TRIP_SEGMENTATION/segment_into_trips_time/get_transition_df
- TRIP_SEGMENTATION/segment_into_trips_time/post_loop
Retained function timings for:
- ACCURACY_FILTERING
- CREATE_COMPOSITE_OBJECTS
- CREATE_CONFIRMED_OBJECTS
- EXPECTATION_POPULATION
- JUMP_SMOOTHING
- LABEL_INFERENCE
- SECTION_SEGMENTATION
- STORE_USER_STATS
- TRIP_SEGMENTATION/create_places_and_trips
- TRIP_SEGMENTATION/segment_into_trips_dist/get_filtered_points_df
- TRIP_SEGMENTATION/segment_into_trips_dist/has_trip_ended
- TRIP_SEGMENTATION/segment_into_trips_dist/loop
- TRIP_SEGMENTATION/segment_into_trips_time/calculations_per_iteration
- USERCACHE
- USER_INPUT_MATCH_INCOMING

Context

Local tests with iOS and Android users identified several low-significance functions that no longer require tracking.
These changes should be retested on staging, production, or with a different dataset to confirm that they generalize across environments and data.

Next Steps

Deploy to staging or production for broader testing.
Collect additional feedback and refine tracking scope as needed.
Continue optimization for top contributing functions in subsequent iterations.

TeachMeTW · 2024-11-23T19:37:14Z

Data Name	Data Reading
TRIP_SEGMENTATION	72.293984
TRIP_SEGMENTATION/segment_into_trips	60.819502
TRIP_SEGMENTATION/segment_into_trips_time/loop	51.063456
MODE_INFERENCE	47.423461
TRIP_SEGMENTATION/segment_into_trips_time/has_trip_ended	24.049044
CLEAN_RESAMPLING	21.492011
TRIP_SEGMENTATION/segment_into_trips_time/calculations_per_iteration	16.452914
SECTION_SEGMENTATION	11.086053
CREATE_CONFIRMED_OBJECTS	8.868512
TRIP_SEGMENTATION/segment_into_trips_dist/loop	8.255366
TRIP_SEGMENTATION/create_places_and_trips	5.120428
TRIP_SEGMENTATION/segment_into_trips_dist/has_trip_ended	4.809482
JUMP_SMOOTHING	4.493899
CREATE_COMPOSITE_OBJECTS	2.728828
USER_INPUT_MATCH_INCOMING	0.750176
TRIP_SEGMENTATION/segment_into_trips_time/get_transition_df	0.713448
USERCACHE	0.635678
TRIP_SEGMENTATION/segment_into_trips_time/get_filtered_points_pre_ts_diff_df	0.444665
LABEL_INFERENCE	0.226110
TRIP_SEGMENTATION/segment_into_trips_dist/get_filtered_points_df	0.217923
EXPECTATION_POPULATION	0.081470
ACCURACY_FILTERING	0.010735
TRIP_SEGMENTATION/segment_into_trips_dist/get_last_trip_end_point	0.014056
TRIP_SEGMENTATION/segment_into_trips_dist/handle_trip_end	0.029570
STORE_USER_STATS	0.005631
TRIP_SEGMENTATION/segment_into_trips_dist/continue_just_ended	0.004914
TRIP_SEGMENTATION/segment_into_trips_time/filter_bogus_points	0.000875
TRIP_SEGMENTATION/segment_into_trips_time/post_loop	0.000045
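
The last few rows above are not strictly in descending order, so ranking the readings programmatically is safer than reading the table top to bottom. A minimal sketch, assuming the table is exported as tab-separated text with the two column headers shown above. The `load_readings` helper is hypothetical and only five rows are copied in.

```python
import csv
import io

# A few rows copied from the table above, tab-separated.
TABLE = (
    "Data Name\tData Reading\n"
    "EXPECTATION_POPULATION\t0.081470\n"
    "ACCURACY_FILTERING\t0.010735\n"
    "TRIP_SEGMENTATION/segment_into_trips_dist/get_last_trip_end_point\t0.014056\n"
    "TRIP_SEGMENTATION/segment_into_trips_dist/handle_trip_end\t0.029570\n"
    "STORE_USER_STATS\t0.005631\n"
)


def load_readings(text):
    """Parse tab-separated rows into (name, reading) tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [(row["Data Name"], float(row["Data Reading"])) for row in reader]


# Sort by reading, largest first, and print one row per line.
rows = sorted(load_readings(TABLE), key=lambda r: r[1], reverse=True)
for name, reading in rows:
    print(f"{reading:10.6f}  {name}")
```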

Insights:

Loops Dominate Time Usage:
- The loop entries are among the larger readings: segment_into_trips_time/loop (51.063456) is the third-highest entry overall, and segment_into_trips_dist/loop (8.255366) is also in the upper half of the table.
- This is expected as loops perform repeated operations, and smaller time increments compound into significant overall time.
High Time Usage Outside Loops:
- segment_into_trips_time/has_trip_ended (24.049044) is a notable contributor to time usage despite operating outside the loop.
- It may need instrumentation to analyze and optimize its operations as it takes significant time for a non-loop operation.
Opportunities for Optimization:
- Loops could be optimized further by reducing unnecessary operations or improving data structures to minimize iteration overhead.
- Instrumentation for has_trip_ended might reveal redundant calculations or inefficiencies.
Smaller Contributors:
- While smaller contributors like get_filtered_points_df and get_transition_df take less time, their cumulative impact should be reviewed in the broader context of system efficiency. Their values differ based on the dataset.
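
One way to instrument a frequently called function such as has_trip_ended without storing one reading per call is to accumulate the total time and call count, then report once at the end. The sketch below is an illustration only: `Accumulator`, `has_trip_ended_stub`, the 300-second threshold, and the gap values are all hypothetical and unrelated to the real segmentation logic.

```python
import time
from collections import defaultdict


class Accumulator:
    """Accumulate total elapsed time and call count per name."""

    def __init__(self):
        self.total = defaultdict(float)
        self.calls = defaultdict(int)

    def measure(self, name, fn, *args, **kwargs):
        """Call fn, adding its elapsed time to the running total for name."""
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            self.total[name] += time.perf_counter() - start
            self.calls[name] += 1


def has_trip_ended_stub(gap_seconds, threshold=300):
    # Stand-in for the real check: a trip "ends" after a long gap.
    return gap_seconds > threshold


acc = Accumulator()
gaps = [30, 45, 600, 20, 900]  # hypothetical gaps between points, in seconds
ended = [acc.measure("has_trip_ended", has_trip_ended_stub, g) for g in gaps]

# A single summary instead of one reading per call.
mean = acc.total["has_trip_ended"] / acc.calls["has_trip_ended"]
```

Comparing the mean per-call time against the call count helps distinguish a function that is slow per call from one that is simply called very often.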

TeachMeTW added 2 commits November 23, 2024 10:01

TeachMeTW closed this Dec 3, 2024
