Research Request - switch dask.delayed to dask.from_map
#1299
Labels
research request
Issues that serve as a request for research (summary and handoff)
tooling
Work related to the management of our tooling and shared modules
Complete the below when receiving a research request, and continue to add to this issue as you receive additional details and produce deliverables. Be sure to also add the appropriate project-level label to this issue (e.g. gtfs-rt, DLA).
Research Question
Single sentence description: Concatenating segment speeds (many segments plus geometry) over the last 2 years takes quite a while when producing yearly averages. The concatenation relies on dask.delayed, but the docs indicate there's a dask.from_map syntax (exposed as dask.dataframe.from_map) that could be more desirable to use.
Detailed description:
- Review the rt_segment_speeds/scripts/quarter_year_averages.py script and see if these steps can move to dask.from_map.
- segment_geometry is present and merged in for every single date we have, but this is not necessarily desirable; we should dedupe more efficiently.
- Review time_series_utils to see if we can switch out the concatenation step and generalize it a bit more to take any processed dataframe going into GTFS digest.
Data sources
Deliverables
Utility functions + updated scripts