edasmalchi opened this issue on Dec 17, 2024 · 2 comments
Labels: admin (Administrative work), bug (Something isn't working), data (Work related to the management of data), gtfs-rt (Work related to GTFS-Realtime), open-data (Work related to publishing, ingesting open data)
Where did the bug occur?
Select from the below, and be sure to affix the appropriate label to this issue (e.g. dataset, jupyterhub, metabase, analysis.calitp.org)
Noticed that the script calls gtfs_segments between renaming the arrival_time1 column and dropping it. We now seem to have version 2.1.7 of that package instead of 0.1.0, so I wonder if something about it has changed...
@tiffanychu90 any ideas? I don't have time to look into this quite yet, but might be able to circle back before going on vacation Friday.
python cut_stop_segments.py
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/utils.py", line 195, in raise_on_meta_error
    yield
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 6450, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/utils.py", line 729, in drop_by_shallow_copy
    df2.drop(columns=columns, inplace=True, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/frame.py", line 5399, in drop
    return super().drop(
  File "/opt/conda/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/generic.py", line 4505, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/generic.py", line 4546, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 6934, in drop
    raise KeyError(f"{list(labels[mask])} not found in axis")
KeyError: "['arrival_time1'] not found in axis"

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/jovyan/data-analyses/rt_segment_speeds/scripts/cut_stop_segments.py", line 138, in <module>
    segments = cut_stop_segments(analysis_date)
  File "/home/jovyan/data-analyses/rt_segment_speeds/scripts/cut_stop_segments.py", line 98, in cut_stop_segments
    segments = (segments.drop(
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 5181, in drop
    return self.map_partitions(drop_by_shallow_copy, columns, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 867, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 6519, in map_partitions
    meta = _get_meta_map_partitions(args, dfs, func, kwargs, meta, parent_meta)
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 6631, in _get_meta_map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 6450, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/opt/conda/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/utils.py", line 216, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `drop_by_shallow_copy`.

You have supplied a custom function and Dask is unable to
determine the type of output that that function returns.

To resolve this please provide a meta= keyword.
The docstring of the Dask function you ran should have more information.

Original error is below:
------------------------
KeyError("['arrival_time1'] not found in axis")

Traceback:
---------
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/utils.py", line 195, in raise_on_meta_error
    yield
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/core.py", line 6450, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/opt/conda/lib/python3.9/site-packages/dask/dataframe/utils.py", line 729, in drop_by_shallow_copy
    df2.drop(columns=columns, inplace=True, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/frame.py", line 5399, in drop
    return super().drop(
  File "/opt/conda/lib/python3.9/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/generic.py", line 4505, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/generic.py", line 4546, in _drop_axis
    new_axis = axis.drop(labels, errors=errors)
  File "/opt/conda/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 6934, in drop
    raise KeyError(f"{list(labels[mask])} not found in axis")
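The report suspects a version jump in the gtfs_segments package. One way to confirm which version is installed in a given environment is a small check like the sketch below. The helper name `installed_version` is hypothetical, and the distribution name `gtfs-segments` is an assumption about how the package is published; adjust it if the environment uses a different name.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "gtfs-segments" is an assumed distribution name, not confirmed by the issue.
print(installed_version("gtfs-segments"))
```

Comparing this value across environments (or against a pinned requirement) would show whether the package was upgraded between runs.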
@edasmalchi: Closing because I already caught this. You can cherry-pick this commit, or rebase on main after I merge #1325.
Yes, the package did change, and it now requires an arrival_time column (because it's wrapped more closely with other things the package is able to do).
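For illustration only, here is a minimal sketch of why the drop failed and one defensive way to write it. This is not the fix from #1325, whose contents are not shown here. It uses plain pandas with made-up data; the helper name `drop_if_present` and the sample columns are hypothetical. The traceback shows Dask forwarding an `errors` argument to pandas' `drop`, so the same keyword would presumably apply to a Dask DataFrame, but that is an assumption rather than something tested here.

```python
import pandas as pd


def drop_if_present(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Drop the given columns, silently skipping any that do not exist."""
    return df.drop(columns=columns, errors="ignore")


# Hypothetical frame that no longer carries the renamed column.
df = pd.DataFrame({"stop_id": ["a", "b"], "arrival_time": [100, 200]})

# With the default errors="raise", this would raise the KeyError from the report.
result = drop_if_present(df, ["arrival_time1"])
print(list(result.columns))
```

Ignoring missing columns keeps the script running, but it can also hide a schema change like this one, so an explicit check on the expected columns may be the safer choice in a pipeline.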
Describe the bug
Seems like the script is trying to drop a column that no longer exists.
To Reproduce
Run rt_segment_speeds pipeline
Expected behavior
rt_segment_speeds pipeline completes
Additional context