I am using altair with a pandas dataframe with numpy-backed types, and I am using streamlit to visualize it. streamlit has pyarrow as a dependency, and it turns out that data type inference using pyarrow fails for the pandas nullable boolean dtype. A small (unrealistic) example reproduces the error:
Traceback (most recent call last):
File "C:\Users\ad\AppData\Local\Programs\Python\Python311\Lib\runpy.py", line 198, in _run_module_as_main
return _run_code(code, main_globals, None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\Programs\Python\Python311\Lib\runpy.py", line 88, in _run_code
exec(code, run_globals)
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\adapter/../..\debugpy\launcher/../..\debugpy\__main__.py", line 39, in <module>
cli.main()
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\adapter/../..\debugpy\launcher/../..\debugpy/..\debugpy\server\cli.py", line 430, in main
run()
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\adapter/../..\debugpy\launcher/../..\debugpy/..\debugpy\server\cli.py", line 284, in run_file
runpy.run_path(target, run_name="__main__")
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 321, in run_path
return _run_module_code(code, init_globals, run_name,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 135, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "c:\Users\ad\.vscode\extensions\ms-python.python-2023.16.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "test.py", line 18, in <module>
chart.save(file, format="html")
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\vegalite\v5\api.py", line 1066, in save
result = save(**kwds)
^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\save.py", line 189, in save
perform_save()
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\save.py", line 127, in perform_save
spec = chart.to_dict(context={"pre_transform": False})
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\vegalite\v5\api.py", line 2695, in to_dict
return super().to_dict(
^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\vegalite\v5\api.py", line 903, in to_dict
vegalite_spec = super(TopLevelMixin, copy).to_dict( # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 965, in to_dict
result = _todict(
^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 477, in _todict
return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 477, in <dictcomp>
return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 473, in _todict
return obj.to_dict(validate=False, context=context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 965, in to_dict
result = _todict(
^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 477, in _todict
return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 477, in <dictcomp>
return {k: _todict(v, context) for k, v in obj.items() if v is not Undefined}
^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\schemapi.py", line 473, in _todict
return obj.to_dict(validate=False, context=context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\vegalite\v5\schema\channels.py", line 34, in to_dict
parsed = parse_shorthand(shorthand, data=context.get('data', None))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\core.py", line 590, in parse_shorthand
attrs["type"] = infer_vegalite_type_for_dfi_column(column)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\altair\utils\core.py", line 639, in infer_vegalite_type_for_dfi_column
kind = column.dtype[0]
^^^^^^^^^^^^
File "properties.pyx", line 36, in pandas._libs.properties.CachedProperty.__get__
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\pandas\core\interchange\column.py", line 128, in dtype
return self._dtype_from_pandasdtype(dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\ad\AppData\Local\pypoetry\Cache\virtualenvs\ero-bIEndBiR-py3.11\Lib\site-packages\pandas\core\interchange\column.py", line 147, in _dtype_from_pandasdtype
byteorder = dtype.byteorder
^^^^^^^^^^^^^^^
AttributeError: 'BooleanDtype' object has no attribute 'byteorder'
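The last frame of the traceback can be illustrated with pandas alone. This is a minimal sketch, not the original reproduction script: the variable names are made up for illustration, and it only assumes the behaviour shown in the traceback above (pandas 2.1.1), where the nullable `boolean` extension dtype has no `byteorder` attribute while a numpy dtype does.

```python
import pandas as pd

# A nullable boolean column uses pandas' BooleanDtype extension type.
flags = pd.Series([True, False, None], dtype="boolean")
nullable_dtype = flags.dtype

# A plain boolean column without missing values is backed by a numpy dtype.
plain_dtype = pd.Series([True, False]).dtype

# numpy dtypes carry a byteorder attribute; the extension dtype does not.
# That missing attribute is what the final traceback frame trips over.
has_byteorder_nullable = hasattr(nullable_dtype, "byteorder")
has_byteorder_plain = hasattr(plain_dtype, "byteorder")

print(type(nullable_dtype).__name__, has_byteorder_nullable)
print(type(plain_dtype).__name__, has_byteorder_plain)
```

This suggests the failure depends on the column's dtype rather than on whether it actually contains missing values.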
And my environment:
altair 5.1.1 Vega-Altair: A declarative statistical visualization library for Python.
astroid 2.15.8 An abstract syntax tree for Python with inference support.
flake8 6.1.0 the modular source code checker: pep8 pyflakes and co
packaging 23.1 Core utilities for Python packages
pandas 2.1.1 Powerful data structures for data analysis, time series, and statistics
pathspec 0.11.2 Utility library for gitignore style pattern matching of file paths.
pillow 9.5.0 Python Imaging Library (Fork)
platformdirs 3.10.0 A small Python package for determining appropriate platform-specific dirs, e.g. a "user data dir".
pluggy 1.3.0 plugin and hook calling mechanisms for python
protobuf 4.24.3
pyarrow 13.0.0 Python library for Apache Arrow
requests 2.31.0 Python HTTP for Humans.
rich 13.5.3 Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal
rpds-py 0.10.3 Python bindings to Rust's persistent data structures (rpds)
ruff 0.0.291 An extremely fast Python linter, written in Rust.
scipy 1.11.2 Fundamental algorithms for scientific computing in Python
six 1.16.0 Python 2 and 3 compatibility utilities
smmap 5.0.1 A pure Python implementation of a sliding window memory map manager
snakeviz 2.2.0 A web-based viewer for Python profiler output
sqlalchemy 2.0.21 Database Abstraction Library
streamlit 1.27.0 A faster way to build and share data apps
tabulate 0.9.0 Pretty-print tabular data
yamllint 1.32.0 A linter for YAML files.
zipp 3.17.0 Backport of pathlib-compatible object wrapper for zip files
Thank you for taking a look and for making such a great tool!
Thanks for the report @pavlomuts. This looks like something we'll need to report upstream to pandas and work around in Altair. I'll try to take a closer look soon.
Reported upstream in pandas-dev/pandas#55332 and worked around in #3210. We should be able to get this into the 5.1.2 release next week.
A workaround in the meantime is to specify the encoding type of the boolean column explicitly (e.g. for the default of nominal encoding use color="flag:N"):