Incompatible PyArrow version causing issues with Parquet file reading in fink_science library #865

Closed
fjammes opened this issue Jul 15, 2024 · 1 comment · Fixed by #866
fjammes commented Jul 15, 2024

Describe the issue
When trying to read a Parquet file using the fink_science library, I get an error saying that no usable Parquet engine can be found. The error message states that a suitable version of PyArrow or Fastparquet is required, and that the installed version of PyArrow (9.0.0) is older than the minimum version required by pandas (10.0.1).

Expected behaviour
I expected the fink_science library to be able to read the Parquet file without any issues, as I have PyArrow installed in my environment.

Actual behaviour
The fink_science library cannot find a usable engine for the Parquet file and raises an ImportError with the following message:

ImportError: Unable to find a usable engine; tried using: 'pyarrow', 'fastparquet'.
A suitable version of pyarrow or fastparquet is required for parquet support.
Trying to import the above resulted in these errors:
 - Pandas requires version '10.0.1' or newer of 'pyarrow' (version '9.0.0' currently installed).
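
The check behind this message can be reproduced outside pandas. The snippet below is a minimal sketch, not the check pandas itself performs: the helper names (`parse_version`, `meets_minimum`, `check_pyarrow`) are hypothetical, and the parser assumes plain dotted numeric versions such as `9.0.0`.

```python
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '10.0.1' into (10, 0, 1).

    Assumption: only numeric components; suffixes like '.dev0' are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, required: str) -> bool:
    """Return True when `installed` is at least `required`."""
    return parse_version(installed) >= parse_version(required)


def check_pyarrow(required: str = "10.0.1") -> str:
    """Describe whether the installed pyarrow satisfies `required`."""
    try:
        installed = metadata.version("pyarrow")
    except metadata.PackageNotFoundError:
        return "pyarrow is not installed"
    status = "ok" if meets_minimum(installed, required) else "too old"
    return f"pyarrow {installed} is {status} (need >= {required})"


if __name__ == "__main__":
    print(check_pyarrow())
```

Running this in the failing environment would be a quick way to confirm which PyArrow version is actually installed before launching the CI script.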

To Reproduce

1. Launch the CI script with commit 117d5f80b41decc35072e5e95c5238f5f55d24a7 of the fink-broker main (master) branch.
2. The code fails with the `ImportError` mentioned above.

**System Information:**
 - Operating System: `Linux`
 - Fink Version: `0.1.0`
 - Occurred on which branch and with what commit: `master-abc456d`

**Additional context**
_The issue seems to be related to the version of PyArrow installed in my environment. pandas requires PyArrow 10.0.1 or newer, so PyArrow needs to be upgraded before the `fink_science` library can read Parquet files._
@fjammes fjammes self-assigned this Jul 15, 2024
fjammes commented Jul 15, 2024

It seems that all requirements.txt files must be concatenated into a single pip install step. Otherwise, pip might install a dependency at a version that forces an upgrade of a previously installed dependency. We met this problem with statsmodels:
#11 119.3 INFO: pip is looking at multiple versions of statsmodels to determine which version is compatible with other requirements. This could take a while.
Thanks @JulienPeloton for your help!
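
One way to give pip all the constraints at once is to pass every requirements file to a single `pip install` call with repeated `-r` options. The sketch below only builds that command; it is an illustration, not necessarily how the fix in #866 is implemented, and the helper name `build_pip_command` and the file names are hypothetical.

```python
import sys
from typing import Iterable, List


def build_pip_command(requirement_files: Iterable[str]) -> List[str]:
    """Build one `pip install` command covering every requirements file."""
    command = [sys.executable, "-m", "pip", "install"]
    for path in requirement_files:
        # pip accepts the -r option several times in the same invocation.
        command += ["-r", path]
    return command


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["requirements.txt", "requirements-science.txt"]
    print(" ".join(build_pip_command(files)))
    # To actually install: subprocess.run(build_pip_command(files), check=True)
```

With a single invocation, the resolver sees the pins from every file together, so a later step cannot silently replace a version chosen by an earlier one.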
