Problem with date parsing in parquet files #20

Open
adrienyhuel opened this issue Feb 13, 2024 · 3 comments

Comments

@adrienyhuel

Hello,

We're trying to use your software, but we have an error with dates in parquet files.

[2024-02-13, 01:02:55 UTC] {pod_manager.py:447} INFO - [base] thread 'tokio-runtime-worker' panicked at /cargo/registry/src/index.crates.io-6f17d22bba15001f/chrono-0.4.31/src/offset/mod.rs:360:41:
[2024-02-13, 01:02:55 UTC] {pod_manager.py:447} INFO - [base] No such local time
[2024-02-13, 01:02:55 UTC] {pod_manager.py:447} INFO - [base] note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
[2024-02-13, 01:02:55 UTC] {pod_manager.py:447} INFO - [base] thread 'main' panicked at src/main.rs:164:33:
[2024-02-13, 01:02:55 UTC] {pod_manager.py:447} INFO - [base] called Result::unwrap() on an Err value: JoinError::Panic(Id(14), ...)

It looks like this is the issue described here:
apache/arrow-rs#3430

The parquet crate needs to be updated to >=30.0.1.
I tried updating to 30.0.1, but I'm new to Rust, and the build failed with errors from the clap package in the main.rs code (the newer parquet lib requires updating clap from 3.x to 4.x).

Could you provide some help?
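
For context on the second panic in the log: the worker thread panicked first ("No such local time"), and main then panicked because it called unwrap() on the join result. Below is a minimal sketch of handling a worker panic without unwrapping. It uses std::thread as a stand-in for the tokio runtime shown in the log, and `run_worker` is a hypothetical helper, not code from this project.

```rust
use std::thread;

/// Runs `work` on a separate thread and turns a panic into an `Err`
/// carrying the panic message, instead of unwrapping the join result.
/// (Hypothetical helper; std::thread stands in for tokio here.)
fn run_worker<F>(work: F) -> Result<i64, String>
where
    F: FnOnce() -> i64 + Send + 'static,
{
    match thread::spawn(work).join() {
        Ok(value) => Ok(value),
        Err(payload) => {
            // Panic payloads are usually a `&str` or a `String`.
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "unknown panic".to_string());
            Err(format!("worker panicked: {}", msg))
        }
    }
}
```

With this shape, a failing conversion in one worker surfaces as an error value that the caller can report, rather than taking down the main thread with a second panic.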

@jupiter
Owner

jupiter commented Feb 13, 2024

To upgrade parquet to >=25.0.0, we need to upgrade clap as well. As you've found, this is a breaking change, changing the API fairly substantially.

We have also recently discovered a performance regression for remote files when upgrading parquet to >=21.0.0, which will need some work to resolve.

We have scheduled the work to upgrade parquet to the latest version for some time in the next 2 weeks.

If you need a fix sooner than that, you could have a go at a PR. (If you only use local files, you wouldn't need to resolve the performance issue, and could just do a local build to use the binary.)

@jupiter
Owner

jupiter commented Apr 5, 2024

I've updated to Parquet v51. It would be great if you can test whether this solves your problem. If not, please share a parquet file I can test against.

@eldario

eldario commented Oct 31, 2024

Had a similar problem.
Dates in the format Timestamp(Millis, Some('UTC')) were not converted.

But adding chrono-tz helped.
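
For anyone unsure what a Timestamp(Millis, Some('UTC')) value holds: it is a count of milliseconds since the Unix epoch, to be displayed in UTC. Here is a rough std-only sketch of that conversion. It is not the arrow or chrono implementation, and `civil_from_days` and `format_utc_millis` are hypothetical names used only for illustration.

```rust
/// Converts a day count since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar, using the well-known days-to-civil
/// arithmetic. (Hypothetical helper for illustration.)
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Formats milliseconds since the Unix epoch as a UTC date-time string.
fn format_utc_millis(ms: i64) -> String {
    // Euclidean division keeps the time of day non-negative before 1970.
    let days = ms.div_euclid(86_400_000);
    let ms_of_day = ms.rem_euclid(86_400_000);
    let (year, month, day) = civil_from_days(days);
    let secs = ms_of_day / 1_000;
    format!(
        "{:04}-{:02}-{:02} {:02}:{:02}:{:02}.{:03}",
        year,
        month,
        day,
        secs / 3_600,
        secs % 3_600 / 60,
        secs % 60,
        ms_of_day % 1_000
    )
}
```

For example, `format_utc_millis(1_707_786_175_000)` gives "2024-02-13 01:02:55.000", which matches the timestamp on the log lines in the original report. UTC needs no offset lookup; other named zones do, which is where a timezone database crate comes in.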
