
[BUG] cudf.to_datetime with format allows date strings with days beyond month limit #14040

Closed
mroeschke opened this issue Sep 6, 2023 · 3 comments
Assignees: mroeschke
Labels: bug (Something isn't working), Python (Affects Python cuDF API)

Comments

@mroeschke (Contributor)

Describe the bug
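cudf.to_datetime with an explicit format accepts date strings whose day exceeds the number of days in the month. Instead of raising, it silently rolls the date forward into a later month.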

Steps/Code to reproduce bug

In [7]: cudf.to_datetime("2015-02-99", format="%Y-%m-%d")
Out[7]: numpy.datetime64('2015-05-10T00:00:00.000000000')

In [8]: cudf.to_datetime("2015-02-99")
ValueError: Given date string not likely a datetime.  # OK
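
The value in Out[7] is consistent with the day field being treated as an offset from the first of the month rather than being range-checked. The following is a minimal, standard-library sketch of that arithmetic; `roll_forward` is a hypothetical helper used only for illustration, and it is an assumption about the observed behavior, not a description of how cuDF parses dates internally.

```python
from datetime import date, timedelta


def roll_forward(year, month, day):
    """Hypothetical helper: interpret `day` as an offset from the 1st of the month."""
    return date(year, month, 1) + timedelta(days=day - 1)


# February 2015 has 28 days, so day 99 overflows through March and April.
rolled = roll_forward(2015, 2, 99)
print(rolled)  # 2015-05-10, the same calendar date as Out[7]
```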

Expected behavior
Ideally, the call in In [7] should raise a ValueError, as the call in In [8] does.


mroeschke added the bug and Python labels on Sep 6, 2023.
mroeschke changed the title from "[BUG] cudf.to_datetime with format allows date strings with dates beyond month limit" to "[BUG] cudf.to_datetime with format allows date strings with days beyond month limit" on Sep 6, 2023.
@mroeschke (Contributor, Author)

It looks like validation of the parsed string should occur here:

// special logic for each specifier

@davidwendt (Contributor)

Validation is not done on data in general. Call cudf::strings::is_timestamp() if validation is required.

@mroeschke (Contributor, Author)

is_timestamp is what I'm looking for, thanks!
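
The validate-then-parse pattern suggested above can be sketched on the CPU with the standard library alone. This is only an illustration of the idea, not cuDF's or libcudf's API: `is_valid_date` and `strict_parse` are hypothetical names, and `datetime.strptime` stands in for the validation step that `cudf::strings::is_timestamp()` would provide on the GPU.

```python
from datetime import datetime


def is_valid_date(text, fmt="%Y-%m-%d"):
    """Hypothetical check: True only if `text` is a real calendar date in `fmt`."""
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


def strict_parse(values, fmt="%Y-%m-%d"):
    """Hypothetical wrapper: validate every string first, then parse."""
    bad = [v for v in values if not is_valid_date(v, fmt)]
    if bad:
        raise ValueError(f"invalid date strings for format {fmt!r}: {bad}")
    return [datetime.strptime(v, fmt) for v in values]


print(is_valid_date("2015-02-28"))  # True
print(is_valid_date("2015-02-99"))  # False
```

Running the validation as a separate pass keeps the parser itself free of per-value checks, which matches the comment above that validation is not done on data in general.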

mroeschke self-assigned this on Sep 27, 2023.