-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DateutilTimePeriodAdjuster
#1233
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks! nitty nitty inline but the only really important thing is to sync up with @courtneyholcomb about the sub-daily granularity transition and whether exceptions are preferable to silent date assumptions.
if time_granularity is TimeGranularity.DAY: | ||
return date_to_adjust |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes all input has a minimum granularity of daily and silently returns whatever is passed in. The Pandas implementation raises an exception in this case, which might be preferable for testing scenarios.
I'm fine with either approach since we're going to update this within the next couple of weeks, but @courtneyholcomb is going to be doing the work on opening up granularity so maybe check in with her about what she prefers before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinged her, but preferring to fix later if there are issues to unblock merging of the stack.
elif time_granularity is TimeGranularity.WEEK: | ||
return date_to_adjust + relativedelta(weekday=dateutil.relativedelta.MO(-1)) | ||
elif time_granularity is TimeGranularity.MONTH: | ||
return date_to_adjust + relativedelta(day=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This API.... TIL day=1
and days=1
are vastly different things.
Oh well, it works, just read very carefully is all.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improved documentation and more examples of that API would have helped - had to do play around with the methods in the REPL.
elif time_granularity is TimeGranularity.WEEK: | ||
return date_to_adjust + relativedelta(weekday=dateutil.relativedelta.SU(1)) | ||
elif time_granularity is TimeGranularity.MONTH: | ||
return date_to_adjust + relativedelta(day=31) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting that docstring note on the class, I would've wondered about this.
def date_times_to_check() -> Sequence[datetime.datetime]: # noqa: D103 | ||
date_times = [] | ||
# Cover regular and leap years. | ||
start_date_time = datetime.datetime(year=2020, month=1, day=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we check the year 2000, which looks like a leap year and is? And also 1900, which looks like a leap year, but isn't?
I hate date/time operations.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both 1900 and 2000 work, but 1900 exceeds the limit of supported times, so that was left as a comment.
), f"Expansion mismatch: {pandas_adjuster_result=} {dateutil_adjuster_result=} {time_granularity=}" | ||
finished_count += 1 | ||
if finished_count % 100000 == 0 or finished_count == test_case_count: | ||
logger.info(f"Progress {finished_count / test_case_count * 100:.0f}%") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How many INFO rows does this print? If it's a lot we might want to put it in debug just so it's not filling up CI logs, but I assume it' snot that many since it's one every 100k.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<10 lines
for time_granularity in TimeGranularity | ||
for end_time in ( | ||
start_time + datetime.timedelta(days=day_offset) | ||
for day_offset in range(grain_to_count_in_year[time_granularity] + 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+2
? Why plus two? To guarantee a spillover across the year boundary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, added a comment about it.
@@ -0,0 +1,3652 @@ | |||
Date Period Grain Period Start Period End |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a leftover from an earlier version of the test. Remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
fa6d0a0
to
2f50de9
Compare
This adds an implementation for adjusting time periods using `dateutil` instead of `pandas`. Later, `DateutilTimePeriodAdjuster` will replace the `pandas` implementation.
Description
This adds an implementation for adjusting time periods using
dateutil
insteadof
pandas
.