Prevent index name collision on index reset for interpolated df #16
…ion just to skip the merge step, when there really is not significant perf gains to be achieved through this
@kuanb Thanks for taking a look at this! I see you're still pushing commits so let me know when this is ready to review.
Ready to review :)
Looks good to me 👍
Thanks again for this. As I think you figured out, we're only using the I'm curious though, how did you end up with a DataFrame at this point (i.e. line 301) with
No prob! I've got this running example file here. Running that will create the situation. I didn't flow upstream to where the
Walking through it right now to provide you with a better answer: Logged the index name to both inputs to So I walked through the function and pinpointed where it occurs (line 284):
That Output from print statements:
That's really interesting. My
Not sure off the top of my head but if it is, that would be an excellent argument for locking in dependency versions (right now they are all Also, on a related note, if dependencies are creating different results, this would be another argument in support of developing in a prespecified environment. This is what I've been doing - my Dockerfile (#12) creates the exact environment described in the Note: I am using the same version of Pandas:
I think I figured out why this is happening. The Madison data is clean (i.e. no data to interpolate), so the Thanks again! 👍
Small PR. In the associated file, `interpolatestoptimes` resets the index of the dataframe `df_for_interpolation`. If `df_for_interpolation` is indexed by one of its columns, the `reset_index` operation will error because the existing column name will conflict with the one to be inserted during the index reset.
Here's an example traceback:
By setting `drop` to `True`, we avoid this.
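For anyone who wants to reproduce the collision outside the library, here is a minimal sketch. The frame and its column names (`unique_trip_id`, `stop_sequence`) are hypothetical stand-ins, not the real contents of `df_for_interpolation`; the only assumption that matters is that the index carries the same name as an existing column.

```python
import pandas as pd

# Hypothetical stand-in for df_for_interpolation; column names are
# illustrative only and not taken from the library.
df = pd.DataFrame({
    "unique_trip_id": ["a", "a", "b"],
    "stop_sequence": [1, 2, 1],
})

# Index the frame by one of its columns while keeping that column, so the
# index name now matches an existing column name.
df = df.set_index("unique_trip_id", drop=False)

# A plain reset_index() tries to insert the index as a new column called
# "unique_trip_id", which already exists, so pandas raises ValueError.
try:
    df.reset_index()
    collided = False
except ValueError:
    collided = True

# With drop=True the old index is discarded rather than inserted, so there
# is nothing to collide with and the columns are left untouched.
fixed = df.reset_index(drop=True)
```

Note that `drop=True` throws the old index values away, which is only safe here because the same values are still available as a column.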