75 add option to sum in editing rates #96
base: main
Conversation
@tab1tha Can you please review everything down to and including tests?
@amynickolls Can we discuss utils and views?
@tab1tha Can you review this HTML please?
Will do.
@tab1tha and again please?
@amynickolls Again, I'm not quite sure of the rationale behind these changes and why the old method using combine_first() wouldn't have worked. Moreover, it looks to me like the method of comparing data with number_adjustments will always result in the old data being used?
combine_first() on paper should have worked, but when testing it wasn't prioritising 0s coming in from the new data. I don't understand your comment re: old data being used - we want to keep old data where there isn't new data coming in.
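For context, a minimal sketch of combine_first semantics on Series (illustrative data only, not the project's): non-NaN values from the caller win, and gaps are filled from the argument, so an explicit 0 in the new data would normally be kept - worth bearing in mind when debugging the behaviour described above.

```python
import math
import pandas as pd

# Illustrative data only: "new" stands in for user-edited rates,
# "old" for the previously saved rates.
new = pd.Series({"a": 0.0, "b": math.nan})
old = pd.Series({"a": 1.5, "b": 2.5, "c": 3.5})

# combine_first keeps non-NaN values from the caller ("new") and fills
# NaN gaps and missing labels from the argument ("old").
merged = new.combine_first(old)
# → a: 0.0 (the new 0 is kept), b: 2.5 (filled from old), c: 3.5 (old only)
```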
@register.filter
def convert_df_and_dynamicrateform_to_table(df, form):
I'm curious about these functions - is the end goal to create tables from the dynamic form?
Ah ok, and why is the html built here rather than using a template and passing data to fill in the template?
As discussed on our call, happy for this to be a refactor in another ticket that we can tackle later 🙂
I think both of us might have made an issue for this in different words
Nice test suite 🤓
dm_regional_app/utils.py
Outdated
# Check for Series
if "__type__" in obj and obj["__type__"] == "pd.Series":
    if obj.get("is_multiindex", False):
        index = pd.MultiIndex.from_tuples(
            obj["index"], names=obj.get("index_names")
        )
For my understanding - when do Series need to be handled?
It's only adjusted_proportions and adjusted_costs that are Series now - none will actually be multiindex now that the rates will be dataframes. We can chat through this in more detail!
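To make the shape of this concrete, here is a hedged sketch of the kind of Series round-trip being discussed. The payload keys ("__type__", "is_multiindex", "index", "index_names", "data") mirror those quoted in the diff, but the helper functions and sample data are hypothetical, not the project's code.

```python
import json
import pandas as pd

def encode_series(s: pd.Series) -> str:
    # Flatten a Series to a JSON-safe dict, tagging it so the decoder
    # knows to rebuild a Series rather than a plain dict.
    is_multi = isinstance(s.index, pd.MultiIndex)
    payload = {
        "__type__": "pd.Series",
        "is_multiindex": is_multi,
        "index": [list(t) for t in s.index] if is_multi else s.index.tolist(),
        "index_names": list(s.index.names),
        "data": s.tolist(),
    }
    return json.dumps(payload)

def decode_series(raw: str) -> pd.Series:
    obj = json.loads(raw)
    if obj.get("is_multiindex"):
        # JSON has no tuples, so the index arrives as lists of lists.
        index = pd.MultiIndex.from_tuples(
            [tuple(t) for t in obj["index"]], names=obj.get("index_names")
        )
    else:
        index = pd.Index(obj["index"], name=(obj.get("index_names") or [None])[0])
    return pd.Series(obj["data"], index=index)

s = pd.Series(
    [0.1, 0.2],
    index=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")], names=["from", "to"]),
)
round_tripped = decode_series(encode_series(s))
```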
dm_regional_app/utils.py
Outdated
if obj.get("is_multiindex", False):
    index = pd.MultiIndex.from_tuples(
        obj["index"], names=obj.get("index_names")
    )
return pd.DataFrame(obj["data"], columns=obj["columns"], index=index)
Entirely possible this is just pandas notation I don't understand 😅 but this conditional looks like: if obj is not multiindex, then it is read as MultiIndex?
I had thought this too, but as it was working as intended I left it. On further investigation, I think it was working because the check was just testing whether it could get "is_multiindex": when I changed the logic to if obj.get("is_multiindex") == True: it stopped working. No idea why it wasn't breaking on single-index Series/DataFrames.
While investigating this I noticed some weird behaviour - whenever I save to the scenario, anything I've already saved reruns through the encoder. Is this expected behaviour? E.g. I'm saving adjusted rates but have previously saved entry rates - this will encode both items? Might be worth discussing!
Hmm yeah sounds like it's worth discussing this one too!
if "__type__" in obj and obj["__type__"] == "pd.Series":
    if obj.get("is_multiindex") == True:
        index = pd.MultiIndex.from_tuples(obj["index"])
You can do if obj.get("is_multiindex"): instead of if obj.get("is_multiindex") == True:
but up to you on what you think is more readable!
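For what it's worth, the two forms only differ in how explicit they are; what actually matters is the difference between a truthiness check and a membership check. A quick illustration with plain dicts (not the project's objects):

```python
# dict.get returns None for a missing key; None and False are both falsy,
# so a bare truthiness check handles missing, False, and True uniformly.
assert not {}.get("is_multiindex")
assert not {"is_multiindex": False}.get("is_multiindex")
assert {"is_multiindex": True}.get("is_multiindex")

# By contrast, a membership check is True even when the stored value is
# False - which would wrongly take the MultiIndex branch for
# single-index data.
assert "is_multiindex" in {"is_multiindex": False}
```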
if isinstance(obj, pd.DataFrame):
    # Handle MultiIndex DataFrame
    if isinstance(obj.index, pd.MultiIndex):
        # Convert MultiIndex to list of lists
        index = [list(tup) for tup in obj.index]
        index_names = obj.index.names  # Get index names
    else:
        index = obj.index.tolist()
        index_names = obj.index.names  # Get index names
There's a more Pythonic question here around separation of concerns. As Python is not a typed language (though we introduce some typing via the typing library), type checks within functions (both here and above) make me wonder whether we are trying to do too much with one class. I'm curious for your thoughts on having two encoders and two decoders - one that handles Series and one that handles DataFrames. In my mind that would help with clarity, i.e. DataFrameAwareJSONEncoder vs SeriesAwareJSONEncoder, and could help with tests as well - more specific tests targeting individual functionality.
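As a hedged sketch of what that split might look like - the class names come from the suggestion above, but the payload shapes and sample data are hypothetical, not the project's code:

```python
import json
import pandas as pd

class SeriesAwareJSONEncoder(json.JSONEncoder):
    # Single responsibility: only knows how to flatten a Series.
    def default(self, obj):
        if isinstance(obj, pd.Series):
            return {
                "__type__": "pd.Series",
                "index": obj.index.tolist(),
                "data": obj.tolist(),
            }
        return super().default(obj)

class DataFrameAwareJSONEncoder(json.JSONEncoder):
    # Single responsibility: only knows how to flatten a DataFrame.
    def default(self, obj):
        if isinstance(obj, pd.DataFrame):
            return {
                "__type__": "pd.DataFrame",
                "index": obj.index.tolist(),
                "columns": obj.columns.tolist(),
                "data": obj.values.tolist(),
            }
        return super().default(obj)

encoded = json.dumps(pd.Series([1, 2], index=["a", "b"]), cls=SeriesAwareJSONEncoder)
```

Each encoder can then be tested in isolation against exactly the payloads it owns.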
@cyramic I would also be curious for your thoughts here.
@amynickolls I don't see this as a blocker to the code merge - just an interesting question that could shape some of our design decisions going forward!
I wouldn't mind splitting them - they handle dates as well, which I think would need to be a secondary function of each en/decoder anyway (although I can test this - it's possible this was mainly an issue in the dictionaries, but I assume it would be the same for dates in columns).
I agree that separating them would make them more maintainable and more easily testable. I'm also thinking that since they are similar structures, there may be an argument for a design pattern here, such that all encoders of this type follow a standard structure. Maybe something like a Builder or one of the Factory patterns? That might be overdoing it though, so happy to get pushback as I don't want to over-engineer this. What do you think @christina-moore?
Ah nice, I think a Factory/AbstractFactory pattern could be useful here - it doesn't feel like there is quite enough complexity/steps for the Builder, but it would be good to implement the open/closed principle for future updates
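A minimal sketch of the registry-style factory being floated here (all names are illustrative): new encoder types register themselves, so dispatch stays open for extension without editing existing code.

```python
# Registry mapping a type name to its encoder class (module-level state
# for illustration only).
_ENCODERS = {}

def register_encoder(type_name):
    # Class decorator: registers an encoder under the given type name.
    def wrap(cls):
        _ENCODERS[type_name] = cls
        return cls
    return wrap

def encoder_for(type_name):
    # Factory function: look up and instantiate the right encoder.
    try:
        return _ENCODERS[type_name]()
    except KeyError:
        raise ValueError(f"no encoder registered for {type_name!r}")

@register_encoder("pd.Series")
class SeriesEncoder:
    def encode(self, obj):
        return {"__type__": "pd.Series", "data": list(obj)}

@register_encoder("pd.DataFrame")
class DataFrameEncoder:
    def encode(self, obj):
        return {"__type__": "pd.DataFrame", "data": [list(r) for r in obj]}

payload = encoder_for("pd.Series").encode([0.1, 0.2])
```

Adding a new encoder is then one decorated class, with no changes to the dispatch code - which is the open/closed property mentioned above.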
I'm happy for this to be pulled into another PR/issue for clarity
Happy with this!
Happy with this - thanks
Happy with the change to a function - thanks!
Users can now choose between multiplying or summing rates when editing these (exit rates, entry rates, and transition rates).

DynamicRateForm:
- Takes a dataframe, from which it instantiates multiply and add fields
- Outputs a dataframe from the save function with multiply and add columns, leaving out rows where both values are empty
- Negative numbers are not allowed in multiply fields
- Only one of multiply or add can be used in each row
- Displays initial data where previous values have been saved

Also in this PR:
- Updates to the JSON encoder/decoder to handle dataframes (both multiindex and single index)
- Updates to combine_rates in Multinomial to handle the new logic: rate * rate_change or rate + rate_change depending on the column. No longer left-joins, so users can edit rates that are 0 in the original prediction.
- Updates to templatetags to show field errors in forms
- Unit tests added for DynamicRateForm
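The multiply/add logic above can be sketched roughly as follows. This is illustrative only: apply_rate_changes is a hypothetical stand-in for combine_rates, and the "multiply"/"add" column names are assumed from the description.

```python
import pandas as pd

def apply_rate_changes(rates: pd.Series, changes: pd.DataFrame) -> pd.Series:
    # Outer-style alignment (not a left join): rates missing from the
    # original prediction default to 0, so rates at 0 remain editable.
    combined = rates.reindex(rates.index.union(changes.index)).fillna(0)
    multiply = changes["multiply"].reindex(combined.index)
    add = changes["add"].reindex(combined.index)
    # Only one of multiply/add is populated per row; NaN means "no change".
    combined = combined.where(multiply.isna(), combined * multiply)
    combined = combined.where(add.isna(), combined + add)
    return combined

rates = pd.Series({"a": 0.0, "b": 2.0})  # "a" is 0 in the original prediction
changes = pd.DataFrame(
    {"multiply": [None, 3.0], "add": [0.5, None]}, index=["a", "b"]
)
adjusted = apply_rate_changes(rates, changes)
# → a: 0.0 + 0.5 = 0.5 (additive edit of a zero rate), b: 2.0 * 3.0 = 6.0
```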