Cheer Naive Comparison Notebook 2.0 #180
This notebook has various plotting functions that:
- show cumulative kg and kWh between naive and cheer
- show emissions per km across programs
- compare App 2014, Dashboard 2020, and CHEER
- show cumulative distance across transportation modes with App 2014
This could definitely use some cleanup before we merge it - the main things in my mind would be moving the file to prevent confusion with the production scripts, general cleanup of any off formatting or comments that can be removed, and organization.
I put in a few comments with specific questions, as well.
While there is cleanup to be done here, I think the paper should be first priority.
"source": [
"df_pur = pd.read_csv(r'auxiliary_files/purpose_labels.csv')\n",
"df_re = pd.read_csv(r'auxiliary_files/mode_labels.csv')\n",
"df_ei = pd.read_csv(r'auxiliary_files/energy_intensity.csv')\n",
These aren't in the repo anymore, right? Maybe include a link that will get people to a point where they do exist. I think it makes sense to not log them in, but we want people to know where to find them.
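One possible way to make the dependency explicit, as a rough sketch: check for the auxiliary files up front and fail with a message that says where to get them. The helper names (`find_missing_files`, `require_files`) and the `where_to_get` placeholder are hypothetical; only the three paths come from the notebook.

```python
from pathlib import Path

# Paths as they appear in the notebook's read_csv calls.
AUXILIARY_FILES = [
    'auxiliary_files/purpose_labels.csv',
    'auxiliary_files/mode_labels.csv',
    'auxiliary_files/energy_intensity.csv',
]


def find_missing_files(paths, base_dir='.'):
    """Return the subset of `paths` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]


def require_files(paths, base_dir='.', where_to_get='<link to be filled in>'):
    """Raise a descriptive error if any required input file is absent.

    `where_to_get` is a placeholder: replace it with the real link or
    instructions once they are decided.
    """
    missing = find_missing_files(paths, base_dir)
    if missing:
        raise FileNotFoundError(
            f"Missing input files: {missing}. See {where_to_get} for how to obtain them."
        )
```

Calling `require_files(AUXILIARY_FILES)` in the first cell would surface the problem before any plotting code runs.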
@@ -0,0 +1,2200 @@
{
I think this should be moved into a new folder, not in viz scripts, maybe make a "paper_visualizations" or "one_off_analysis" folder. There are a few (admittedly stale) branches of mine that should probably be checked into that folder too at some point.
"metadata": {},
"outputs": [],
"source": [
"# Here begins the naive calculation.\n",
Similar to the auxiliary CSV files, can you add a link to the point where this code is scraped from? I think that will make it clearer where this came from (the old public dashboard code).
" expanded_ct = CO2_impact_lb(expanded_ct, 'distance_miles')\n",
" return expanded_ct\n",
"\n",
"def get_quality_text(before_df, after_df, mode_of_interest=None, include_test_users=False):\n",
Do you use this? This is mainly for the final charts that end up in the dashboard ... if there are any functions that are unused, they should be cut.
"\n",
" # prepare quality text and summary stats\n",
" data_eb = expanded_ct.query(f\"Mode_confirm == '{mode_of_interest}'\") if \"Mode_confirm\" in expanded_ct.columns else expanded_ct\n",
" quality_text = get_quality_text(expanded_ct, data_eb, mode_of_interest)\n",
You do set it, but you never use this variable - is it just for verifying the data quality as you run the notebook?
" :param file_suffix: Suffix for file names.\n",
" \"\"\"\n",
" # calculate cumulative emissions for each mode\n",
" cumulative_emissions = expanded_ct_filtered.groupby('Mode_confirm')[[y_naive, y_cheer]].sum()\n",
The fact that you're relying on `expanded_ct_filtered` being in scope isn't necessarily wrong, but if this were production code we'd want it to be a parameter instead.
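A minimal sketch of the parameterized version, assuming the DataFrame has a `Mode_confirm` column plus the two value columns. The function name `cumulative_by_mode` and the toy column names `naive_kg` and `cheer_kg` are made up for illustration; the `groupby` line mirrors the one in the notebook.

```python
import pandas as pd


def cumulative_by_mode(trips: pd.DataFrame, y_naive: str, y_cheer: str) -> pd.DataFrame:
    """Sum the naive and CHEER columns for each confirmed mode.

    `trips` is passed in explicitly instead of being read from notebook scope,
    so the function gives the same result regardless of cell execution order.
    """
    return trips.groupby('Mode_confirm')[[y_naive, y_cheer]].sum()


# Toy data for illustration only; these values are not from the real datasets.
toy_trips = pd.DataFrame({
    'Mode_confirm': ['Bus', 'Bus', 'E-bike'],
    'naive_kg': [1.0, 2.0, 0.5],
    'cheer_kg': [0.75, 1.5, 0.25],
})
totals = cumulative_by_mode(toy_trips, 'naive_kg', 'cheer_kg')
```

In the notebook the call would then pass `expanded_ct_filtered` as the first argument, which also makes it easy to reuse the function on a differently filtered frame.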
"source": [ | ||
"#\n", | ||
"#\n", | ||
"# TEMPORARY. REMOVE\n", |
Time to remove for the final draft?
"metadata": {},
"outputs": [],
"source": [
"buses = canbikeco[canbikeco['data_user_input_mode_confirm'] == 'bus']\n",
I'm not sure this is the best place for this, since it is more scratch work/investigation than the rest of it is.
"metadata": {},
"outputs": [],
"source": [
"# Naive Squared"
If this is scraped from somewhere, please cite it with links.
" cumulative_distance.columns = ['predicted_mode_name', 'cumulative_distance_km']\n",
"\n",
" # filter out modes with absolutely zero distance and exclude Tram and Unknown\n",
" cumulative_distance_filtered = cumulative_distance[\n",
Is this section for the distance by sensed mode chart you had made? Can you label it so it is clearer what it is?
NOTE: The previous pull request was closed due to an accidental commit.
This notebook will only work if you have the particular CSV files for CanBikeCO, MassCEC, and Bull Durham from the TSDC.
Not only does this notebook compare Naive (Dashboard, 2020) with CHEER, but it also includes sections that compare against App, 2014 (sometimes referred to as naive-naive or naive-squared).