Cheer Naive Comparison Notebook 2.0 #180

Open

jpfleischer wants to merge 10 commits into e-mission:main from jpfleischer:cheernaive

Contributor

jpfleischer commented Dec 16, 2024

NOTE: The previous pull request was closed due to an accidental commit.

This notebook will only work if you have the particular CSV files for CanBikeCO, MassCEC, and Bull Durham from the TSDC.
Not only does this notebook compare Naive (Dashboard, 2020) with CHEER, but it also uses sections for comparing App, 2014 (sometimes referred to as naive-naive or naive-squared).

jpfleischer added 10 commits

October 15, 2024 21:11


          Change internal container ports and update documentation

10169aa


          Add build command to README

6d6ab25


          Add rebuild Docker clarification

4fb71cb


          Remove unnecessary dev build comment

b7d0cbf


          Add naive cheer comparison notebook

9d51271


          Add Bull and Mass datasets

d855b86


          Add latest naive CHEER comparison notebook

d8c46df

This notebook has various plotting functions that
- show cumulative kg and kWh between naive and cheer
- show emissions per km across programs
- compare App 2014, Dashboard 2020, and CHEER
- show cumulative distance across transportation modes with App 2014


          Undo port changes

94939b2


          Reverted files to match upstream/main

e82b36f


          Undo README change

3064b9e

jpfleischer mentioned this pull request

Naive vs CHEER analysis measurement #168

Open

Abby-Wheelis reviewed

Member

Abby-Wheelis left a comment

This could definitely use some cleanup before we merge it - the main things in my mind would be moving the file to prevent confusion with the production scripts, general cleanup of any off formatting or comments that can be removed, and organization.

I put in a few comments with specific questions, as well.

While there is cleanup to be done here, I think the paper should be first priority.

viz_scripts/naive_cheer_comparison.ipynb

+                 "source": [
+                  "df_pur = pd.read_csv(r'auxiliary_files/purpose_labels.csv')\n",
+                  "df_re = pd.read_csv(r'auxiliary_files/mode_labels.csv')\n",
+                  "df_ei = pd.read_csv(r'auxiliary_files/energy_intensity.csv')\n",

Member

Abby-Wheelis Dec 19, 2024

These aren't in the repo anymore, right? Maybe include a link to a point in the history where they do exist. I think it makes sense not to check them in, but we want people to know where to find them.
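One way to make that dependency explicit, as a rough sketch only: check for the files up front and fail with a message that tells the reader where to look. The helper names `missing_aux_files` and `require_aux_files` are hypothetical, the file names are simply the ones read in the snippet above, and the link text is a placeholder to fill in.

```python
from pathlib import Path

# Assumption: the notebook reads its label files from this relative folder,
# as in the pd.read_csv calls above.
AUX_DIR = Path("auxiliary_files")
REQUIRED_FILES = ["purpose_labels.csv", "mode_labels.csv", "energy_intensity.csv"]


def missing_aux_files(aux_dir=AUX_DIR, required=REQUIRED_FILES):
    """Return the required file names that are not present in aux_dir."""
    return [name for name in required if not (Path(aux_dir) / name).is_file()]


def require_aux_files(aux_dir=AUX_DIR, required=REQUIRED_FILES):
    """Raise a descriptive error if any auxiliary CSV is missing."""
    missing = missing_aux_files(aux_dir, required)
    if missing:
        raise FileNotFoundError(
            f"Missing auxiliary files in {aux_dir}: {missing}. "
            "See <link to the commit where these files still exist>."
        )
```

Calling `require_aux_files()` in the first cell would surface the problem before any `pd.read_csv` call fails with a less helpful traceback.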

viz_scripts/naive_cheer_comparison.ipynb

		@@ -0,0 +1,2200 @@
		{

Member

Abby-Wheelis Dec 19, 2024

I think this should be moved into a new folder, not in viz scripts, maybe make a "paper_visualizations" or "one_off_analysis" folder. There are a few (admittedly stale) branches of mine that should probably be checked into that folder too at some point.

viz_scripts/naive_cheer_comparison.ipynb

+                 "metadata": {},
+                 "outputs": [],
+                 "source": [
+                  "# Here begins the naive calculation.\n",

Member

Abby-Wheelis Dec 19, 2024

Similar to the auxiliary files csvs, can you add a link to the point where this code is scraped from? I think that will make it more clear where this came from (the old public dashboard code)

viz_scripts/naive_cheer_comparison.ipynb

+                  "    expanded_ct = CO2_impact_lb(expanded_ct, 'distance_miles')\n",
+                  "    return expanded_ct\n",
+                  "\n",
+                  "def get_quality_text(before_df, after_df, mode_of_interest=None, include_test_users=False):\n",

Member

Abby-Wheelis Dec 19, 2024

Do you use this? This is mainly for the final charts that end up in the dashboard ... if there are any functions that are unused, they should be cut.

viz_scripts/naive_cheer_comparison.ipynb

+                  "\n",
+                  "    # prepare quality text and summary stats\n",
+                  "    data_eb = expanded_ct.query(f\"Mode_confirm == '{mode_of_interest}'\") if \"Mode_confirm\" in expanded_ct.columns else expanded_ct\n",
+                  "    quality_text = get_quality_text(expanded_ct, data_eb, mode_of_interest)\n",

Member

Abby-Wheelis Dec 19, 2024

You do set it, but you never use this variable - is it just for verifying the data quality as you run the notebook?

viz_scripts/naive_cheer_comparison.ipynb

+                  "        :param file_suffix: Suffix for file names.\n",
+                  "        \"\"\"\n",
+                  "        # calculate cumulative emissions for each mode\n",
+                  "        cumulative_emissions = expanded_ct_filtered.groupby('Mode_confirm')[[y_naive, y_cheer]].sum()\n",

Member

Abby-Wheelis Dec 19, 2024

The fact that you're relying on expanded_ct_filtered being in scope isn't necessarily wrong, but if this were production code we'd want it to be a parameter instead.
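A minimal sketch of the parameter version, assuming pandas and the `Mode_confirm` column from the snippet above. The function name `cumulative_by_mode` and the sample columns `naive_kg` and `cheer_kg` are hypothetical, and the sample values are made up for illustration.

```python
import pandas as pd


def cumulative_by_mode(expanded_ct_filtered, y_naive, y_cheer):
    """Sum the naive and CHEER columns per confirmed mode.

    The DataFrame is passed in explicitly rather than being picked up
    from the notebook's global scope.
    """
    return expanded_ct_filtered.groupby("Mode_confirm")[[y_naive, y_cheer]].sum()


# Made-up sample data, for illustration only.
sample = pd.DataFrame(
    {
        "Mode_confirm": ["bus", "bus", "car"],
        "naive_kg": [1.0, 2.0, 5.0],
        "cheer_kg": [0.5, 1.5, 4.0],
    }
)
totals = cumulative_by_mode(sample, "naive_kg", "cheer_kg")
```

With the frame as an argument, the same function can be run on each program's data without relying on which cell last assigned `expanded_ct_filtered`.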

viz_scripts/naive_cheer_comparison.ipynb

+                 "source": [
+                  "#\n",
+                  "#\n",
+                  "# TEMPORARY. REMOVE\n",

Member

Abby-Wheelis Dec 19, 2024

Time to remove this for the final draft?

viz_scripts/naive_cheer_comparison.ipynb

+                 "metadata": {},
+                 "outputs": [],
+                 "source": [
+                  "buses = canbikeco[canbikeco['data_user_input_mode_confirm'] == 'bus']\n",

Member

Abby-Wheelis Dec 19, 2024

I'm not sure this is the best place for this, since it is more scratch work/investigation than the rest of the notebook.

viz_scripts/naive_cheer_comparison.ipynb

+                 "metadata": {},
+                 "outputs": [],
+                 "source": [
+                  "# Naive Squared"

Member

Abby-Wheelis Dec 19, 2024

If this is scraped from somewhere, please cite it with links

viz_scripts/naive_cheer_comparison.ipynb

+                  "    cumulative_distance.columns = ['predicted_mode_name', 'cumulative_distance_km']\n",
+                  "\n",
+                  "    # filter out modes with absolutely zero distance and exclude Tram and Unknown\n",
+                  "    cumulative_distance_filtered = cumulative_distance[\n",

Member

Abby-Wheelis Dec 19, 2024

Is this section for the distance by sensed mode chart you had made? Can you label it so it is clearer what it is?
