Patched pandas warnings for SettingWithCopyWarnings and FutureWarnings
Without these patches, running the processing script makes pandas emit two kinds of warnings: SettingWithCopyWarning and FutureWarning.
The SettingWithCopyWarning arose because chained indexing was performed on a few dataframes.
Since chained indexing may return either a view of the original dataframe or a copy, it is ambiguous
whether assignments to the indexed dataframe change the original or merely a copy.
The warnings are resolved by explicitly calling copy() on the indexed dataframes, so that later assignments clearly apply to the copies.
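
A minimal sketch of the copy() fix described above. The dataframe contents and the column names ("Institution", "Cost", "Subsidy") are illustrative assumptions, not the actual fields used in process_report.py.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the report dataframe; names and values are made up.
df = pd.DataFrame(
    {
        "Institution": ["Boston University", "Harvard University"],
        "Cost": [10, 20],
    }
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # .copy() makes the filtered frame an independent object, so the
    # assignment below unambiguously targets the copy, not `df`.
    bu_projects = df[df["Institution"] == "Boston University"].copy()
    bu_projects["Subsidy"] = 0

# Collect any SettingWithCopyWarning that was recorded (none is expected here).
copy_warnings = [
    w for w in caught if "SettingWithCopy" in w.category.__name__
]

print(list(df.columns))           # original frame keeps its columns
print(list(bu_projects.columns))  # the copy gains the new column
```

Dropping the .copy() call is what typically triggers the warning in pandas versions that still emit it, because pandas cannot tell whether the filtered frame is meant to stay linked to the original.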

As for the FutureWarning, it was caused by assigning strings to an empty column, whose values default to NaN (float).
Since the float dtype is incompatible with strings, pandas raises a warning. This is fixed by explicitly casting the affected column (INSTITUTION_FIELD) to str before strings are assigned to it.
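
A minimal sketch of the dtype fix, assuming a column that starts out as all-NaN floats. The column names and values are illustrative, not taken from process_report.py.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "Institution" starts empty, so pandas stores it as float64.
df = pd.DataFrame({"PI": ["alice@bu.edu", "bob@harvard.edu"]})
df["Institution"] = np.nan
dtype_before = df["Institution"].dtype

# Cast the column to a string-compatible dtype before writing strings into it,
# mirroring the astype({...: "str"}) call added in this commit.
df = df.astype({"Institution": "str"})

# Row-wise assignment of a string now matches the column's dtype.
df.at[0, "Institution"] = "Boston University"

print(dtype_before)
print(df["Institution"].dtype)
```

One thing to keep in mind: depending on the pandas version, astype("str") may turn the NaN placeholders into the literal string "nan", so code that later checks for missing institutions may need to account for that.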
QuanMPhm committed May 7, 2024
1 parent cd231b4 commit 2416397
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions process_report/process_report.py
@@ -398,6 +398,7 @@ def add_institution(dataframe: pandas.DataFrame):
The list of mappings are defined in `institute_map.json`.
"""
institute_map = load_institute_map()
+ dataframe = dataframe.astype({INSTITUTION_FIELD: "str"})
for i, row in dataframe.iterrows():
pi_name = row[PI_FIELD]
if pandas.isna(pi_name):
@@ -438,9 +439,9 @@ def get_project(row):
else:
return project_alloc[: project_alloc.rfind("-")]

- BU_projects = dataframe[dataframe[INSTITUTION_FIELD] == "Boston University"]
+ BU_projects = dataframe[dataframe[INSTITUTION_FIELD] == "Boston University"].copy()
BU_projects["Project"] = BU_projects.apply(get_project, axis=1)
- BU_projects[SUBSIDY_FIELD] = 0
+ BU_projects[SUBSIDY_FIELD] = Decimal(0)
BU_projects = BU_projects[
[
INVOICE_DATE_FIELD,
@@ -496,7 +497,7 @@ def export_lenovo(dataframe: pandas.DataFrame, output_file):
SU_HOURS_FIELD,
SU_TYPE_FIELD,
]
- ]
+ ].copy()

lenovo_df.rename(columns={SU_HOURS_FIELD: "SU Hours"}, inplace=True)
lenovo_df.insert(len(lenovo_df.columns), "SU Charge", SU_CHARGE_MULTIPLIER)