pep_sex_2024 changes made #1110

Open: kurus21 wants to merge 12 commits into base: master

@kurus21 kurus21 commented Nov 6, 2024

No description provided.

@krishnaswamypradeep
@kurus21 Can you remove the input and output folders and confirm?

Remove this file

Author

The file has been removed.

@krishnaswamypradeep krishnaswamypradeep left a comment

Thanks Kuru. Looks good.

skiprows=7,
skipfooter=102,
header=None)
df.columns = [
Contributor

@ajaits ajaits Nov 15, 2024

pls use df.rename() instead of assuming column order.
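
A minimal sketch of what the rename-based approach could look like, assuming the file is still read with `header=None` so pandas labels the columns 0, 1, 2, and so on. The frame contents and the `COLUMN_MAP` positions are hypothetical and not taken from this PR; only the target names `Count_Person_Male` and `Count_Person_Female` appear in the diff.

```python
import pandas as pd

# Hypothetical stand-in for a header-less pd.read_excel(...) result:
# with header=None, pandas labels the columns 0, 1, 2, ...
raw = pd.DataFrame([[2023, 100, 48, 52]])

# Hypothetical mapping; the real positions depend on the source file.
COLUMN_MAP = {
    0: 'Year',
    1: 'Total',
    2: 'Count_Person_Male',
    3: 'Count_Person_Female',
}

# Fail loudly if the file layout changed, instead of silently mislabeling.
missing = set(COLUMN_MAP) - set(raw.columns)
if missing:
    raise ValueError(f'Expected columns not found: {sorted(missing)}')

renamed = raw.rename(columns=COLUMN_MAP)
print(list(renamed.columns))
```

Unlike `df.columns = [...]`, the mapping names each source column explicitly, so a file with an extra or missing column raises an error rather than shifting every label by one.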

skipfooter=102,
header=None)
df.columns = [
Contributor

pls use df.rename()

'White Total', 'White Male', 'White Female', 'NonWhite Total',
'NonWhite Male', 'NonWhite Female'
]
df = df.drop(columns=[
Contributor

@ajaits ajaits Nov 15, 2024

more readable to list columns of interest to be retained:
df.drop(columns=df.columns.difference(['Count_Person_Male', 'Count_Person_Female']), inplace=True)

Then it can be moved outside the if/else block
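
To illustrate the suggestion, here is a small sketch of retaining columns by name with `Index.difference`. The sample frame and the `keep` list are assumptions for demonstration; the real list of columns to retain would come from the script.

```python
import pandas as pd

# Hypothetical frame mixing wanted and unwanted columns.
df = pd.DataFrame({
    'Year': [2023],
    'White Total': [10],
    'Count_Person_Male': [4],
    'Count_Person_Female': [6],
})

# Columns of interest; everything else is dropped.
keep = ['Count_Person_Male', 'Count_Person_Female']
df = df.drop(columns=df.columns.difference(keep))

print(list(df.columns))
```

Note that any column not listed in `keep` is dropped, including `Year` in this sample, so columns needed later would have to be added to the list. Because the statement no longer names the unwanted columns, the same line works for either input layout, which is what allows it to sit outside the if/else block.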

Comment on lines 158 to 161
# adding geoid, year and measurement method
df['Year'] = year
df.insert(0, 'geo_ID', 'country/USA', True)
df['Measurement_Method'] = 'dcAggregate/CensusPEPSurvey_PartialAggregate'
Contributor

@ajaits ajaits Nov 15, 2024

This seems common to both if and else and can be moved out.
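
One possible shape for this refactor, sketched under assumptions: `add_common_columns`, `process`, the `old_layout` flag, and the source column names `Male`, `Female`, and `POPEST_MALE` are hypothetical. Only the three shared statements and `POPEST_FEM` come from the diff.

```python
import pandas as pd


def add_common_columns(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Hypothetical helper holding the lines shared by both branches."""
    df = df.copy()
    df['Year'] = year
    df.insert(0, 'geo_ID', 'country/USA')
    df['Measurement_Method'] = 'dcAggregate/CensusPEPSurvey_PartialAggregate'
    return df


def process(df: pd.DataFrame, year: int, old_layout: bool) -> pd.DataFrame:
    """Hypothetical caller: only the layout-specific renaming stays branched."""
    if old_layout:
        df = df.rename(columns={'Male': 'Count_Person_Male',
                                'Female': 'Count_Person_Female'})
    else:
        df = df.rename(columns={'POPEST_MALE': 'Count_Person_Male',
                                'POPEST_FEM': 'Count_Person_Female'})
    # Shared post-processing runs once, after the branch.
    return add_common_columns(df, year)


result = process(pd.DataFrame({'Male': [4], 'Female': [6]}), 2023, old_layout=True)
print(result.columns.tolist())
```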

for col in float_col.columns.values:
    df[col] = df[col].astype('int64')
    df[col] = df[col].astype("str").str.replace("-1", "")
df.rename(columns={'SEX': 'Year'}, inplace=True)
Contributor

@ajaits ajaits Nov 15, 2024

Why is the column 'SEX' being renamed to 'Year' here and in the functions below?

'POPEST_FEM': 'Count_Person_Female',
'YEAR': 'Year'
})
df = df.drop(columns=[
Contributor

may be easier to do df.drop(columns=df.columns.difference([...]))

'Count_Person_Male', 'Count_Person_Female'
]
df = pd.read_excel(file_path, skiprows=5, skipfooter=7, header=None)
df.columns = column_name
Contributor

pls use df.rename()

'July2022Female',
'July2023Male',
'July2023Female',
'2023Total',
Contributor

Can we generalize this to 2024 and future years?
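
As a rough sketch of one way to avoid hard-coding the years, the per-year names could be generated from a year range. The helper `build_sex_columns` is hypothetical, and the `July{year}Male` / `July{year}Female` pattern is inferred from the names visible in the diff. The full column list, including the total columns and their ordering, is not shown here and would need the same treatment.

```python
def build_sex_columns(first_year, last_year):
    """Hypothetical helper: build July{year}Male/Female names for a year range.

    The naming pattern is an assumption inferred from the diff context;
    the ordering must match the actual layout of the source file.
    """
    names = []
    for year in range(first_year, last_year + 1):
        names.append(f'July{year}Male')
        names.append(f'July{year}Female')
    return names


print(build_sex_columns(2022, 2023))
```

With a helper like this, supporting 2024 would mean changing only the `last_year` argument (or deriving it from the file name) rather than editing the literal list.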

"sc-est2023-syasex-": _state_2023,
"sc-est2023-agesex-": _state_2023,
"cc-est2023-agesex-": _county_2023,
"cc-est2023-agesex-a": _county_2023
Contributor

@ajaits ajaits Nov 15, 2024

can we also extend to handle future years assuming the same format?

4 participants