us_bea added BEAGDPv2 statvar processing files #1099

Open: wants to merge 2 commits into base: master
31 changes: 31 additions & 0 deletions statvar_imports/us_bea/BEAGDPv2/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
# GDP by County, Metro, and Other Areas

The dataset contains real GDP (in chained dollars) by county, metropolitan area (MSA), and other areas, taken from the BEA CAGDP9 table.

Download:

Data download URL: https://apps.bea.gov/regional/zip/CAGDP9.zip
Use the CAGDP9 table: Real GDP in chained dollars by County & MSA.
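
The download step could be scripted along the following lines. This is a minimal sketch, not part of the import: it assumes the archive contains more than one file and that the all-areas CSV can be picked out by the `ALL_AREAS` substring seen in the input file name used elsewhere in this README. The helper names `download_zip` and `extract_csvs` are hypothetical.

```python
import io
import urllib.request
import zipfile

# URL taken from this README; the helper names below are illustrative only.
CAGDP9_URL = "https://apps.bea.gov/regional/zip/CAGDP9.zip"


def download_zip(url: str = CAGDP9_URL) -> bytes:
    """Fetch the archive and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def extract_csvs(zip_bytes: bytes, name_filter: str = "ALL_AREAS") -> dict[str, bytes]:
    """Return {member name: file bytes} for CSV members whose name contains name_filter."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.lower().endswith(".csv") and name_filter in name
        }
```

A call such as `extract_csvs(download_zip())` would then return the matching CSV contents keyed by file name, ready to be written to the input folder.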


Processing:
Earlier code: https://source.corp.google.com/piper///depot/google3/datacommons/mcf/bea/v2/ (2017-2022 data)
Current execution: uses the statvar processor (stat_var_processor.py).

File paths in GCS:

input file: gs://unresolved_mcf/us_bea/gdp_chained_dollar_county_msa/20241028/input_files/CAGDP9__ALL_AREAS_2017_2022.csv
pv_map: gs://unresolved_mcf/us_bea/gdp_chained_dollar_county_msa/20241028/configs/pv_map.py
place mappings: gs://unresolved_mcf/us_bea/gdp_chained_dollar_county_msa/20241028/configs/place_mapping.json
config file: gs://unresolved_mcf/us_bea/gdp_chained_dollar_county_msa/20241028/configs/config.py
output files: gs://unresolved_mcf/us_bea/gdp_chained_dollar_county_msa/20241028/output_files/


Check the source for any additional NAICS codes that need to be mapped and update pv_map.py accordingly.
Any new places must also be added to place_mapping.json with their corresponding dcids.
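
One way to spot missing mappings before a run is to compare the distinct values of a source column against the keys already mapped. The sketch below is only an illustration under assumptions: the function name `unmapped_values` is hypothetical, and the column names to check (for example a place-name column or an industry column) depend on the actual CSV header and are not confirmed by this README.

```python
import csv
import io


def unmapped_values(csv_text: str, column: str, known: set[str]) -> list[str]:
    """Return the sorted distinct values of `column` that are missing from `known`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Collect every non-empty value seen in the chosen column.
    seen = {(row.get(column) or "").strip() for row in reader}
    seen.discard("")
    return sorted(seen - known)
```

For places, `known` could be the set of top-level keys loaded from place_mapping.json (assuming its keys are the source place strings). For industries, it could be the keys of the dictionary defined in pv_map.py.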
statvar_remap was used to rename the generated dcids to match the existing ones: the existing statvar dcids have the measurement qualifier at the end, whereas the script generates dcids with it at the beginning.
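
The renaming described here amounts to moving a leading qualifier token to the end of the dcid. The sketch below illustrates that idea only. It is not the statvar_remap implementation or its file format, and the function names and the example dcid in the usage note are hypothetical placeholders, not real Data Commons statvars.

```python
def move_qualifier_to_end(dcid: str, qualifier: str) -> str:
    """Move a leading `<qualifier>_` prefix to a trailing `_<qualifier>` suffix."""
    prefix = qualifier + "_"
    if dcid.startswith(prefix):
        return dcid[len(prefix):] + "_" + qualifier
    # Leave dcids without the leading qualifier untouched.
    return dcid


def build_remap(generated_dcids: list[str], qualifier: str) -> dict[str, str]:
    """Map each generated dcid to its renamed form, skipping unchanged ones."""
    remap = {}
    for dcid in generated_dcids:
        renamed = move_qualifier_to_end(dcid, qualifier)
        if renamed != dcid:
            remap[dcid] = renamed
    return remap
```

With a placeholder dcid such as `Qual_Amount_Thing` and qualifier `Qual`, `move_qualifier_to_end` returns `Amount_Thing_Qual`. The resulting dictionary would still need to be checked by hand against the dcids that already exist.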

Execution step:

python3 ${SCRIPT_PATH}/stat_var_processor.py --pv_map=${INPUT_PATH}/pv_map.py,observationAbout:${INPUT_PATH}/place_mapping.json --config=${INPUT_PATH}/config.py --input_data=${INPUT_PATH}/CAGDP9__ALL_AREAS_2017_2022.csv --output_path=${OUTPUT_PATH}/cagdp9 2>&1 | tee gdp.log
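
The same invocation can be assembled from Python, which keeps the three paths in one place. This is a sketch under assumptions: the flags are exactly those in the command above, while `build_command` and `run_and_log` are hypothetical helpers and the directory arguments are placeholders to be supplied by the caller.

```python
import subprocess
import sys


def build_command(script_path: str, input_path: str, output_path: str) -> list[str]:
    """Build the stat_var_processor.py argument list used in this README."""
    pv_map = f"{input_path}/pv_map.py,observationAbout:{input_path}/place_mapping.json"
    return [
        "python3",
        f"{script_path}/stat_var_processor.py",
        f"--pv_map={pv_map}",
        f"--config={input_path}/config.py",
        f"--input_data={input_path}/CAGDP9__ALL_AREAS_2017_2022.csv",
        f"--output_path={output_path}/cagdp9",
    ]


def run_and_log(command: list[str], log_file: str = "gdp.log") -> int:
    """Run the command, echoing combined stdout/stderr to the console and a log file."""
    with open(log_file, "w") as log:
        process = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        for line in process.stdout:
            sys.stdout.write(line)
            log.write(line)
        return process.wait()
```

Passing the arguments as a list avoids shell quoting issues, and `run_and_log` plays the role of `2>&1 | tee gdp.log`.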

7 changes: 7 additions & 0 deletions statvar_imports/us_bea/BEAGDPv2/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{
    'header_rows': 1,
    'input_encoding': 'latin-1',
    'input_data_dialect': 'unix',
}
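
To make the three settings concrete, the sketch below parses a dictionary like the one above and applies it with Python's standard `csv` module. It is an illustration only and not how stat_var_processor.py consumes the file. It assumes `header_rows` means the number of leading header rows, and the helper name `read_rows` is hypothetical.

```python
import ast
import csv
import io

# Same content as config.py above, parsed as a Python literal.
CONFIG_TEXT = """{
    'header_rows': 1,
    'input_encoding': 'latin-1',
    'input_data_dialect': 'unix',
}"""

config = ast.literal_eval(CONFIG_TEXT)


def read_rows(raw: bytes, config: dict) -> tuple[list[list[str]], list[list[str]]]:
    """Decode raw CSV bytes and split them into (header rows, data rows)."""
    text = raw.decode(config["input_encoding"])
    reader = csv.reader(io.StringIO(text, newline=""), dialect=config["input_data_dialect"])
    rows = list(reader)
    n_header = config["header_rows"]
    return rows[:n_header], rows[n_header:]
```

Decoding with the configured encoding matters when the input contains non-ASCII characters, such as accented letters in place names, that are not stored as UTF-8.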

