diff --git a/dla/iija/README.md b/dla/iija/README.md index d480e3346..11d57b5f2 100644 --- a/dla/iija/README.md +++ b/dla/iija/README.md @@ -3,17 +3,29 @@ This folder includes the data exploration, cleaning and script for the IIJA program data that is uploaded to [RebuildingCA](https://rebuildingca.ca.gov/map/??). #### To run the script, run through the following steps: -1. Upload exported data from FMIS (CSV or Excel) to the GCS Bucket -2. Open `run_script.ipynb` and change path to relfect the uploaded exported data -3. Check that the exported file has no empty first rows -4. Run cells up to `Test & Export` -5. In the `Test & Export` section, run the function -
`_script_utils.get_clean_data()`
-to get the final cleaned data. If you want to get the data aggregated to the program level, use kwargs
_script_utils.get_clean_data(df, full_or_agg = 'agg')
-If you want the full dataframe where each row is a project phase, use kwarg -
_script_utils.get_clean_data(df, full_or_agg = 'full')
- -Note: In the aggregated data, a project can have more than one row, if the project is funded with more than one IIJA program. +0. Receive the list of FMIS IIJA-funded projects from DOTP Office of Technical Freight & Project Integration. + +1. Upload the list (CSV or Excel) to the GCS bucket `/data-analyses/dla/dla-iija/`. + +2. In a terminal, `cd` to `data-analyses/dla`, then run `make setup_env` to install requirements. Afterwards, `cd` to `iija`. + + **Note:** The conda step of the install may fail, but you should still be able to continue with the rest of these instructions. + +3. Open `run_script.ipynb` and, in the `Read in Data and function development` section, update the `my_file` path to point to the latest IIJA project list in the GCS bucket. + +4. In the `Check Data` section, run the cells to confirm the dataframe has no empty first rows. + +5. In the `Run Script` section, run the `_script_utils.get_clean_data()` function to get the final cleaned data. + + Alternatively, if you want the data aggregated to the program level, use this kwarg in the function: +
_script_utils.get_clean_data(df, full_or_agg = 'agg')
+ Or, if you want the full dataframe where each row is a project phase, use this kwarg in the function: +
_script_utils.get_clean_data(df, full_or_agg = 'full')
+ +6. In the `Export Data` section, use the current date `(MM/DD/YYYY)` to update the file name in the function. Then run the `_script_utils.export_to_gcs()` function to export the data to GCS. + The exported data can be found in the same GCS path stated in the previous steps, with the file name `FMIS_Projects_Universe_IIJA_Reporting_*.csv`. + +**Note:** In the aggregated data, a project can have more than one row if it is funded by more than one IIJA program. #### Scripts