updated README to reflect notes from most recent script run
csuyat-dot committed Nov 18, 2024
1 parent f848937 commit 7277e64
Showing 1 changed file with 23 additions and 11 deletions.
34 changes: 23 additions & 11 deletions dla/iija/README.md
@@ -3,17 +3,29 @@
This folder includes the data exploration, cleaning and script for the IIJA program data that is uploaded to [RebuildingCA](https://rebuildingca.ca.gov/map/??).

#### To run the script, run through the following steps:
1. Upload exported data from FMIS (CSV or Excel) to the GCS Bucket
2. Open `run_script.ipynb` and change the path to reflect the uploaded exported data
3. Check that the exported file has no empty first rows
4. Run cells up to `Test & Export`
5. In the `Test & Export` section, run the function
<blockquote>`_script_utils.get_clean_data()`</blockquote>
to get the final cleaned data. If you want to get the data aggregated to the program level, use kwargs <blockquote>_script_utils.get_clean_data(df, full_or_agg = 'agg')</blockquote>
If you want the full dataframe where each row is a project phase, use kwarg
<blockquote>_script_utils.get_clean_data(df, full_or_agg = 'full')</blockquote>

Note: In the aggregated data, a project can have more than one row, if the project is funded with more than one IIJA program.
0. Receive the list of FMIS IIJA-funded projects from the DOTP Office of Technical Freight & Project Integration.

1. Upload the list (CSV or Excel) to the GCS bucket `/data-analyses/dla/dla-iija/`.

2. In a terminal, `cd` to `data-analyses/dla`, then run `make setup_env` to install requirements. Afterwards, `cd` to `iija`.

**Note:** The setup may fail during the conda install step, but you should still be able to continue with the rest of these instructions.

3. Open `run_script.ipynb`. In the `Read in Data and function development` section, update the `my_file` path to point to the latest IIJA project list in the GCS bucket.

4. In the `Check Data` section, run the cells to ensure the dataframe has no empty first rows.

5. In the `Run Script` section, run the `_script_utils.get_clean_data()` function to get the final cleaned data.

Alternatively, if you want to get the data aggregated to the program level, use this kwarg in the function:
<blockquote>_script_utils.get_clean_data(df, full_or_agg = 'agg')</blockquote>
Or, if you want the full dataframe where each row is a project phase, use this kwarg in the function:
<blockquote>_script_utils.get_clean_data(df, full_or_agg = 'full')</blockquote>

6. In the `Export Data` section, update the file name in the function with the current date `(MM/DD/YYYY)`. Then run the `_script_utils.export_to_gcs()` function to export the data to GCS.
The data can be found in the same file path stated in the previous steps, with the file name `FMIS_Projects_Universe_IIJA_Reporting_*.csv`.

**Note:** In the aggregated data, a project can have more than one row if it is funded with more than one IIJA program.
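
The empty-first-rows check in step 4 and the difference between the `'full'` and `'agg'` outputs can be illustrated with a small pandas sketch. This is not the code in `_script_utils`; the column names (`project_number`, `program_code`, `phase`, `obligated_amount`), the sample values, and both helper functions are hypothetical and exist only to show the shape of the data.

```python
import pandas as pd

# Toy "full" dataframe: one row per project phase.
# Column names and values are hypothetical, not the real FMIS export schema.
full = pd.DataFrame(
    {
        "project_number": ["P001", "P001", "P001", "P002"],
        "program_code": ["Y001", "Y001", "Y230", "Y001"],
        "phase": ["PE", "CON", "CON", "CON"],
        "obligated_amount": [100.0, 400.0, 250.0, 75.0],
    }
)


def count_leading_empty_rows(df: pd.DataFrame) -> int:
    """Count the fully empty rows at the top of a dataframe (hypothetical helper)."""
    is_empty = df.isna().all(axis=1)
    # The cumulative product stays 1 only while every row so far is empty.
    return int(is_empty.astype(int).cumprod().sum())


def aggregate_to_program(df: pd.DataFrame) -> pd.DataFrame:
    """Collapse phase-level rows to one row per project and program (hypothetical helper)."""
    return df.groupby(["project_number", "program_code"], as_index=False).agg(
        obligated_amount=("obligated_amount", "sum"),
        phases=("phase", "nunique"),
    )


agg = aggregate_to_program(full)
print(count_leading_empty_rows(full))
print(agg)
```

In this toy data, project `P001` is funded by two programs, so it keeps two rows after aggregation, which is the situation the note above describes.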


#### Scripts