updated README to reflect notes from most recent script run

cal-itp · Nov 18, 2024 · 7277e64
1 parent f848937
commit 7277e64
Showing 1 changed file with 23 additions and 11 deletions.
diff --git a/dla/iija/README.md b/dla/iija/README.md
@@ -3,17 +3,29 @@
 This folder includes the data exploration, cleaning and script for the IIJA program data that is uploaded to [RebuildingCA](https://rebuildingca.ca.gov/map/??). 
 
 #### To run the script, run through the following steps:
-1. Upload exported data from FMIS (CSV or Excel) to the GCS Bucket
-2. Open `run_script.ipynb` and change path to reflect the uploaded exported data
-3. Check that the exported file has no empty first rows
-4. Run cells up to `Test & Export`
-5. In the `Test & Export` section, run the function 
-<blockquote>`_script_utils.get_clean_data()`</blockquote> 
-to get the final cleaned data. If you want to get the data aggregated to the program level, use kwargs <blockquote>_script_utils.get_clean_data(df, full_or_agg = 'agg')</blockquote>
-If you want the full dataframe where each row is a project phase, use kwarg
-<blockquote>_script_utils.get_clean_data(df, full_or_agg = 'full')</blockquote>
-
-Note: In the aggregated data, a project can have more than one row, if the project is funded with more than one IIJA program. 
+0. Receive the list of FMIS IIJA-funded projects from the DOTP Office of Technical Freight & Project Integration
+
+1. Upload list (CSV or Excel) to the GCS Bucket `/data-analyses/dla/dla-iija/`
+
+2. In a terminal, `cd` to `data-analyses/dla`, then run `make setup_env` to install requirements. Afterwards, `cd` to `iija`.
+
+    **Note:** The conda install step may fail, but you should still be able to continue with the rest of these instructions.
+
+3. Open `run_script.ipynb`. In the `Read in Data and function development` section, update the `my_file` path to point to the latest IIJA project list in the GCS bucket.
+
+4. In the `Check Data` section, run the cells to confirm that the dataframe has no empty leading rows.
+
+5. In the `Run Script` section, run the `_script_utils.get_clean_data()` function to get the final cleaned data. 
+
+    Alternatively, if you want the data aggregated to the program level, pass this kwarg to the function:
+    <blockquote>_script_utils.get_clean_data(df, full_or_agg = 'agg')</blockquote>
+    Or, if you want the full dataframe where each row is a project phase, pass this kwarg to the function:
+    <blockquote>_script_utils.get_clean_data(df, full_or_agg = 'full')</blockquote>
+
+6. In the `Export Data` section, use the current date (`MM/DD/YYYY`) to rename the file in the function. Then run the `_script_utils.export_to_gcs()` function to export the data to GCS.  
+    The exported data can be found at the same GCS path given in the previous steps, with the file name `FMIS_Projects_Universe_IIJA_Reporting_*.csv`.
+
+**Note:** In the aggregated data, a project can have more than one row if it is funded by more than one IIJA program.
 
 
 #### Scripts