Stage 0 Setting Pentaho Vars

wolfderby edited this page Sep 7, 2022 · 5 revisions

Overview of sections:

  1. LOCAL DATA HANDLING - where your files will live locally
    • REBUILD_ALL_TIME_DATA_LOCALLY=1 deletes the local data (data_from_s3) and re-downloads it via ./s3_data_download.sh
  2. TIMEFRAME - LOAD_ALL_TIME=1, LOAD_NEW_QUARTER=Q1 and LOAD_NEW_YEAR=2022 are defined here; the quarter is written as Q1 and the year as YYYY.
  3. LOCAL LOG - give your local log a name in the suggested format LOG-MONTH-YEAR-load.txt, e.g. LOG-July-2022-load.txt
  4. ORANGE BOOK - location of the Orange Book data download link
  5. AWS - s3 bucket information (running aws configure list should return the same values)
  6. DATABASE - main postgres database config vars
  7. PENTAHO LOG DATABASE - name the table that will reside in your main database's public schema
  8. COMPARISON DATABASE - if you have an older database to compare against, define it here; otherwise repeat the main database's configuration to prevent errors
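
As a rough illustration of the TIMEFRAME and LOCAL LOG sections above, the sketch below checks the values before the job is run. It assumes faers_config.config uses plain KEY=VALUE lines and that the flags take 0 or 1, neither of which this page confirms. The helpers parse_config, check_timeframe and suggested_log_name are hypothetical names and are not part of the repo; only the four variable names in the sample come from this page.

```python
import re

# Hypothetical excerpt; the KEY=VALUE layout is an assumption.
SAMPLE = """\
REBUILD_ALL_TIME_DATA_LOCALLY=1
LOAD_ALL_TIME=1
LOAD_NEW_QUARTER=Q1
LOAD_NEW_YEAR=2022
"""


def parse_config(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_timeframe(values):
    """Return a list of problems found in the flag and timeframe values."""
    problems = []
    # Assumption: flags are 0 (off) or 1 (on); the page only shows 1.
    for flag in ("REBUILD_ALL_TIME_DATA_LOCALLY", "LOAD_ALL_TIME"):
        if values.get(flag) not in ("0", "1"):
            problems.append(f"{flag} should be 0 or 1")
    if not re.fullmatch(r"Q[1-4]", values.get("LOAD_NEW_QUARTER", "")):
        problems.append("LOAD_NEW_QUARTER should be Q1, Q2, Q3 or Q4")
    if not re.fullmatch(r"\d{4}", values.get("LOAD_NEW_YEAR", "")):
        problems.append("LOAD_NEW_YEAR should be a four-digit year")
    return problems


def suggested_log_name(month, year):
    """Build a log name in the suggested LOG-MONTH-YEAR-load.txt format."""
    return f"LOG-{month}-{year}-load.txt"


config = parse_config(SAMPLE)
print(check_timeframe(config))
print(suggested_log_name("July", 2022))
```

An empty list from check_timeframe means the sample values pass these checks; it says nothing about whether the Pentaho job itself will accept them.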

For example, if your repo lives at '/path/to/your/repo', ${BASE_FILE_DIR} is '/path/to/your' (the repo's parent directory).
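
If it helps to double-check the value, the parent directory can be derived from the repo path. This is only a small sketch; base_file_dir is a hypothetical helper name, not something provided by the repo or by Pentaho.

```python
from pathlib import PurePosixPath


def base_file_dir(repo_path):
    """Return the parent directory of the repo, the value BASE_FILE_DIR expects."""
    return str(PurePosixPath(repo_path).parent)


print(base_file_dir("/path/to/your/repo"))
```

For the placeholder path used on this page, this prints /path/to/your.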

First, define BASE_FILE_DIR as a parameter in the job properties dialog window of the stage_0 job.

Change the Default value from "/path/to/repos/parent/directory" to the path of your repo's parent directory.

Run stage_0_set_pentaho_vars.kjb by clicking the run button (▶︎).

  • Note: Run this only after setting your values in faers_config.config and placing that file in your parent directory.
  • The job must succeed before you can move on.
    • If it fails on the first run, run it a second time.

Stage 1 Wiki