forked from ltscomputingllc/faersdbstats
Stage 0 Setting Pentaho Vars
wolfderby edited this page Sep 7, 2022 · 5 revisions
- LOCAL DATA HANDLING - where your files will be stored locally
- REBUILD_ALL_TIME_DATA_LOCALLY=1 deletes the local data (data_from_s3) and redownloads it via ./s3_data_download.sh
- TIMEFRAME - LOAD_ALL_TIME, LOAD_NEW_QUARTER and LOAD_NEW_YEAR are defined here, e.g. LOAD_ALL_TIME=1, LOAD_NEW_QUARTER=Q1 and LOAD_NEW_YEAR=2022 (quarter in the form Q1, year in the form YYYY)
- LOCAL LOG - Give your local log a name in the suggested format LOG-MONTH-YEAR-load.txt, e.g. LOG-July-2022-load.txt
- ORANGE BOOK - Orange Book data download link
- Navigate to https://www.fda.gov/drugs/drug-approvals-and-databases/orange-book-data-files
- Get CEM_ORANGE_BOOK_DOWNLOAD_URL value:
- Download the file and set CEM_ORANGE_BOOK_DOWNLOAD_FILENAME to the name of the downloaded file
- AWS - s3 bucket information (the aws configure list command should return the same values)
- DATABASE - main postgres database config vars
- PENTAHO LOG DATABASE - name the table that will reside in your main database's public schema
- COMPARISON DATABASE - if you have an older database to compare against, define it here, otherwise repeat main db's configuration to prevent errors
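As a rough illustration of the TIMEFRAME and LOCAL LOG entries above, the following Python sketch parses KEY=VALUE lines and checks the timeframe values before a run. It is not part of the repo. It assumes the config file uses plain KEY=VALUE lines, that LOAD_ALL_TIME accepts 0 or 1, and that quarters run Q1 to Q4; the function names and the sample text are hypothetical.

```python
import re

# Hypothetical excerpt; only keys named on this page are shown.
SAMPLE_CONFIG = """\
# TIMEFRAME
LOAD_ALL_TIME=1
LOAD_NEW_QUARTER=Q1
LOAD_NEW_YEAR=2022
# LOCAL DATA HANDLING
REBUILD_ALL_TIME_DATA_LOCALLY=1
"""


def parse_config(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def check_timeframe(values):
    """Return a list of problems with the TIMEFRAME values (empty if none)."""
    problems = []
    # Assumption: the flag is 0 or 1 (the page only shows 1).
    if values.get("LOAD_ALL_TIME", "") not in ("0", "1"):
        problems.append("LOAD_ALL_TIME should be 0 or 1")
    # Assumption: quarters are Q1 through Q4.
    if not re.fullmatch(r"Q[1-4]", values.get("LOAD_NEW_QUARTER", "")):
        problems.append("LOAD_NEW_QUARTER should look like Q1")
    if not re.fullmatch(r"\d{4}", values.get("LOAD_NEW_YEAR", "")):
        problems.append("LOAD_NEW_YEAR should be a four-digit year (YYYY)")
    return problems


def log_name(month, year):
    """Build a log file name in the suggested LOG-MONTH-YEAR-load.txt format."""
    return f"LOG-{month}-{year}-load.txt"


if __name__ == "__main__":
    config = parse_config(SAMPLE_CONFIG)
    print(check_timeframe(config))
    print(log_name("July", 2022))
```

To check a real file, pass its contents to parse_config instead of SAMPLE_CONFIG.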
First, define BASE_FILE_DIR as a parameter in the job properties dialog window of the stage_0 job: change the Default value from "/path/to/repos/parent/directory" to your repo's parent directory.
For example, if your repo is at '/path/to/your/repo', ${BASE_FILE_DIR} would be '/path/to/your' (the repo's parent directory).
- Note: Do this only after setting your values in faers_config.config and placing that file in your repo's parent directory.
- The stage_0 job must complete successfully before you can move on.
- If it fails on the first run, run it a second time.
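The relationship between the repo location, BASE_FILE_DIR and the config file can be sketched in a few lines of Python. This is only an illustration with hypothetical helper names. It assumes POSIX-style paths and that the parent directory holding faers_config.config is the same directory used for BASE_FILE_DIR.

```python
from pathlib import PurePosixPath


def base_file_dir(repo_path):
    """Return the repo's parent directory, i.e. the value for BASE_FILE_DIR."""
    return str(PurePosixPath(repo_path).parent)


def config_path(repo_path, config_name="faers_config.config"):
    """Return where the config file would sit, assuming it lives in BASE_FILE_DIR."""
    return str(PurePosixPath(base_file_dir(repo_path)) / config_name)


if __name__ == "__main__":
    repo = "/path/to/your/repo"
    print(base_file_dir(repo))
    print(config_path(repo))
```

Only the value from base_file_dir goes into the stage_0 job parameter; config_path is just a quick way to confirm the config file is where the job will look for it.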