-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor load_mongodump.sh script #150
Conversation
…ebugging information, and integration with Docker Compose configuration file for database host configuration.
…/DB_NAME. Add docker-compose.dev.yml into gitignore. Update README.md file with information about not to make changes into docker-compose.dev.yml and update about forced push if changes are made to docker-compose.dev.yml.
Test Scenario and execution: Tried testing with four different scenarios with A. Copied the cortezebikes dataset under the em-public-dashboard directory. There was pre-existing openpath_prod_cortezebikes database in the MongoDB. [Some issues]
Execution and error log:``` ge_updateable_models index: _id_ dup key: { _id: ObjectId('6562e2cd9627a7b9fa5b80f4') } 2024-09-10T16:51:27.149+0000 continuing through error: E11000 duplicate key error collection: openpath_prod_cortezebikes.Stage_updateable_models index: _id_ dup key: { _id: ObjectId('6562e2cf9627a7b9fa5b80f5') } 2024-09-10T16:51:27.149+0000 continuing through error: E11000 duplicate key error collection: openpath_prod_cortezebikes.Stage_updateable_models index: _id_ dup key: { _id: ObjectId('6562e2d19627a7b9fa5b80f6') } 2024-09-10T16:51:27.149+0000 continuing through error: E11000 duplicate key error collection: openpath_prod_cortezebikes.Stage_updateable_models index: _id_ dup key: { _id: ObjectId('65643417ac99012d039e6f41') } 2024-09-10T16:51:27.152+0000 [########################] openpath_prod_cortezebikes.Stage_updateable_models 2.45GB/2.45GB (100.0%) 2024-09-10T16:51:27.152+0000 restoring indexes for collection openpath_prod_cortezebikes.Stage_updateable_models from metadata 2024-09-10T16:51:32.848+0000 finished restoring openpath_prod_cortezebikes.Stage_updateable_models (5 documents, 542 failures) 2024-09-10T16:54:36.468+0000 finished restoring openpath_prod_cortezebikes.Stage_timeseries (1070889 documents, 2427000 failures) 2024-09-10T16:54:36.476+0000 1663213 document(s) restored successfully. 2460158 document(s) failed to restore. Database restore complete. ashrest2-35384s:em-public-dashboard ashrest2$ ``` B. Copied vail dataset under the em-public-dashboard directory. The stage database was empty. [Execution successful]
Execution log:2024-09-10T17:11:01.450+0000 [######################..] Stage_database.Stage_timeseries 2.05GB/2.16GB (94.9%) 2024-09-10T17:11:04.450+0000 [#######################.] Stage_database.Stage_timeseries 2.14GB/2.16GB (98.7%) 2024-09-10T17:11:06.504+0000 [########################] Stage_database.Stage_timeseries 2.16GB/2.16GB (100.0%) 2024-09-10T17:11:06.504+0000 restoring indexes for collection Stage_database.Stage_timeseries from metadata 2024-09-10T17:12:40.880+0000 finished restoring Stage_database.Stage_timeseries (3165925 documents, 0 failures) 2024-09-10T17:12:40.883+0000 3485167 document(s) restored successfully. 0 document(s) failed to restore. Database restore complete. ashrest2-35384s:em-public-dashboard ashrest2$ C. Attempt to reload the script with
Execution log:``` ashrest2-35384s:em-public-dashboard ashrest2$ bash viz_scripts/docker/load_mongodump.sh vail_2022-05-09.tar.gz Script Directory: viz_scripts/docker Configuration File Path: viz_scripts/docker/../../docker-compose.dev.yml MongoDump File Path: vail_2022-05-09.tar.gz Configuration file details: -rw-r--r-- 1 ashrest2 NREL_NT\Domain Users 1019 Sep 10 10:05 viz_scripts/docker/../../docker-compose.dev.yml ``` This worked fine. D. Attempt to reload the script with
Execution log:``` 2024-09-10T17:49:25.126+0000 [########################] openpath_prod_cortezebikes.Stage_updateable_models 2.45GB/2.45GB (100.0%) 2024-09-10T17:49:28.126+0000 [########################] openpath_prod_cortezebikes.Stage_updateable_models 2.45GB/2.45GB (100.0%)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a couple of quick notes! I have also tried this, it seemed to work well!
.gitignore
Outdated
@@ -132,3 +132,6 @@ dmypy.json | |||
|
|||
# Pyre type checker | |||
.pyre/ | |||
|
|||
# docker-compose-dev.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should both docker-compose files be excluded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's a good idea. I have excluded both docker-compose files to be excluded.
docker-compose.dev.yml
Outdated
@@ -25,7 +25,7 @@ services: | |||
depends_on: | |||
- db | |||
environment: | |||
- DB_HOST=db | |||
- DB_HOST=mongo://db/DB_NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we make this change in the other docker-compose file as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also needs to be mongodb
and not mongo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple more small changes!
.gitignore
Outdated
# docker-compose yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you get rid of this extra line please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I have removed the comment
docker-compose.dev.yml
Outdated
@@ -25,7 +25,7 @@ services: | |||
depends_on: | |||
- db | |||
environment: | |||
- DB_HOST=db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be helpful to make these same changes in docker-compose.yml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intent of using load_mongodump.sh
is to load the dataset into mongo db.
A couple of observations:
- We are loading the config file in
load_mongodump.sh
from docker-compose.dev.yml.
CONFIG_FILE="$SCRIPT_DIR/../../docker-compose.dev.yml"
- We would need to make some changes in
load_mongodump.sh
if we want to configure for bothdocker-compose.yml
anddocker-compose.dev.yml
. - We can anyways load the dataset with the current changes for the public dashboard db, do you think the additional changes would be a necessity? Moreover, it's implemented as in with op-admin dashboard.
Please let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thought was that the updated default with the formatting might make it easier to switch that dataset in use when editing the docker file (ie update from mongodb://db/DB_NAME
to mongodb://db/openpath_prod_open_acess
instead of from db
, but it sounds like there are more technical reasons that differ between the two files, so I'm ok with leaving the other dockerfile alone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, users don't need to rename DB_HOST=mongodb://db/DB_NAME
manually; the script edits the file directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverting back to DB_HOST=db
such that it's consistent between docker-compose.dev.yml
and docker-compose.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, I don't understand why you reverted this change. I am also not sure how it works, given that load_mongodump.sh
expects a mongodb://
URL so that it can change the DB_NAME
appropriately, I bet that this was not tested, because I don't see how it could work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have documented my testing below #150 (comment)
In the load_mongodump.sh
:
# Update the docker-compose configuration file with the actual DB_HOST
DB_HOST="mongodb://db/$DB_NAME"
sed -i.bak "s|DB_HOST:.*|DB_HOST: $DB_HOST|" "$CONFIG_FILE"
I understand, it would assign mongdb://db$DB_NAME
to the configuration file.
I have also re-done testing and updated detailed logs below. #150 (comment)
@iantei Can you mark this as ready for review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A. Copied the cortezebikes dataset under the em-public-dashboard directory. There was pre-existing openpath_prod_cortezebikes database in the MongoDB. [Some issues]
As you know, this is not supposed to happen. Have you tried to debug this? What are the other logs that you saw, particularly around DB_NAME
?
docker-compose.dev.yml
Outdated
@@ -25,7 +25,7 @@ services: | |||
depends_on: | |||
- db | |||
environment: | |||
- DB_HOST=db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, users don't need to rename DB_HOST=mongodb://db/DB_NAME
manually; the script edits the file directly.
viz_scripts/docker/load_mongodump.sh
Outdated
# Directory of the script | ||
SCRIPT_DIR="$(dirname "$0")" | ||
|
||
# Path to the configuration file (one level up) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: should be two levels up
viz_scripts/docker/load_mongodump.sh
Outdated
if [ "$#" -ne 1 ]; then | ||
echo "Usage: $0 <mongodump-file>" | ||
echo " <mongodump-file> : The path to the MongoDB dump file to be restored." | ||
echo " run git add -f <docker compose file> after using this command" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this since it contradicts the README instructions. I know it was in the original PR, but was removed in response to review feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@iantei the entire if check does not contradict the README, only the last echo
. I will fix this before merging so that we don't have to wait for yet another round of reviews.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have re-introduced the check while removing the last echo
statement.
This was my first time running the script. |
I tried running the script again, under the same condition for the database -
|
@iantei it looks like you have addressed the new review comments, can you mark it as not a draft so it is ready for review again? |
With the
Details re-testing for load_mongodump.sh script
|
Re-executed the load_mongodump.sh script after the last commit.
Looks good |
The config for docker-compose.dev.yml in public-dashboard is DB_HOST=xx than DB_HOST:xx in op-admin dashboard. |
Final testing:
Looks good |
@iantei the testing here is not just if the Having said that, I see what you mean by the sed command around In a future commit, I think it would be helpful to unify the way that the environment variables are represented in the public and admin dashboards, but we can deal with that as part of e-mission/e-mission-docs#1082 |
Please also indicate the diff between the scripts on the admin and public dashboards. I would not have caught the fact that the entire check in #150 (comment) had been removed if it is was not flagged by a comment. |
In short, 458e0f8 handles the task of assigning right DB_HOST to the docker-compose.dev.yml file Detailed testing scenario:
Execution steps:
Detailed execution of load_mongodump.sh script:
STUDY_CONFIG was manually changed to
|
Known differences between the layout and details in admin and public dashboard for docker-compose and load_mongodump script: A. Different location of docker-compose.dev.yml and load_mongodump.sh files in the repo.
B. Name of the docker-compose.dev.yml files:
C. Use of
Difference in the load_mongodump script used in admin and the public dashboards: A. Directory level difference of files:
B. As depicted above there is difference in naming convention for .yml files. C. Difference in sed command: use of
I tried to tabulate the above information, but it wasn't working as expected. Therefore, I enlisted these instead. |
I was looking for the output of Regardless, this is an improvement on the current load script, and it works, so I am merging it in the spirit of incremental improvements. I would like to see the diff while starting the next round of cleanup. |
Changes updated in load_mongodump.sh identical to e-mission/op-admin-dashboard#122
Refactor changes:
Better error handling, debugging information and integration with Docker compose configuration file for database host configuration.