…p/atlas-web-bulk into feature/exp-design-service
ke4 committed Oct 4, 2023
2 parents 17a8b4b + e03e938 commit efb6482
Showing 31 changed files with 582 additions and 516 deletions.
113 changes: 25 additions & 88 deletions README.md
Original file line number Diff line number Diff line change
@@ -4,15 +4,14 @@

### TL;DR
```bash
./docker/prepare-dev-environment/gradle-cache/run.sh -l gradle-cache.log && \
./docker/prepare-dev-environment/volumes/run.sh -l volumes.log && \
./docker/prepare-dev-environment/postgres/run.sh -l pg.log && \
./docker/prepare-dev-environment/solr/run.sh -l solr.log
./docker/prepare-dev-environment/gradle-cache/run.sh -r -l gradle-cache.log && \
./docker/prepare-dev-environment/volumes/run.sh -r -l volumes.log && \
./docker/prepare-dev-environment/postgres/run.sh -r -l pg.log && \
./docker/prepare-dev-environment/solr/run.sh -r -l solr.log
```

### Requirements
- Docker v19+
- Docker Compose v1.25+
- Docker v20+ with the [Compose plugin](https://docs.docker.com/compose/install/)
- 100 GB of available storage for the following Docker volumes:
- Experiment files
- Bioentity properties (i.e. gene annotations)
@@ -27,22 +26,22 @@ initial state. You can find the volume names used by each service in the `volume
file.

The full list of volumes is:
- `gxa-atlas-data-bioentity-properties`
- `gxa-atlas-data-gxa`
- `gxa-atlas-data-gxa-expdesign`
- `gxa-gradle-ro-dep-cache`
- `gxa-gradle-wrapper-dists`
- `gxa-pgdata`
- `gxa-solrcloud-1-data`
- `gxa-solrcloud-2-data`
- `gxa-tomcat-conf`
- `gxa-webapp-properties`
- `gxa-zk-1-data`
- `gxa-zk-1-datalog`
- `gxa-zk-2-data`
- `gxa-zk-2-datalog`
- `gxa-zk-3-data`
- `gxa-zk-3-datalog`
- `gxa_atlas-data-bioentity-properties`
- `gxa_atlas-data-gxa`
- `gxa_atlas-data-gxa-expdesign`
- `gxa_gradle-ro-dep-cache`
- `gxa_gradle-wrapper-dists`
- `gxa_pgdata`
- `gxa_solrcloud-1-data`
- `gxa_solrcloud-2-data`
- `gxa_zk-1-data`
- `gxa_zk-1-datalog`
- `gxa_zk-2-data`
- `gxa_zk-2-datalog`
- `gxa_zk-3-data`
- `gxa_zk-3-datalog`
- `gxa_tomcat-conf`
- `gxa_webapp-properties`
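
The names above appear to follow a `<project>_<volume>` pattern: `docker/dev.env` sets `PROJECT_NAME=gxa`, and the Compose files declare names such as `${PROJECT_NAME}_${GRADLE_WRAPPER_DISTS_VOL_NAME}`. A minimal sketch of that composition, where `volume_name` is a hypothetical helper for illustration and not part of the repository scripts:

```bash
#!/usr/bin/env bash
# Sketch only: mirrors the ${PROJECT_NAME}_${..._VOL_NAME} pattern used in the Compose files.
# Defaults to "gxa", the value of PROJECT_NAME in docker/dev.env.
PROJECT_NAME="${PROJECT_NAME:-gxa}"

# Hypothetical helper: prefix a short volume name with the project name.
volume_name() {
  printf '%s_%s\n' "${PROJECT_NAME}" "$1"
}

# Print the full names for a few of the short names defined in docker/dev.env
# (and for pgdata, which docker-compose-postgres.yml prefixes the same way).
for short_name in gradle-wrapper-dists gradle-ro-dep-cache atlas-data-bioentity-properties pgdata; do
  volume_name "${short_name}"
done
```

With Docker running, `docker volume ls --filter name=gxa_` should show the prefixed volumes once the preparation scripts have completed.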

### Code
Clone the repository of Bulk Expression Atlas with submodules:
@@ -59,7 +58,7 @@ If you have already cloned the project ensure it’s up-to-date:
To speed up builds and tests it is strongly encouraged to create a Docker volume to back a [Gradle read-only dependency
cache](https://docs.gradle.org/current/userguide/dependency_resolution.html#sub:ephemeral-ci-cache).
```bash
./docker/prepare-dev-environment/gradle-cache/run.sh -l gradle-cache.log
./docker/prepare-dev-environment/gradle-cache/run.sh -r -l gradle-cache.log
```

### Prepare volumes
@@ -68,7 +67,7 @@ volumes first. They will be populated with data that will be indexed in Solr and
needs all three of: file bundles in the volumes, Solr collections and Postgres data. This step takes care of the first
requirement:
```bash
./docker/prepare-dev-environment/volumes/run.sh -l volumes.log
./docker/prepare-dev-environment/volumes/run.sh -r -l volumes.log
```

You can get detailed information about which volumes are created if you run the script with the `-h` flag.
@@ -83,14 +82,14 @@ Ontology, Plant Ontology or InterPro.

To create our PostgreSQL database and run the schema migrations up to the latest version, please execute this script:
```bash
./docker/prepare-dev-environment/postgres/run.sh -l pg.log
./docker/prepare-dev-environment/postgres/run.sh -r -l pg.log
```

### Solr
To create the collections, their schemas and populate them, please run the following script.

```bash
./docker/prepare-dev-environment/solr/run.sh -l solr.log
./docker/prepare-dev-environment/solr/run.sh -r -l solr.log
```

Run the script with the `-h` flag for more details.
@@ -299,46 +298,6 @@ down
```



## Backing up your data
Eventually you’ll add new experiments to your development instance of GXA, or new, improved collections in Solr will
replace the old ones. In such cases you’ll want to get a snapshot of the data to share with the team. Below there are
instructions to do that.

### PostgreSQL
If at some point you wish to create a backup dump of the database run the command below:
```bash
docker exec -it gxa-postgres bash -c 'pg_dump -d $POSTGRES_DB -h localhost -p 5432 -U $POSTGRES_USER -f /var/backups/postgresql/pg-dump.bin -F c -n $POSTGRES_USER -t $POSTGRES_USER.* -T *flyway*'
```

### SolrCloud

> **Warning!**
>
> **_!!! THIS SECTION IS OUTDATED. NEEDS TO BE UPDATED WITH AUTHENTICATION TO WORK WITH SOLR VERSION 8._**
```bash
for SOLR_COLLECTION in $SOLR_COLLECTIONS
do
START_DATE_IN_SECS=`date +%s`
curl "http://localhost:8983/solr/${SOLR_COLLECTION}/replication?command=backup&location=/var/backups/solr&name=${SOLR_COLLECTION}"

# Pattern enclosed in (?<=) is zero-width look-behind and (?=) is zero-width look-ahead, we match everything in between
COMPLETED_DATE=`curl -s "http://localhost:8983/solr/${SOLR_COLLECTION}/replication?command=details" | grep -oP '(?<="snapshotCompletedAt",").*(?=")'`
COMPLETED_DATE_IN_SECS=`date +%s -d "${COMPLETED_DATE}"`

# We wait until snapshotCompletedAt is later than the date we took before issuing the backup operation
while [ ${COMPLETED_DATE_IN_SECS} -lt ${START_DATE_IN_SECS} ]
do
sleep 1s
COMPLETED_DATE=`curl -s "http://localhost:8983/solr/${SOLR_COLLECTION}/replication?command=details" | grep -oP '(?<="snapshotCompletedAt",").*(?=")'`
COMPLETED_DATE_IN_SECS=`date +%s -d "${COMPLETED_DATE}"`
done
done
```



## Troubleshooting

### SolrCloud nodes shut down on macOS
@@ -347,25 +306,3 @@ memory you need to increase the available amount in the Docker Dashboard. For bu
to between 8-12 GB and disk image to 100 GB or more. Please see the screenshot below for reference:

![Screenshot-2021-02-18-at-18-27-40](https://user-images.githubusercontent.com/4425744/109644570-8ccee680-7b4d-11eb-9db0-7a29fb4d9e2b.png)

### I’m not getting any suggestions in Expression Atlas
Read the important message after you run `gxa-solrcloud-bootstrap`:
> PLEASE READ!
> Suggesters haven’t been built because it’s very likely to get a `java.net.SocketTimeoutException` due
> to the size of the bioentities collection. Raising the timeout in Jetty could mask other errors down
> the line, and ignoring the exception doesn’t guarantee the suggester to be fully built since it still
> takes a few extra minutes: the exception is thrown before the process has completed.
> The best option is to manually build and supervise this step.
>
> On one terminal session run the following command (don’t worry if the request returns a 500 error):
>
> `docker exec -i gxa-solrcloud-1 curl 'http://localhost:8983/solr/bioentities-v1/suggest?suggest.build=true&suggest.dictionary=propertySuggester'`
>
> On another terminal, monitor the size of the suggester directory:
>
> `docker exec -it gxa-solrcloud-1 bash -c 'watch du -sc server/solr/bioentities-v1*/data/*'`
> `docker exec -it gxa-solrcloud-2 bash -c 'watch du -sc server/solr/bioentities-v1*/data/*'`
>
> The suggester will be built when the propertySuggester directory size stabilises.
> Run the above procedure for each of your SolrCloud containers.
Original file line number Diff line number Diff line change
@@ -16,7 +16,7 @@ repositories {
// unless we’re on the VPN, in which case the build fails
mavenLocal()
maven {
url "http://45.88.81.176/artifactory/maven-local/"
url "http://45.88.81.166/artifactory/maven-local/"
allowInsecureProtocol true
}
}
27 changes: 16 additions & 11 deletions docker/dev.env
@@ -1,18 +1,23 @@
GRADLE_WRAPPER_DISTS_VOL_NAME=gxa-gradle-wrapper-dists
GRADLE_RO_DEP_CACHE_VOL_NAME=gxa-gradle-ro-dep-cache
ATLAS_DATA_BIOENTITY_PROPERTIES_VOL_NAME=gxa-atlas-data-bioentity-properties
ATLAS_DATA_GXA_VOL_NAME=gxa-atlas-data-gxa
ATLAS_DATA_GXA_EXPDESIGN_VOL_NAME=gxa-atlas-data-gxa-expdesign
WEBAPP_PROPERTIES_VOL_NAME=gxa-webapp-properties
PROJECT_NAME=gxa

GRADLE_WRAPPER_DISTS_VOL_NAME=gradle-wrapper-dists
GRADLE_RO_DEP_CACHE_VOL_NAME=gradle-ro-dep-cache

ATLAS_DATA_BIOENTITY_PROPERTIES_VOL_NAME=atlas-data-bioentity-properties
ATLAS_DATA_EXP_VOL_NAME=atlas-data-exp

ATLAS_DATA_EXPDESIGN_VOL_NAME=atlas-data-expdesign
POSTGRES_HOST=gxa-postgres
POSTGRES_DB=gxpgxadev
POSTGRES_USER=atlasdev
POSTGRES_PASSWORD=atlasdev
SCHEMA_VERSION=latest

SOLR_CLOUD_ZK_CONTAINER_1_NAME=gxa-solrcloud-zookeeper-0
SOLR_CLOUD_ZK_CONTAINER_2_NAME=gxa-solrcloud-zookeeper-1
SOLR_CLOUD_ZK_CONTAINER_3_NAME=gxa-solrcloud-zookeeper-2
SOLR_CLOUD_CONTAINER_1_NAME=gxa-solrcloud-0
SOLR_CLOUD_CONTAINER_2_NAME=gxa-solrcloud-1
SOLR_CLOUD_ZK_CONTAINER_1_NAME=solrcloud-zookeeper-0
SOLR_CLOUD_ZK_CONTAINER_2_NAME=solrcloud-zookeeper-1
SOLR_CLOUD_ZK_CONTAINER_3_NAME=solrcloud-zookeeper-2
SOLR_CLOUD_CONTAINER_1_NAME=solrcloud-0
SOLR_CLOUD_CONTAINER_2_NAME=solrcloud-1

SOLR_USER=solr
SOLR_PASSWORD=SolrRocks
57 changes: 18 additions & 39 deletions docker/docker-compose-gradle.yml
@@ -1,63 +1,42 @@
version: "3.6"

services:
gxa-gradle:
gradle:
image: gradle:7.0-jdk11
container_name: gxa-gradle
networks:
- atlas-test-net
ports:
- "5005:5005"
working_dir: /root/project
volumes:
- ..:/root/project
- gradle-wrapper-dists:/root/.gradle/wrapper/dists
- gradle-ro-dep-cache:/gradle-ro-dep-cache:ro
- bioentity-properties:/atlas-data/bioentity_properties:ro
- gxa-data:/atlas-data/gxa:ro
- gxa-expdesign:/atlas-data/expdesign
- ./packages:/root/.m2
- exp:/atlas-data/exp:ro
- expdesign:/atlas-data/expdesign
depends_on:
- gxa-solrcloud-0
- gxa-solrcloud-1
- gxa-flyway-test
solrcloud-0:
condition: service_started
solrcloud-1:
condition: service_started
flyway:
condition: service_completed_successfully
environment:
POSTGRES_HOST: $POSTGRES_HOST
POSTGRES_DB: $POSTGRES_DB
POSTGRES_USER: $POSTGRES_USER
POSTGRES_PASSWORD: $POSTGRES_PASSWORD
GRADLE_RO_DEP_CACHE: /gradle-ro-dep-cache
command:
- sh
- -c
- >
gradle :app:clean &&
gradle
-PdataFilesLocation=/atlas-data
-PexperimentFilesLocation=/atlas-data/gxa
-PjdbcUrl=jdbc:postgresql://$POSTGRES_HOST:5432/$POSTGRES_DB
-PjdbcUsername=$POSTGRES_USER
-PjdbcPassword=$POSTGRES_PASSWORD
-PzkHost=gxa-zk-1
-PsolrHost=gxa-solrcloud-1
:app:testClasses &&
gradle -PtestResultsPath=ut :app:test --tests *Test &&
gradle -PtestResultsPath=it -PexcludeTests=**/*WIT.class :app:test --tests *IT &&
gradle -PtestResultsPath=e2e :app:test --tests *WIT &&
gradle :app:jacocoTestReport
working_dir: /root/project

volumes:
gradle-wrapper-dists:
name: ${GRADLE_WRAPPER_DISTS_VOL_NAME}
name: ${PROJECT_NAME}_${GRADLE_WRAPPER_DISTS_VOL_NAME}
gradle-ro-dep-cache:
name: ${GRADLE_RO_DEP_CACHE_VOL_NAME}
name: ${PROJECT_NAME}_${GRADLE_RO_DEP_CACHE_VOL_NAME}
bioentity-properties:
name: ${ATLAS_DATA_BIOENTITY_PROPERTIES_VOL_NAME}
gxa-data:
name: ${ATLAS_DATA_GXA_VOL_NAME}
gxa-expdesign:
name: ${ATLAS_DATA_GXA_EXPDESIGN_VOL_NAME}
name: ${PROJECT_NAME}_${ATLAS_DATA_BIOENTITY_PROPERTIES_VOL_NAME}
exp:
name: ${PROJECT_NAME}_${ATLAS_DATA_EXP_VOL_NAME}
expdesign:
name: ${PROJECT_NAME}_${ATLAS_DATA_EXPDESIGN_VOL_NAME}

networks:
atlas-test-net:
name: atlas-test-net
name: atlas-test-net
14 changes: 6 additions & 8 deletions docker/docker-compose-postgres-test.yml
@@ -1,24 +1,22 @@
version: "3.6"

services:
gxa-postgres-test:
postgres-test:
image: postgres:11-alpine
container_name: $POSTGRES_HOST
container_name: ${POSTGRES_HOST}
networks:
- atlas-test-net
restart: always
command: -c max_wal_size=1GB
ports:
- "5432:5432"
environment:
POSTGRES_HOST: $POSTGRES_HOST
POSTGRES_USER: $POSTGRES_USER
POSTGRES_PASSWORD: $POSTGRES_PASSWORD
POSTGRES_DB: $POSTGRES_DB
- POSTGRES_PASSWORD
- POSTGRES_USER
- POSTGRES_DB

gxa-flyway-test:
flyway:
image: flyway/flyway
container_name: gxa-flyway-test
networks:
- atlas-test-net
command: [
21 changes: 10 additions & 11 deletions docker/docker-compose-postgres.yml
@@ -1,24 +1,23 @@
version: "3.6"

services:
gxa-postgres:
postgres:
container_name: ${POSTGRES_HOST}
image: postgres:11-alpine
environment:
POSTGRES_DB: ${POSTGRES_DB}
POSTGRES_USER: ${POSTGRES_USER}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
networks:
- atlas-test-net
restart: always
command: -c max_wal_size=2GB
ports:
- "5432:5432"
volumes:
- gxa-pgdata:/var/lib/postgresql/data
- pgdata:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD
- POSTGRES_USER
- POSTGRES_DB

gxa-flyway:
container_name: gxa-flyway
flyway:
image: flyway/flyway
networks:
- atlas-test-net
@@ -33,11 +32,11 @@ services:
volumes:
- ../schemas/flyway/gxa/:/flyway/sql
depends_on:
- gxa-postgres
- postgres

volumes:
gxa-pgdata:
name: gxa-pgdata
pgdata:
name: ${PROJECT_NAME:?err}_pgdata

networks:
atlas-test-net: