Running Single Cell Expression Atlas
- Java 11 (OpenJDK)
- Java 8
- Gradle 5.x
- Tomcat 8 (or any other Java EE 7 web server)
- Solr 7.1 + ZooKeeper 3.4.10
- PostgreSQL 10 (via Docker)
The application is split into two modules:
- Atlas Web Core: business logic shared by both (bulk) Expression Atlas and Single Cell Expression Atlas
- Atlas Web Single Cell: logic specific to single cell experiments and the web layer
There are other helper repositories configured as Git submodules: different Gradle profiles we share across projects (i.e. development, testing and production), some front-end packages shared with bulk Expression Atlas that need tweaks in each project, and relational SQL schemas to create the in-memory H2 test database. More details can be found in the .gitmodules file.
Create an atlas directory and clone both repos:
mkdir atlas
cd atlas
git clone --recurse-submodules https://github.com/ebi-gene-expression-group/atlas-web-core.git
git clone --recurse-submodules https://github.com/ebi-gene-expression-group/atlas-web-single-cell.git
IMPORTANT: atlas-web-single-cell requires atlas-web-core to be cloned at the path specified in settings.gradle.
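You can confirm where the build expects its sibling project with a quick grep (the exact property name inside settings.gradle may differ; this just locates the reference):

```shell
# Show the line(s) in settings.gradle that point at atlas-web-core,
# so you can verify the relative path matches your checkout layout.
grep -n "atlas-web-core" atlas-web-single-cell/settings.gradle
```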
In order to ensure the sanity of the stack, it’s a good idea to run the unit tests (by convention we append the suffix Test to our unit tests and IT to our integration tests). We’ll need Java 11:
cd atlas-web-core
./gradlew -PtestResultsPath=ut test --tests *Test
cd atlas-web-single-cell
./gradlew -PtestResultsPath=ut test --tests *Test
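Following the same naming convention, the integration tests can be run with an *IT filter once the data files, PostgreSQL and Solr described below are available (the testResultsPath value here is an assumption, mirroring the unit-test command):

```shell
# Integration tests (suffix IT) need the backing services from the
# following sections to be up before they will pass.
cd atlas-web-single-cell
./gradlew -PtestResultsPath=it test --tests *IT
```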
You should see the following in the last lines of Gradle’s output in either case:
...
BUILD SUCCESSFUL in 28s
5 actionable tasks: 5 executed
<-------------> 0% WAITING
In order to run the application we’ll need to prepare the data files, PostgreSQL and Solr.
The web application requires some files that live outside the classpath at startup (e.g. properties of supported species) and others over the lifetime of the application (e.g. experiment files). All the critical paths are defined in uk.ac.ebi.atlas.configuration.BasePathsConfig and uk.ac.ebi.atlas.SingleCellFilePathConfig (in atlas-web-core and atlas-web-single-cell, respectively).
The data files needed for the application to run are expected to be located at $HOME/ATLAS3.TEST/integration-test-data, as specified in Gradle’s development environment file, profile-dev.gradle. This can be changed if it doesn’t suit your setup. A test data bundle can be downloaded from http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/test/integration-test-data/. We recommend lftp for this, as it’s got a mirror command:
lftp ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/test/integration-test-data/ -e 'mirror . $HOME/ATLAS3.TEST/integration-test-data'
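If lftp isn’t available, plain wget can mirror the same directory. The flags below are a reasonable sketch, not taken from the project docs; adjust --cut-dirs if files nest unexpectedly:

```shell
# Recursively download the test data bundle. --no-host-directories and
# --cut-dirs=7 strip the leading URL path components so the contents of
# integration-test-data land directly in the target directory.
wget --recursive --no-parent --no-host-directories --cut-dirs=7 \
  --directory-prefix="$HOME/ATLAS3.TEST/integration-test-data" \
  http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/test/integration-test-data/
```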
Some of the downloaded data won’t be used by Single Cell Expression Atlas, since we keep data for both bulk and single cell experiments together. At the moment there are no plans to split them.
Download the pre-loaded PG data archive from http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/test/pgdata-scxa.tgz. Uncompress it to a location of your choice and create a Docker container mounting the directory on /var/lib/postgresql/data/pgdata:
docker run --name scxa-pg10 -e POSTGRES_USER=atlas3dev -e POSTGRES_PASSWORD=atlas3dev -e POSTGRES_DB=gxpatlasloc -d -p 5432:5432 -e PGDATA=/var/lib/postgresql/data/pgdata -v <the-dir-where-you-extracted-pg-archive>:/var/lib/postgresql/data/pgdata postgres:10.4-alpine
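A quick way to confirm the container came up and the pre-loaded data is visible, using the credentials from the docker run command above:

```shell
# List the tables in the pre-loaded database; this should print the
# Atlas schema tables if the data directory was mounted correctly.
docker exec -it scxa-pg10 psql -U atlas3dev -d gxpatlasloc -c '\dt'
```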
WARNING: In order to reduce the load time of datasets into the DB, we use some Postgres-specific settings such as table partitioning and disabled WALs. A forceful or regular termination of the Docker process may cause data integrity issues and full tables to be cleared when you restart your container. To avoid this, remember to always stop the Postgres process manually before stopping Docker:
docker exec -it scxa-pg10 bash -c "su -c 'pg_ctl stop -m fast' postgres"
docker stop scxa-pg10
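The two shutdown commands can be wrapped in a small helper so the safe order is never forgotten. This script is our own suggestion, not part of the repository:

```shell
#!/usr/bin/env bash
# stop-scxa-pg.sh — stop Postgres cleanly inside the container before
# stopping the container itself, to avoid the data-integrity issues
# described in the warning above. Container name defaults to scxa-pg10.
set -euo pipefail
CONTAINER="${1:-scxa-pg10}"
docker exec "$CONTAINER" bash -c "su -c 'pg_ctl stop -m fast' postgres"
docker stop "$CONTAINER"
```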
IMPORTANT: The database connection credentials must match the properties in profile-dev.gradle, which are used to filter the jdbc.properties file. If you can’t expose port 5432 on your host machine, modify uk.ac.ebi.atlas.configuration.JdbcConfig.
In this section we’ll create a SolrCloud instance composed of a one-node ZooKeeper ensemble and two Solr nodes. ZooKeeper listens on port 2181 by default, and the Solr nodes will listen on ports 8983 and 8984.
Download Solr 7.1.0 and ZooKeeper 3.4.10. Both require Java 8, so for this section point JAVA_HOME at a suitable JDK/JRE and have its bin directory in your path. Lastly, download the prepopulated Single Cell Expression Atlas SolrCloud collections from http://ftp.ebi.ac.uk/pub/databases/microarray/data/atlas/test/solrcloud.tgz.
Create a solr directory anywhere on your filesystem and extract the Solr and ZooKeeper binaries, and the prepopulated SolrCloud collections. We want the directory structure below:
├── solr-7.1.0 # Solr 7.1.0 binary
├── zookeeper-3.4.10 # ZooKeeper 3.4.10 binary
└── solrcloud # Contents of solrcloud.tgz
├── node1
├── node2
└── zk
The only change needed to correctly run ZooKeeper is to edit the dataDir property in line 11 of solrcloud/zk/zoo.cfg, so that it contains the absolute path of solrcloud/zk/data. Save the changes and run ZooKeeper:
ZOO_LOG_DIR=./solrcloud/zk/log ./zookeeper-3.4.10/bin/zkServer.sh start ./solrcloud/zk/zoo.cfg
If everything went well, you should see the following:
ZooKeeper JMX enabled by default
Using config: ./solrcloud/zk/zoo.cfg
Starting zookeeper ... STARTED
To further check that ZooKeeper is running properly, run the following:
echo "stat" | nc localhost 2181
And you should see something like this:
Zookeeper version: 3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
Clients:
/0:0:0:0:0:0:0:1:43772[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/0
Received: 1
Sent: 0
Connections: 1
Outstanding: 0
Zxid: 0x4c0ed
Mode: standalone
Node count: 465
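ZooKeeper also answers the ruok four-letter command, which makes for a quicker yes/no health check than parsing the stat output:

```shell
# A healthy ZooKeeper server replies "imok" to the ruok probe.
echo "ruok" | nc localhost 2181
```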
Now it’s time to start the two Solr nodes:
SOLR_LOGS_DIR=./solrcloud/node1/log ./solr-7.1.0/bin/solr start -c -s ./solrcloud/node1 -p 8983 -m 2g -z localhost:2181 -Denable.runtime.lib=true
SOLR_LOGS_DIR=./solrcloud/node2/log ./solr-7.1.0/bin/solr start -c -s ./solrcloud/node2 -p 8984 -m 2g -z localhost:2181 -Denable.runtime.lib=true
A successful start of the Solr processes above will display a message like:
Waiting up to 180 seconds to see Solr running on port 8983 [|]
Started Solr server on port 8983 (pid=300471). Happy searching!
If everything went well you should be able to open Solr’s admin web UI at localhost:8983. Click on the Collections dropdown on the left and check that they’re all there.
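The same check can be done from the command line with Solr’s Collections API (the collection names returned depend on the solrcloud.tgz snapshot):

```shell
# LIST returns the names of all collections known to the SolrCloud cluster.
curl 'http://localhost:8983/solr/admin/collections?action=LIST'
```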
Generate the WAR of the application:
cd atlas-web-single-cell
./gradlew war
You will find it in atlas-web-single-cell/build/libs/sc.war. For both development and production we use Tomcat 8, but in principle any javax.servlet-implementing web server will do; if you use a different web server than Tomcat 8 we’d be very interested in knowing your outcome. Additionally, many IDEs can automate this step, but such workflows fall outside the scope of this guide.