From d0313bda5dd46a3b81988bdc378399163f66be51 Mon Sep 17 00:00:00 2001 From: Seth Shaw Date: Fri, 5 Jul 2024 17:05:17 -0700 Subject: [PATCH 1/8] work-in-progress --- .../installing-composer-drush-and-drupal.md | 85 +++++++++++------- .../installing-fedora-syn-and-blazegraph.md | 39 +++----- docs/installation/manual/installing-solr.md | 68 +++++--------- .../installing-tomcat-and-cantaloupe.md | 85 ++++++++++++------ docs/installation/manual/introduction.md | 27 ++++++ .../manual/preparing-a-webserver.md | 89 ++++++++++--------- 6 files changed, 221 insertions(+), 172 deletions(-) diff --git a/docs/installation/manual/installing-composer-drush-and-drupal.md b/docs/installation/manual/installing-composer-drush-and-drupal.md index 78f16e611..0982a0058 100644 --- a/docs/installation/manual/installing-composer-drush-and-drupal.md +++ b/docs/installation/manual/installing-composer-drush-and-drupal.md @@ -5,10 +5,19 @@ ## In this section, we will install: +- [cURL](https://curl.se/) is used by composer to efficiently download libraries - [Composer](https://getcomposer.org/) at its current latest version, the package manager that will allow us to install PHP applications - Either the [Islandora Starter Site](https://github.com/Islandora/islandora-starter-site/), or the [Drupal recommended-project](https://www.drupal.org/docs/develop/using-composer/starting-a-site-using-drupal-composer-project-templates#s-drupalrecommended-project), which will install, among other things: - - [Drush 10](https://www.drush.org/) at its latest version, the command-line PHP application for running tasks in Drupal - - [Drupal 9](https://www.drupal.org/) at its latest version, the content management system Islandora uses for content modelling and front-end display + - [Drush](https://www.drush.org/) at its latest version, the command-line PHP application for running tasks in Drupal + - [Drupal](https://www.drupal.org/) at its latest version, the content management system Islandora uses for 
content modelling and front-end display + +## Install cURL + +cURL may already be installed. Check by running `curl --version`. If it isn't, install it: + +```bash +sudo apt install curl +``` ## Install Composer @@ -30,14 +39,21 @@ At this point, you have the option of using the [Islandora Starter Site](https:/ and configurations which function "out of the box," or build a clean stock Drupal via the Drupal Recommended Project and install Islandora modules as you desire. +On a default Ubuntu install, the `/var/www` directory is owned by root, but we want the webserver to control this space, so we'll give it ownership: + +```bash +sudo chown -R www-data:www-data /var/www +``` + + ### Option 1: Create a project using the Islandora Starter Site -Navigate to the folder where you want to put your Islandora project (in our case `/var/www`), and -create the Islandora Starter Site: +Navigate to the folder where you want to put your Islandora project (in our case `/var/www/html`). You can give your site any name you like, but for these instructions we will simply call it "drupal": ```bash -cd /var/www -composer create-project islandora/islandora-starter-site +cd /var/www/html +sudo -u www-data mkdir drupal +sudo -u www-data composer create-project islandora/islandora-starter-site drupal ``` This will install all PHP dependencies, including Drush, and scaffold the site. 
@@ -46,12 +62,12 @@ Drush is not accessible via `$PATH`, but is available using the command `compose ### Option 2: Create a basic Drupal Recommended Project -Navigate to the folder where you want to put your Drupal project (in our case `/var/www`), and +Navigate to the folder where you want to put your Drupal project (in our case `/var/www/html`), and create the Drupal Recommended Project: ```bash -cd /var/www -composer create-project drupal/recommended-project my-project +cd /var/www/html +sudo -u www-data composer create-project drupal/recommended-project drupal ``` @@ -69,12 +85,13 @@ Listen 80 Remove everything but the "Listen 80" line. You can leave the comments in if you want. -`/etc/apache2/sites-enabled/000-default.conf | root:root/777` +Create a drupal virtual host: +`/etc/apache2/sites-available/islandora.conf | root:root/644` ```xml ServerName SERVER_NAME - DocumentRoot "/opt/drupal/web" - + DocumentRoot "/var/www/html/drupal/web" + Options Indexes FollowSymLinks MultiViews AllowOverride all Require all granted @@ -87,33 +104,30 @@ Remove everything but the "Listen 80" line. You can leave the comments in if you - `SERVER_NAME`: `localhost` - For a development environment hosted on your own machine or a VM, `localhost` should suffice. Realistically, this should be the domain or IP address the server will be accessed at. -Restart the Apache 2 service to apply these changes: +Set permissions and enable the virtual host: ```bash sudo systemctl restart apache2 +sudo a2ensite islandora.conf +sudo systemctl reload apache2 ``` -## Prepare the PostgreSQL database +## Prepare the MySQL database + +We're going to create a user in MySQL for this Drupal site. Then create a database that we can use to install Drupal. + +The following values can (and in the case of the password, *should*) be changed to local values. -PostgreSQL roles are directly tied to users. 
We’re going to ensure a user is in place, create a role for them in PostgreSQL, and create a database for them that we can use to install Drupal. +- `DRUPAL_DATABASE_NAME`: This will be used as the core database that Drupal is installed into +- `MYSQL_USER_FOR_DRUPAL`: Specifically, this is the user that will connect to the MySQL database being created, not the user that will be logging into Drupal +- `MYSQL_PASSWORD_FOR_DRUPAL`: This should be a secure password; it’s recommended to use a password generator to create this such as the one provided by [random.org](https://www.random.org/passwords/) ```bash -# Run psql as the postgres user, the only user currently with any PostgreSQL -# access. -sudo -u postgres psql -# Then, run these commands within psql itself: -create database DRUPAL_DB encoding 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0; -create user DRUPAL_DB_USER with encrypted password 'DRUPAL_DB_PASSWORD'; -grant all privileges on database DRUPAL_DB to DRUPAL_DB_USER; -# Then, quit psql. 
-\q +sudo mysql -u root +CREATE DATABASE [DRUPAL_DATABASE_NAME]; +CREATE USER '[MYSQL_USER_FOR_DRUPAL]'@'localhost' IDENTIFIED BY '[MYSQL_PASSWORD_FOR_DRUPAL]'; +GRANT ALL PRIVILEGES ON [DRUPAL_DATABASE_NAME].* TO '[MYSQL_USER_FOR_DRUPAL]'@'localhost'; +exit ``` -- `DRUPAL_DB`: `drupal9` - - This will be used as the core database that Drupal is installed into -- `DRUPAL_DB_USER`: `drupal` - - Specifically, this is the user that will connect to the PostgreSQL database being created, not the user that will be logging into Drupal -- `DRUPAL_DB_PASSWORD`: `drupal` - - This should be a secure password; it’s recommended to use a password generator to create this such as the one provided by [random.org](https://www.random.org/passwords/) - ## Install Drupal using Drush @@ -122,13 +136,18 @@ The Drupal installation process can be done through the GUI in a series of form ### Option 1: Site install the Starter Site with existing configs Follow the instructions in the [README of the Islandora Starter Site](https://github.com/Islandora/islandora-starter-site/#usage). -The steps are not reproduced here to remove redundancy. When this installation is done, you'll have a starter site ready-to-go. Once you set up the external services in the next sections, you'll need to configure Drupal to know where they are. +The steps are not reproduced here to remove redundancy. But specifically, + +1. Configure the database connection information (see the section above) and fedora flysystem in `/var/www/html/drupal/web/sites/default/settings.php`. +2. Install the site using `sudo -u www-data composer exec -- drush site:install --existing-config`. + +When this installation is done, you'll have a starter site ready-to-go. Once you set up the external services in the next sections, you'll need to configure Drupal to know where they are. 
### Option 2: Site install the basic Drupal Recommended Project ```bash cd /var/www/drupal -drush -y site-install standard --db-url="pgsql://DRUPAL_DB_USER:DRUPAL_DB_PASSWORD@127.0.0.1:5432/DRUPAL_DB" --site-name="SITE_NAME" --account-name=DRUPAL_LOGIN --account-pass=DRUPAL_PASS +sudo -u www-data drush -y site-install standard --db-url="mysql://MYSQL_USER_FOR_DRUPAL:MYSQL_PASSWORD_FOR_DRUPAL@127.0.0.1:3306/DRUPAL_DATABASE_NAME" --site-name="SITE_NAME" --account-name=DRUPAL_LOGIN --account-pass=DRUPAL_PASS ``` This uses the same parameters from the above step, as well as: diff --git a/docs/installation/manual/installing-fedora-syn-and-blazegraph.md b/docs/installation/manual/installing-fedora-syn-and-blazegraph.md index 4f1f6ebd7..0850a0f7d 100644 --- a/docs/installation/manual/installing-fedora-syn-and-blazegraph.md +++ b/docs/installation/manual/installing-fedora-syn-and-blazegraph.md @@ -31,11 +31,10 @@ sudo chown -R tomcat:tomcat /opt/fcrepo The method for creating the database here will closely mimic the method we used to create our database for Drupal. ```bash -sudo -u postgres psql -create database FEDORA_DB encoding 'UTF8' LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' TEMPLATE template0; -create user FEDORA_DB_USER with encrypted password 'FEDORA_DB_PASSWORD'; -grant all privileges on database FEDORA_DB to FEDORA_DB_USER; -\q +sudo mysql -u root +CREATE DATABASE [FEDORA_DB]; +CREATE USER '[MYSQL_USER_FOR_FEDORA]'@'localhost' IDENTIFIED BY '[MYSQL_PASSWORD_FOR_FEDORA]'; +GRANT ALL PRIVILEGES ON [FEDORA_DB].* TO '[MYSQL_USER_FOR_FEDORA]'@'localhost'; ``` - `FEDORA_DB`: `fcrepo` @@ -156,9 +155,9 @@ fcrepo.jms.baseUrl=FCREPO_JMS_BASE * `FCREPO_DB_USERNAME` - The database username * `FCREPO_DB_PASSWORD` - The database password -* `FCREPO_OCFL_ROOT` - Sets the root directory of the OCFL. Defaults to `FCREPO_HOME/data/ocfl-root` if not set. -* `FCREPO_TEMP_ROOT` - Sets the temp directory used by OCFL. Defaults to `FCREPO_HOME/data/temp` if not set. 
-* `FCREPO_STAGING_ROOT` - Sets the staging directory used by OCFL. Defaults to `FCREPO_HOME/data/staging` if not set. +* `FCREPO_OCFL_ROOT` - Sets the root directory of the OCFL. Defaults to `FCREPO_HOME/data/ocfl-root` if this line is deleted. +* `FCREPO_TEMP_ROOT` - Sets the temp directory used by OCFL. Defaults to `FCREPO_HOME/data/temp` if this line is deleted. +* `FCREPO_STAGING_ROOT` - Sets the staging directory used by OCFL. Defaults to `FCREPO_HOME/data/staging` if this line is deleted. * `FCREPO_VELOCITY_LOG` - The Fedora HTML template code uses Apache Velocity, which generates a runtime log called velocity.log. Defaults to `FCREPO_HOME/logs/velocity`. A good choice might be /opt/tomcat/logs/velocity.log * `FCREPO_JMS_BASE` - This specifies the baseUrl to use when generating JMS messages. You can specify the hostname with or without port and with or without path. If your system is behind a NAT firewall you may need this to avoid your message consumers trying to access the system on an invalid port. If this system property is not set, the host, port and context from the user's request will be used in the emitted JMS messages. If your Alpaca is on the same machine as your Fedora and you use the `islandora-indexing-fcrepo`, you could use http://localhost:8080/fcrepo/rest. @@ -234,9 +233,8 @@ Once it starts up, Fedora REST API should be available at http://localhost:8080/ A compiled JAR of Syn can be found on the [Syn releases page](https://github.com/Islandora/Syn/releases). We’re going to add this to the list libraries accessible to Tomcat. ``` -sudo wget -P /opt/tomcat/lib SYN_JAR_URL +sudo -u tomcat wget -P /opt/tomcat/lib SYN_JAR_URL # Ensure the library has the correct permissions. 
-sudo chown -R tomcat:tomcat /opt/tomcat/lib sudo chmod -R 640 /opt/tomcat/lib ``` @@ -279,16 +277,16 @@ There are two options here: `/opt/tomcat/conf/context.xml` **Before**: -> 29 | `-->` +> 30 | `-->` -> 30 | `` +> 31 | `` **After**: -> 29 | `-->` +> 30 | `-->` -> 30 | `` +> 31 | `` -> 31 | `` +> 32 | `` #### 2. Enable the Syn Valve for only Fedora. @@ -515,16 +513,7 @@ com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false ### Specifying the `RWStore.properties` in `JAVA_OPTS` -In order to enable our configuration when Tomcat starts, we need to reference the location of `RWStore.properties` in the `JAVA_OPTS` environment variable that Tomcat uses. - -`/opt/tomcat/bin/setenv.sh` - -**Before**: -> 3 | export JAVA_OPTS="-Djava.awt.headless=true -Dfcrepo.config.file=/opt/fcrepo/config/fcrepo.properties -DconnectionTimeout=-1 -server -Xmx1500m -Xms1000m" - -**After**: -> 3 | export JAVA_OPTS="-Djava.awt.headless=true -Dfcrepo.config.file=/opt/fcrepo/config/fcrepo.properties -DconnectionTimeout=-1 -Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=/opt/blazegraph/conf/RWStore.properties -Dlog4j.configuration=file:/opt/blazegraph/conf/log4j.properties -server -Xmx1500m -Xms1000m" - +In order to enable our configuration when Tomcat starts, we need to add the location of `RWStore.properties` to the existing `JAVA_OPTS` environment variable that Tomcat uses in `/opt/tomcat/bin/setenv.sh`: `-Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=/opt/blazegraph/conf/RWStore.properties -Dlog4j.configuration=file:/opt/blazegraph/conf/log4j.properties` ### Restarting Tomcat diff --git a/docs/installation/manual/installing-solr.md b/docs/installation/manual/installing-solr.md index 99c448a51..ec3868adc 100644 --- a/docs/installation/manual/installing-solr.md +++ b/docs/installation/manual/installing-solr.md @@ -7,23 +7,23 @@ - [Apache Solr 8](https://lucene.apache.org/solr/), the search engine used to index and find Drupal content - 
[search_api_solr](https://www.drupal.org/project/search_api_solr), the Solr implementation of Drupal's search API -## Solr 8 +## Solr 9 ### Downloading and Placing Solr -The Solr binaries can be found at the [Solr downloads page](https://solr.apache.org/downloads.html); the most recent stable release of Solr 8 should be used. +The Solr binaries can be found at the [Solr downloads page](https://solr.apache.org/downloads.html); the most recent stable release of Solr 9 should be used. ```bash # While generally we download tarballs as .tar.gz files without version # information, the Solr installer is a bit particular in that it expects a .tgz # file with the same name as the extracted folder it contains. It's odd, and we # can't really get around it. -cd -wget SOLR_DOWNLOAD_LINK -tar -xzvf SOLR_TARBALL +cd /opt +sudo wget -O SOLR_TARBALL SOLR_DOWNLOAD_LINK +sudo tar -xzvf SOLR_TARBALL ``` - `SOLR_DOWNLOAD_LINK`: **NOTICE**: This will depend on a few different things, not least of all the current version of Solr. The link to the `.tgz` for the binary on the downloads page will take you to a list of mirrors that Solr can be downloaded from, and provide you with a preferred mirror at the top. This preferred mirror should be used as the `SOLR_DOWNLOAD_LINK`. -- `SOLR_TARBALL`: The filename that was downloaded, e.g., `solr-8.9.0.tgz` +- `SOLR_TARBALL`: The filename that was downloaded, e.g., `solr-9.6.1.tgz` ### Running the Solr Installer @@ -34,18 +34,19 @@ sudo UNTARRED_SOLR_FOLDER/bin/install_solr_service.sh SOLR_TARBALL ``` - `UNTARRED_SOLR_FOLDER`: This will likely simply be `solr-VERSION`, where `VERSION` is the version number that was downloaded. -The port that Solr runs on can potentially be configured at this point, but we'll expect it to be running on `8983`. +The installer will start the service for you. Check the status and stop and restart if needed: -Wait until the command output reaches: - -``` -Started Solr server on port 8983 (pid=****). Happy searching! 
-systemd[1]: Started LSB: Controls Apache Solr as a Service. +```bash +sudo systemctl status solr +sudo systemctl stop solr +sudo systemctl start solr ``` -After which you can press `q` to quit the output (this won't kill Solr so it's safe). +If you want to use the web dashboard (for development only), you can edit the `solr.in.sh` file so that Solr listens on all network interfaces rather than only on localhost. + +Find `#SOLR_JETTY_HOST="127.0.0.1"` and change it to `SOLR_JETTY_HOST="0.0.0.0"`. (Note the lack of `#` now.) -You can check if Solr is running correctly by going to http://localhost:8983/solr +Restart Solr with `sudo systemctl restart solr`, then go to http://localhost:8983/solr. ### Increasing the Open File Limit (Optional) @@ -66,33 +67,12 @@ Then apply your new configuration. sudo sysctl -p ``` -### Creating a New Solr Core - -Initially, our new Solr core will contain a configuration copied from the example included with the installation, so that we have something to work with when we configure this on the Drupal side. We’ll later update this with generated configurations we create in Drupal. - -```bash -cd /opt/solr -sudo mkdir -p /var/solr/data/SOLR_CORE/conf -sudo cp -r example/files/conf/* /var/solr/data/SOLR_CORE/conf -sudo chown -R solr:solr /var/solr -sudo -u solr bin/solr create -c SOLR_CORE -p 8983 -``` -- `SOLR_CORE`: `islandora8` - -You should see an output similar to this: -``` -WARNING: Using _default configset with data driven schema functionality. NOT RECOMMENDED for production use. - To turn off: bin/solr config -c islandora8 -p 8983 -action set-user-property -property update.autoCreateFields -value false - -Created new core 'islandora8' -``` - ### Installing `search_api_solr` -Rather than use an out-of-the-box configuration that won’t be suitable for our purposes, we’re going to use the Drupal `search_api_solr` module to generate one for us. This will also require us to install the module so we can create these configurations using Drush. 
+Rather than use an out-of-the-box configuration that won’t be suitable for our purposes, we’re going to use the Drupal `search_api_solr` module to generate one for us. This module was already installed if you used the starter site, but you can install it if you didn't: ```bash -cd /opt/drupal +cd /var/www/html/drupal sudo -u www-data composer require drupal/search_api_solr:^4.2 drush -y en search_api_solr ``` @@ -109,7 +89,7 @@ The following module(s) will be enabled: search_api_solr, language, search_api ### Configuring search_api_solr -Before we can create configurations to use with Solr, the core we created earlier needs to be referenced in Drupal. +Before we can create configurations to use with Solr, the core we created earlier needs to be referenced in Drupal. Again, the starter site provides this already; but if you installed it yourself, the directions below should help. Log in to the Drupal site at `/user` using the sitewide administrator username and password (if using defaults from previous chapters this should be `islandora` and `islandora`), then navigate to `/admin/config/search/search-api/add-server`. @@ -121,7 +101,7 @@ Fill out the server addition form using the following options: ![Setting the Solr Install Directory](../../assets/setting_the_solr_install_directory.png) -- `SERVER_NAME`: `islandora8` +- `SERVER_NAME`: `islandora` - This is completely arbitrary, and is simply used to differentiate this search server configuration from all others. **Write down** or otherwise pay attention to the `machine_name` generated next to the server name you type in; this will be used in the next step. As a recap for this configuration: @@ -148,17 +128,17 @@ Click **Save** to create the server configuration. Now that our core is in place and our Drupal-side configurations exist, we’re ready to generate Solr configuration files to connect this site to our search engine. 
```bash -cd /opt/drupal -drush solr-gsc SERVER_MACHINE_NAME /opt/drupal/solrconfig.zip +cd /var/www/html/drupal +drush solr-gsc SERVER_MACHINE_NAME solrconfig.zip unzip -d ~/solrconfig solrconfig.zip -sudo cp ~/solrconfig/* /var/solr/data/SOLR_CORE/conf +sudo -u solr /opt/solr/bin/solr create_core -c SOLR_CORE -d ~/solrconfig -n SOLR_CORE sudo systemctl restart solr ``` -- `SERVER_MACHINE_NAME`: This should be the `machine_name` that was automatically generated when creating the configuration in the above step. +- `SERVER_MACHINE_NAME`: This should be the `machine_name` that was automatically generated when creating the configuration in the above step. The starter site uses `default_solr_server`. ### Adding an Index -In order for content to be indexed back into Solr, a search index needs to be added to our server. Navigate to `/admin/config/search/search-api/add-index` and check off the things you'd like to be indexed. +The site template provides an index for us; but if you didn't use it, you need to set up your index configuration. Navigate to `/admin/config/search/search-api/add-index` and check off the things you'd like to be indexed. **NOTICE** You should come back here later and reconfigure this after completing the last step in this guide. The default indexing configuration is pretty permissive, and you may want to restrict, for example, indexed content to just Islandora-centric bundles. This guide doesn't set up the index's fields either, which are going to be almost wholly dependent on the needs of your installation. Once you complete that configuration later on, re-index Solr from the configuration page of the index we're creating here. 
diff --git a/docs/installation/manual/installing-tomcat-and-cantaloupe.md b/docs/installation/manual/installing-tomcat-and-cantaloupe.md index f3c8a515d..1af119e42 100644 --- a/docs/installation/manual/installing-tomcat-and-cantaloupe.md +++ b/docs/installation/manual/installing-tomcat-and-cantaloupe.md @@ -4,28 +4,12 @@ The manual installation documentation is in need of attention. We are aware that some components no longer work as documented here. If you are interested in helping us improve the documentation, please see [Contributing](../../../contributing/CONTRIBUTING). ## In this section, we will install: + - [Tomcat 9](https://tomcat.apache.org/download-90.cgi), the Java servlet container that will serve up some Java applications on various endpoints, including, importantly, Fedora - [Cantaloupe 5](https://cantaloupe-project.github.io/), the image tileserver - running in Tomcat - that will be used to serve up large images in a web-accessible fashion ## Tomcat 9 -### Installing OpenJDK 11 - -Tomcat runs in a Java runtime environment, so we'll need one to continue. In our case, OpenJDK 11 is open-source, free to use, and can fairly simply be installed using `apt-get`: - -```bash -sudo apt-get -y install openjdk-11-jdk openjdk-11-jre -``` - -The installation of OpenJDK via `apt-get` establishes it as the de-facto Java runtime environment to be used on the system, so no further configuration is required. - -The resultant location of the java JRE binary (and therefore, the correct value of `JAVA_HOME` when it’s referenced) will vary based on the specifics of the machine it’s being installed on; that being said, you can find its exact location using `update-alternatives`: - -```bash -update-alternatives --list java -``` -Take a note of this path as we will need it later. - ### Creating a `tomcat` User Apache Tomcat, and all its processes, will be owned and managed by a specific user for the purposes of keeping parts of the stack segregated and accountable. 
@@ -109,10 +93,14 @@ Since version 5, Cantaloupe is released as a standalone Java application and is Releases of Cantaloupe live on the [Cantaloupe release page](https://github.com/cantaloupe-project/cantaloupe/releases); the latest version can be found here as a `.zip` file. ```bash -sudo wget -O /opt/cantaloupe.zip CANTALOUPE_RELEASE_URL -sudo unzip /opt/cantaloupe.zip +cd /opt/ +sudo wget -O cantaloupe.zip CANTALOUPE_RELEASE_URL +sudo unzip cantaloupe.zip +sudo mv [CANTALOUPE_VERSION] cantaloupe +sudo rm cantaloupe.zip ``` -- `CANTALOUPE_RELEASE_URL`: It’s recommended we grab the latest version of Cantaloupe 5. This can be found on the above-linked release page, as the `.zip` version; for example, https://github.com/cantaloupe-project/cantaloupe/releases/download/v5.0.3/cantaloupe-5.0.3.zip - make sure **not** to download the source code zip file as that isn't compiled for running out-of-the-box. +- `CANTALOUPE_RELEASE_URL`: It’s recommended we grab the latest version of Cantaloupe 5. This can be found on the above-linked release page, as the `.zip` version; for example, https://github.com/cantaloupe-project/cantaloupe/releases/download/v5.0.6/cantaloupe-5.0.6.zip - make sure **not** to download the source code zip file as that isn't compiled for running out-of-the-box. +- `CANTALOUPE_VERSION`: This will depend on the exact version of Cantaloupe downloaded; in the above example release, this would be `cantaloupe-5.0.6` ### Creating a Cantaloupe Configuration @@ -121,13 +109,49 @@ Cantaloupe pulls its configuration from a file called `cantaloupe.properties`; t Creating these files from scratch is *not* recommended; rather, we’re going to take the default cantaloupe configurations and plop them into their own folder so we can work with them. 
```bash -sudo mkdir /opt/cantaloupe_config -sudo cp CANTALOUPE_VER/cantaloupe.properties.sample /opt/cantaloupe_config/cantaloupe.properties -sudo cp CANTALOUPE_VER/delegates.rb.sample /opt/cantaloupe_config/delegates.rb +cd cantaloupe +sudo cp cantaloupe.properties.sample cantaloupe.properties +sudo cp delegates.rb.sample delegates.rb ``` -- `CANTALOUPE_VER`: This will depend on the exact version of Cantaloupe downloaded; in the above example release, this would be `cantaloupe-5.0.3` -The out-of-the-box configuration will work fine for our purposes, but it’s highly recommended that you take a look through the `cantaloupe.properties` and see what changes can be made; specifically, logging to actual logfiles isn’t set up by default, so you may want to take a peek at the `log.application.SyslogAppender` or `log.application.RollingFileAppender`, as well as changing the logging level. +Most of the out-of-the-box configuration will work fine for our purposes. We will change the source lookup and logging, but it’s highly recommended that you take a look through the rest of the `cantaloupe.properties` and see what changes can be made. Review the config block below and change the related portions of yours to match. 
+ +`/opt/cantaloupe/cantaloupe.properties` +``` +############## +# SOURCES +############## + +source.static = HttpSource + +############## +# HttpSource +############## + +HttpSource.BasicLookupStrategy.url_prefix = + +############## +# LOGGING +############## + +log.application.FileAppender.pathname = /var/log/islandora/cantaloupe-application.log + +log.application.RollingFileAppender.enabled = true +log.application.RollingFileAppender.pathname = /var/log/islandora/cantaloupe-application.log +log.application.RollingFileAppender.TimeBasedRollingPolicy.filename_pattern = /var/log/islandora/cantaloupe-application-%d{yyyy-MM-dd}.log + +log.error.FileAppender.pathname = /var/log/islandora/cantaloupe-error.log + +log.error.RollingFileAppender.enabled = true +log.error.RollingFileAppender.pathname = /var/log/islandora/cantaloupe-error.log +log.error.RollingFileAppender.TimeBasedRollingPolicy.filename_pattern = /var/log/islandora/cantaloupe-error-%d{yyyy-MM-dd}.log + +log.access.FileAppender.pathname = /var/log/islandora/cantaloupe-access.log + +log.access.RollingFileAppender.enabled = true +log.access.RollingFileAppender.pathname = /var/log/islandora/cantaloupe-access.log +log.access.RollingFileAppender.TimeBasedRollingPolicy.filename_pattern = /var/log/islandora/cantaloupe-access-%d{yyyy-MM-dd}.log +``` ### Installing and configuring Cantaloupe as a service @@ -139,7 +163,7 @@ Since it is a standalone application, we can configure Cantaloupe as a systemd s Description=Cantaloupe [Service] -ExecStart=java -cp /opt/CANTALOUPE_VER/CANTALOUPE_VER.jar -Dcantaloupe.config=/opt/cantaloupe_config/cantaloupe.properties -Xmx1500m -Xms1000m edu.illinois.library.cantaloupe.StandaloneEntry +ExecStart=java -cp /opt/cantaloupe/cantaloupe-CANTALOUPE_VER.jar -Dcantaloupe.config=/opt/cantaloupe/cantaloupe.properties -Xmx1500m -Xms1000m edu.illinois.library.cantaloupe.StandaloneEntry SyslogIdentifier=cantaloupe [Install] @@ -154,4 +178,11 @@ sudo systemctl enable cantaloupe sudo systemctl 
start cantaloupe ``` -We can check the service status with `sudo systemctl status cantaloupe | grep Active` and the splash screen of Cantaloupe should be available at http://localhost:8182 +We can check the service status with `sudo systemctl status cantaloupe | grep Active` and the splash screen of Cantaloupe should be available at http://localhost:8182/iiif/2. + +If you have trouble connecting, check the status of your port and allow it if necessary: + +```bash +sudo ufw status verbose +sudo ufw allow 8182/tcp +``` \ No newline at end of file diff --git a/docs/installation/manual/introduction.md b/docs/installation/manual/introduction.md index 8869e6a3a..c04119bb1 100644 --- a/docs/installation/manual/introduction.md +++ b/docs/installation/manual/introduction.md @@ -81,6 +81,33 @@ function do_something($to_this) { ``` - `THE_NUMBER_TO_ADD_TO_THIS`: 12, perhaps with an explanation of why, or other numbers that may be appropriate +### Account Tracker + +We will create several accounts during the installation process. For some items the instructions use *placeholders* in square brackets (`[]`). Create your own and use them in place of these placeholders as appropriate. 
+ +- MySQL root account + - username: `root` + - password: `[mysql_root_password]` +- MySQL account for Drupal database access + - username: `[mysql_drupal]` + - password: `[mysql_drupal_password]` +- MySQL account for Fedora access + - username: `[mysql_fedora]` + - password: `[mysql_fedora_password]` +- Tomcat + - username: `tomcat` + - password: `[tomcat_user_password]` +- Fedora fedoraAdmin account + - username: `fedoraAdmin` + - password: `[fedora_admin_password]` +- Fedora fedoraUser account + - username: `fedoraUser` + - password: `[fedora_user_password]` +- ActiveMQ + - username: `[activemq_user]` + - password: `[activemq_user_password]` + + ### Troubleshooting The most common issues you will likely run into when manually provisioning a server are: diff --git a/docs/installation/manual/preparing-a-webserver.md b/docs/installation/manual/preparing-a-webserver.md index b4e3fc63e..e22ab70fd 100644 --- a/docs/installation/manual/preparing-a-webserver.md +++ b/docs/installation/manual/preparing-a-webserver.md @@ -5,10 +5,28 @@ ## In this section, we will install: +- [Java/OpenJDK](https://openjdk.org/) is the Java runtime environment used by multiple components: Solr, Cantaloupe, Alpaca, Fedora, and Blazegraph. 
- [Apache 2](https://httpd.apache.org/), the webserver that will deliver webpages to end users -- [PHP 7](https://www.php.net/), the runtime code interpreter that Drupal will use to generate webpages and other services via apache, as well as that Drush and Composer will use to run tasks from the command line -- Several modules for PHP 7 which are required to run the PHP code that Drupal and other applications will be executing -- [PostgreSQL 10](https://www.postgresql.org/), the database that Drupal will use for storage (as well as other applications down the line) +- [PHP 8](https://www.php.net/), the runtime code interpreter that Drupal will use to generate webpages and other services via Apache, as well as that Drush and Composer will use to run tasks from the command line +- Several modules for PHP 8 which are required to run the PHP code that Drupal and other applications will be executing +- [MySQL](https://www.mysql.com/), the database that Drupal will use for storage (as well as other applications down the line) + +## Installing OpenJDK 11 + +Tomcat runs in a Java runtime environment, so we'll need one to continue. In our case, OpenJDK 11 is open-source, free to use, and can fairly simply be installed using `apt-get`: + +```bash +sudo apt-get -y install openjdk-11-jdk openjdk-11-jre +``` + +The installation of OpenJDK via `apt-get` establishes it as the de-facto Java runtime environment to be used on the system, so no further configuration is required. + +The resultant location of the java JRE binary (and therefore, the correct value of `JAVA_HOME` when it’s referenced) will vary based on the specifics of the machine it’s being installed on; that being said, you can find its exact location using `update-alternatives`: + +```bash +update-alternatives --list java +``` +Take a note of this path as we will need it later. ## Apache 2 @@ -52,63 +70,48 @@ sudo usermod -a -G `whoami` www-data sudo su `whoami` ``` -## PHP 7.4 +## PHP 8.x + +!!! 
note "Installing Alternate Versions" + Although the instructions below will install PHP 8.3, the instructions should work for future versions by replacing `8.3` with whatever version you are attempting to install. -### Install PHP 7.4 +### Install PHP 8.x -If you're running Debian 11 you should be able to install PHP 7.4 from the apt packages directly: +If you're running Ubuntu 20.04+, you should be able to install PHP 8 from the apt packages directly, although the `ondrej/php` repository provides additional libraries: ```bash -sudo apt-get -y install php7.4 php7.4-cli php7.4-common php7.4-curl php7.4-dev php7.4-gd php7.4-imap php7.4-json php7.4-mbstring php7.4-opcache php7.4-xml php7.4-yaml php7.4-zip libapache2-mod-php7.4 php-pgsql php-redis php-xdebug unzip +sudo apt update +sudo add-apt-repository ppa:ondrej/php +sudo apt update +sudo apt install php8.3 libapache2-mod-php unzip +sudo apt install php8.3-{cli,common,curl,gd,imap,intl,mysql,opcache,redis,xdebug,xml,yaml,zip} ``` -If you're running Debian 10, the repository for the PHP 7.4 packages needs to be installed first: +Restart Apache to make the changes active: ```bash -sudo apt-get -y install lsb-release apt-transport-https ca-certificates -sudo wget -O /etc/apt/trusted.gpg.d/php.gpg https://packages.sury.org/php/apt.gpg -echo "deb https://packages.sury.org/php/ $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/php.list -sudo apt-get update -sudo apt-get -y install php7.4 php7.4-cli php7.4-common php7.4-curl php7.4-dev php7.4-gd php7.4-imap php7.4-json php7.4-mbstring php7.4-opcache php7.4-xml php7.4-yaml php7.4-zip libapache2-mod-php7.4 php-pgsql php-redis php-xdebug unzip +sudo systemctl restart apache2 ``` -This will install a series of PHP configurations and mods in `/etc/php/7.4`, including: +Installation directories created: -- A `mods-available` folder (from which everything is typically enabled by default) -- A configuration for PHP when run from Apache in the `apache2` folder -- A 
configuration for PHP when run from the command line - including when run via Drush - in the `cli` folder -- `unzip`, which is important for PHP’s zip module to function correctly despite it not being a direct dependency of the module. We will also need to unzip some things later, so this is convenient to have in place early in the installation process. +- `/etc/php/8.3` (this is where you can edit PHP settings, such as timeouts, as needed for your site) +- `/usr/bin/php8.3` -## PostgreSQL 11 -### Install PostgreSQL 11 +## MySQL -PostgreSQL can generally be easily installed using your operating system’s package manager. It is typically sensible to install the version the system recognizes as up-to-date. We’re simply going to install the database software: +### Install ```bash -sudo apt-get -y install postgresql +sudo apt install mysql-server ``` -This will install: - -- A user at the system level named `postgres`; this will be the only user, by default, that has permission to run the `psql` binary and have access to Postgres configurations -- A binary executable at `/usr/bin/psql`, which anyone - even `root` - will get kicked out of the moment they run it, since only the `postgres` user has permission to run any Postgres commands -- A series of configurations that live in `/etc/postgresql/11/main` which can be used to modify how PostgreSQL works. - -### Configure Postgresql 11 For Use With Drupal - -A modification needs to be made to the PostgreSQL configuration in order for Drupal to properly install and function. 
This change can be made to the main configuration file at `/etc/postgresql/11/main/postgresql.conf`: - -**Before**: -> 558 | #bytea_output = ‘hex’ # hex, escape - -**After**: -> 558 | bytea_output = ‘escape’ - -(Remove the "# hex, escape" comment and change the value from "hex" to "escape") - -The `postgresql` service should be restarted to accept the new configuration: +There are a few ways to check the MySQL status: ```bash -sudo systemctl restart postgresql -``` +sudo service mysql status # press "q" to exit +sudo ss -tap | grep mysql +sudo service mysql restart +sudo journalctl -u mysql # helps troubleshooting +``` \ No newline at end of file From e924239fcfa9b2b1d4ce8040ad4b4c637198123b Mon Sep 17 00:00:00 2001 From: Seth Shaw Date: Mon, 8 Jul 2024 15:21:44 -0700 Subject: [PATCH 2/8] Alpaca & Crayfish updates --- docs/installation/manual/installing-alpaca.md | 101 +++++-- .../manual/installing-crayfish.md | 246 +++++++++--------- 2 files changed, 197 insertions(+), 150 deletions(-) diff --git a/docs/installation/manual/installing-alpaca.md b/docs/installation/manual/installing-alpaca.md index 4dd9f3ab6..a1be6e736 100644 --- a/docs/installation/manual/installing-alpaca.md +++ b/docs/installation/manual/installing-alpaca.md @@ -10,7 +10,9 @@ ## Installing ActiveMQ -In our case, the default installation method for ActiveMQ via `apt-get` will suffice. +Some users have been able to install ActiveMQ from the standard package repositories. Others, however, have needed to install it manually. + +### Option 1: System Provided Packages ```bash sudo apt-get -y install activemq @@ -32,15 +34,69 @@ sudo apt-cache policy activemq Write down the version listed under `Installed: `. +### Option 2: Manual Install + +Get the latest ActiveMQ 5.x version number from https://archive.apache.org/dist/activemq and use it in place of `[ACTIVEMQ_VERSION_NUMBER]`. 
+ +```bash +cd /opt +sudo wget http://archive.apache.org/dist/activemq/[ACTIVEMQ_VERSION_NUMBER]/apache-activemq-[ACTIVEMQ_VERSION_NUMBER]-bin.tar.gz +sudo tar -xvzf apache-activemq-[ACTIVEMQ_VERSION_NUMBER]-bin.tar.gz +sudo mv apache-activemq-[ACTIVEMQ_VERSION_NUMBER] /opt/activemq +sudo addgroup --quiet --system activemq +sudo adduser --quiet --system --ingroup activemq --no-create-home --disabled-password activemq +sudo chown -R activemq:activemq /opt/activemq +sudo rm -R apache-activemq-[ACTIVEMQ_VERSION_NUMBER]-bin.tar.gz +``` + +Add ActiveMQ as a service: +`/etc/systemd/system/activemq.service | root:root/644` +``` +[Unit] +Description=Apache ActiveMQ +After=network.target + +[Service] +Type=forking +User=activemq +Group=activemq + +ExecStart=/opt/activemq/bin/activemq start +ExecStop=/opt/activemq/bin/activemq stop + +[Install] +WantedBy=multi-user.target +``` + +Update the WebConsolePort host property settings in `/opt/activemq/conf/jetty.xml` from `` to `` so that you can access the dashboard from outside the local machine. + +Optionally, change the dashboard user credentials in `/opt/activemq/conf/users.properties`. + +*Note* +> Updating the web console port and user properties are potential security holes. It is best to restrict the host setting and create a more secure username/password combination for production. + +Set the service to start on machine startup and start it up: +```bash +sudo systemctl daemon-reload +sudo systemctl start activemq +sudo systemctl enable activemq +sudo systemctl status activemq +sudo systemctl restart activemq +sudo apt-cache policy activemq # note version number +``` + +The service should now be available at `http://localhost:8161/` + + ## Installing Alpaca Install Java 11+ if you haven't already. Make a directory for Alpaca and download the latest version of Alpaca from the [Maven repository](https://repo1.maven.org/maven2/ca/islandora/alpaca/islandora-alpaca-app). E.g. 
``` -mkdir /opt/alpaca +sudo mkdir /opt/alpaca cd /opt/alpaca -curl -L https://repo1.maven.org/maven2/ca/islandora/alpaca/islandora-alpaca-app/2.2.0/islandora-alpaca-app-2.2.0-all.jar -o alpaca.jar +sudo curl -L https://repo1.maven.org/maven2/ca/islandora/alpaca/islandora-alpaca-app/2.2.0/islandora-alpaca-app-2.2.0-all.jar -o alpaca.jar ``` ### Configuration @@ -55,7 +111,7 @@ The properties are: ``` # Common options -error.maxRedeliveries=4 +error.maxRedeliveries=5 ``` This defines how many times to retry a message before failing completely. @@ -79,11 +135,6 @@ jms.connections=10 ``` This defines the pool of connections to the ActiveMQ instance. -``` -jms.concurrent-consumers=1 -``` -This defines how many messages to process simultaneously. - #### islandora-indexing-fcrepo This service manages a Drupal node into a corresponding Fedora resource. @@ -108,20 +159,20 @@ These define the various queues to listen on for the indexing/deletion messages. The part after `queue:` should match your Islandora instance "Actions". ``` -fcrepo.indexer.milliner.baseUrl=http://localhost:8000/milliner +fcrepo.indexer.milliner.baseUrl=http://localhost/milliner ``` This defines the location of your Milliner microservice. ``` -fcrepo.indexer.concurrent-consumers=1 -fcrepo.indexer.max-concurrent-consumers=1 +fcrepo.indexer.concurrent-consumers=-1 +fcrepo.indexer.max-concurrent-consumers=-1 ``` These define the default number of concurrent consumers and maximum number of concurrent consumers working off your ActiveMQ instance. A value of `-1` means no setting is applied. ``` -fcrepo.indexer.async-consumer=true +fcrepo.indexer.async-consumer=false ``` This property allows the concurrent consumers to process concurrently; otherwise, the consumers will wait to the previous message has been processed before executing. 
@@ -134,7 +185,7 @@ It's properties are: ``` # Triplestore indexer options -triplestore.indexer.enabled=false +triplestore.indexer.enabled=true ``` This defines whether the Triplestore indexer is enabled or not. @@ -154,8 +205,8 @@ triplestore.baseUrl=http://localhost:8080/bigdata/namespace/kb/sparql This defines the location of your triplestore's SPARQL update endpoint. ``` -triplestore.indexer.concurrent-consumers=1 -triplestore.indexer.max-concurrent-consumers=1 +triplestore.indexer.concurrent-consumers=-1 +triplestore.indexer.max-concurrent-consumers=-1 ``` These define the default number of concurrent consumers and maximum number of concurrent @@ -164,7 +215,7 @@ A value of `-1` means no setting is applied. ``` -triplestore.indexer.async-consumer=true +triplestore.indexer.async-consumer=false ``` This property allows the concurrent consumers to process concurrently; otherwise, the consumers will wait to the previous message has been processed before executing. @@ -184,7 +235,7 @@ derivative..enabled=true This defines if the `item` service is enabled. ``` -derivative..in.stream=queue:islandora-item-connector.index +derivative..in.stream=queue:islandora-item-connector- ``` This is the input queue for the derivative microservice. @@ -197,8 +248,8 @@ derivative..service.url=http://example.org/derivative/convert This is the microservice URL to process the request. ``` -derivative..concurrent-consumers=1 -derivative..max-concurrent-consumers=1 +derivative..concurrent-consumers=-1 +derivative..max-concurrent-consumers=-1 ``` These define the default number of concurrent consumers and maximum number of concurrent @@ -207,7 +258,7 @@ A value of `-1` means no setting is applied. ``` -derivative..async-consumer=true +derivative..async-consumer=false ``` This property allows the concurrent consumers to process concurrently; otherwise, the consumers will wait to the previous message has been processed before executing. 
@@ -219,14 +270,14 @@ derivative.systems.installed=houdini,fits derivative.houdini.enabled=true derivative.houdini.in.stream=queue:islandora-connector-houdini -derivative.houdini.service.url=http://127.0.0.1:8000/houdini/convert +derivative.houdini.service.url=http://127.0.0.1/houdini/convert derivative.houdini.concurrent-consumers=1 derivative.houdini.max-concurrent-consumers=4 derivative.houdini.async-consumer=true derivative.fits.enabled=true derivative.fits.in.stream=queue:islandora-connector-fits -derivative.fits.service.url=http://127.0.0.1:8000/crayfits +derivative.fits.service.url=http://127.0.0.1/crayfits derivative.fits.concurrent-consumers=2 derivative.fits.max-concurrent-consumers=2 derivative.fits.async-consumer=false @@ -306,7 +357,7 @@ Description=Alpaca service After=network.target [Service] -Type=forking +Type=simple ExecStart=java -jar /opt/alpaca/alpaca.jar -c /opt/alpaca/alpaca.properties ExecStop=/bin/kill -15 $MAINPID SuccessExitStatus=143 @@ -316,4 +367,4 @@ Restart=always WantedBy=default.target ``` -Now you can start the service by running `systemctl start alpaca` and set it to come up when the system reboots with `systemctl enable alpaca`. +Now you can start the service by running `sudo systemctl start alpaca` and set it to come up when the system reboots with `sudo systemctl enable alpaca`. Check the status by running `sudo systemctl status alpaca`. diff --git a/docs/installation/manual/installing-crayfish.md b/docs/installation/manual/installing-crayfish.md index af9e113fb..cddf480d4 100644 --- a/docs/installation/manual/installing-crayfish.md +++ b/docs/installation/manual/installing-crayfish.md @@ -4,9 +4,51 @@ The manual installation documentation is in need of attention. We are aware that some components no longer work as documented here. If you are interested in helping us improve the documentation, please see [Contributing](../../../contributing/CONTRIBUTING). 
## In this section, we will install: +- [FITS Web Service](https://projects.iq.harvard.edu/fits), a webservice for identifying file metadata - [Islandora/Crayfish](https://github.com/islandora/crayfish), the suite of microservices that power the backend of Islandora 2.0 - Indvidual microservices underneath Crayfish +## FITS Web Service + +The FITS Web Service is used to extract file metadata from files. The Crayfish microservice CrayFits will use this service to push FITS metadata back to Drupal. It comes in two pieces: the FITS tool itself and the FITS webservice, which runs in Tomcat. + +FITS itself wraps other file identification and metadata tools which may require installing additional libraries. On Ubuntu 20.04, the version this guide is using, we will install a few: + +```bash +sudo apt install mediainfo python3-jpylyzer +``` + +To set up the FITS application, first find the [latest FITS version on GitHub](https://github.com/harvard-lts/fits/releases) to replace `[FITS_VERSION_NUMBER]` and then run the following commands: + +```bash +cd /opt +sudo wget https://github.com/harvard-lts/fits/releases/download/[FITS_VERSION_NUMBER]/fits-[FITS_VERSION_NUMBER].zip +sudo unzip /opt/fits-[FITS_VERSION_NUMBER].zip -d /opt/fits +``` + +Similarly, for the FITS webservice, [get the current service version number](https://github.com/harvard-lts/FITSservlet/releases) to replace `[FITS_SERVICE_WAR_VERSION_NUMBER]`. 
+ +Download the FITS webservice: + +```bash +sudo -u tomcat wget -O /opt/tomcat/webapps/fits.war https://github.com/harvard-lts/FITSservlet/releases/download/[FITS_SERVICE_WAR_VERSION_NUMBER]/fits-service-[FITS_SERVICE_WAR_VERSION_NUMBER].war +``` + +Configure the webservice by adding the following lines to the bottom of `/opt/tomcat/conf/catalina.properties`: + +``` +fits.home=/opt/fits +shared.loader=/opt/fits/lib/*.jar +``` + +Restart Tomcat: + +``` +sudo systemctl restart tomcat +``` + +Wait for a few minutes to let the service start up the first time and 
then visit `http://localhost:8080/fits/` to ensure it is working. You can also follow the catalina logs to see how tomcat is progressing in setting up each service it is running: `sudo tail -f /opt/tomcat/logs/catalina.out`. To stop following the logs, hit control-C. + ## Crayfish 2.0 ### Installing Prerequisites @@ -19,13 +61,12 @@ Some packages need to be installed before we can proceed with installing Crayfis - Poppler, which will be used for generating PDFs ```bash +sudo apt-get install software-properties-common sudo add-apt-repository -y ppa:lyrasis/imagemagick-jp2 sudo apt-get update sudo apt-get -y install imagemagick tesseract-ocr ffmpeg poppler-utils ``` -**NOTICE:** If you get the `sudo: apt-add-repository: command not found`, run `sudo apt-get install software-properties-common` in order to make the command available. - ### Cloning and Installing Crayfish We’re going to clone Crayfish to `/opt`, and individually run `composer install` against each of the microservice subdirectories. @@ -39,6 +80,7 @@ sudo -u www-data composer install -d crayfish/Houdini sudo -u www-data composer install -d crayfish/Hypercube sudo -u www-data composer install -d crayfish/Milliner sudo -u www-data composer install -d crayfish/Recast +sudo -u www-data composer install -d crayfish/CrayFits ``` ### Preparing Logging @@ -58,43 +100,27 @@ Each Crayfish component requires one or more `.yaml` file(s) to ensure everythin The following configuration files represent somewhat sensible defaults; you should take consideration of the logging levels in use, as this can vary in desirability from installation to installation. Also note that in all cases, `http` URLs are being used, as this guide does not deal with setting up https support. In a production installation, this should not be the case. These files also assume a connection to a PostgreSQL database; use a `pdo_mysql` driver and the appropriate `3306` port if using MySQL. 
+*Note:* +> For Crayfish microservices use the `lexik_jwt_authentication` package. They are configured to use the `JWT_PUBLIC_KEY` environment variable to find the public key we created earlier (`/opt/keys/syn_public.key`). Later on in this guide we will add the environment variable to the Apache configs, but you may alternatively write the path to the key in the `lexik_jwt_authentication.yaml` file that resides along-side the `security.yaml` files we edit in this section. + #### Homarus (Audio/Video derivatives) -`/opt/crayfish/Homarus/cfg/config.yaml | www-data:www-data/644` -```yaml ---- -homarus: - executable: ffmpeg - mime_types: - valid: - - video/mp4 - - video/x-msvideo - - video/ogg - - audio/x-wav - - audio/mpeg - - audio/aac - - image/jpeg - - image/png - default: video/mp4 - mime_to_format: - valid: - - video/mp4_mp4 - - video/x-msvideo_avi - - video/ogg_ogg - - audio/x-wav_wav - - audio/mpeg_mp3 - - audio/aac_m4a - - image/jpeg_image2pipe - - image/png_image2pipe - default: mp4 -fedora_resource: - base_url: http://localhost:8080/fcrepo/rest -log: - level: NOTICE - file: /var/log/islandora/homarus.log -syn: - enable: true - config: /opt/fcrepo/config/syn-settings.xml +Enable JSON Web Token (JWT) based access to the service by updating the security settings. Edit `/opt/crayfish/Homarus/config/packages/security.yaml` to set firewalls: main: anonymous to `false` and uncomment the `provider` and `jwt` lines further down in that section. 
+ +Edit `/opt/crayfish/Homarus/config/packages/monolog.yaml` to point to the new logging directory: + +```yml + homarus: + type: rotating_file + path: /var/log/islandora/Homarus.log +``` + +Edit the commons config to update it with Fedora's location (if necessary) and enable the apix middleware in `/opt/crayfish/Homarus/config/packages/crayfish_commons.yaml`: + +```yml +crayfish_commons: + fedora_base_uri: 'http://localhost:8080/fcrepo/rest' + apix_middleware_enabled: true ``` #### Houdini (Image derivatives) @@ -108,7 +134,7 @@ Currently the Houdini microservice uses a different system (Symfony) than the ot # Put parameters here that don't need to change on each machine where the app is deployed # https://symfony.com/doc/current/best_practices/configuration.html#application-related-configuration parameters: - app.executable: /usr/local/bin/convert + app.executable: /usr/bin/convert app.formats.valid: - image/jpeg - image/png @@ -142,14 +168,15 @@ services: # please note that last definitions always *replace* previous ones ``` -`/opt/crayfish/Houdini/config/packages/crayfish_commons.yml | www-data:www-data/644` +`/opt/crayfish/Houdini/config/packages/crayfish_commons.yaml | www-data:www-data/644` ```yaml crayfish_commons: fedora_base_uri: 'http://localhost:8080/fcrepo/rest' - syn_config: '/opt/fcrepo/config/syn-settings.xml' + syn_config: /opt/fcrepo/config/syn-settings.xml + syn_enabled: True ``` -`/opt/crayfish/Houdini/config/packages/monolog.yml | www-data:www-data/644` +`/opt/crayfish/Houdini/config/packages/monolog.yaml | www-data:www-data/644` ```yaml monolog: @@ -164,38 +191,38 @@ monolog: The below files are two versions of the same file to enable or disable JWT token authentication. 
-`/opt/crayfish/Houdini/config/packages/security.yml | www-data:www-data/644` +`/opt/crayfish/Houdini/config/packages/security.yaml | www-data:www-data/644` Enabled JWT token authentication: ```yaml +# To disable Syn checking, set syn_enabled=false in crayfish_commons.yaml and remove this configuration file. security: # https://symfony.com/doc/current/security.html#where-do-users-come-from-user-providers providers: - jwt_user_provider: - id: Islandora\Crayfish\Commons\Syn\JwtUserProvider - + users_in_memory: { memory: null } + jwt: + lexik_jwt: ~ firewalls: dev: pattern: ^/(_(profiler|wdt)|css|images|js)/ security: false main: + # To enable Syn, change anonymous to false and uncomment the lines further below anonymous: false # Need stateless or it reloads the User based on a token. stateless: true - provider: jwt_user_provider - guard: - authenticators: - - Islandora\Crayfish\Commons\Syn\JwtAuthenticator + # To enable JWT authentication, uncomment the below 2 lines and change anonymous to false above. 
+ provider: jwt + jwt: ~ # activate different ways to authenticate - # https://symfony.com/doc/current/security.html#firewalls-authentication + # https://symfony.com/doc/5.4/security.html#firewalls-authentication - # https://symfony.com/doc/current/security/impersonating_user.html + # https://symfony.com/doc/5.4/security/impersonating_user.html # switch_user: true - # Easy way to control access for large sections of your site # Note: Only the *first* access control that matches will be used access_control: @@ -224,74 +251,38 @@ security: #### Hypercube (OCR) -`/opt/crayfish/Hypercube/cfg/config.yaml | www-data:www-data/644` -```yaml ---- -hypercube: - tesseract_executable: tesseract - pdftotext_executable: pdftotext -fedora_resource: - base_url: http://localhost:8080/fcrepo/rest -log: - level: NOTICE - file: /var/log/islandora/hypercube.log -syn: - enable: true - config: /opt/fcrepo/config/syn-settings.xml +Enable JSON Web Token (JWT) based access to the service by updating the security settings. Edit `/opt/crayfish/Hypercube/config/packages/security.yaml` to set firewalls: main: anonymous to `false` and uncomment the `provider` and `jwt` lines further down in that section. 
+ +Edit `/opt/crayfish/Hypercube/config/packages/monolog.yaml` to point to the new logging directory: + +```yml + hypercube: + type: rotating_file + path: /var/log/islandora/Hypercube.log +``` + +Edit the commons config to update it with Fedora's location (if necessary) and enable the apix middleware in `/opt/crayfish/Hypercube/config/packages/crayfish_commons.yaml`: + +```yml +crayfish_commons: + fedora_base_uri: 'http://localhost:8080/fcrepo/rest' + apix_middleware_enabled: true ``` #### Milliner (Fedora indexing) -`/opt/crayfish/Milliner/cfg/config.yaml | www-data:www-data/644` -```yaml ---- -fedora_base_url: http://localhost:8080/fcrepo/rest -drupal_base_url: http://localhost -modified_date_predicate: http://schema.org/dateModified -strip_format_jsonld: true -debug: false -db.options: - driver: pdo_pgsql - host: 127.0.0.1 - port: 5432 - dbname: CRAYFISH_DB - user: CRAYFISH_DB_USER - password: CRAYFISH_DB_PASSWORD -log: - level: NOTICE - file: /var/log/islandora/milliner.log -syn: - enable: true - config: /opt/fcrepo/config/syn-settings.xml -``` - -#### Recast (Drupal to Fedora URI re-writing) - -`/opt/crayfish/Recast/cfg/config.yaml | www-data:www-data/644` -```yaml ---- -fedora_resource: - base_url: http://localhost:8080/fcrepo/rest -drupal_base_url: http://localhost -debug: false -log: - level: NOTICE - file: /var/log/islandora/recast.log -syn: - enable: true - config: /opt/fcrepo/config/syn-settings.xml -namespaces: -- - acl: "http://www.w3.org/ns/auth/acl#" - fedora: "http://fedora.info/definitions/v4/repository#" - ldp: "http://www.w3.org/ns/ldp#" - memento: "http://mementoweb.org/ns#" - pcdm: "http://pcdm.org/models#" - pcdmuse: "http://pcdm.org/use#" - webac: "http://fedora.info/definitions/v4/webac#" - vcard: "http://www.w3.org/2006/vcard/ns#" +Enable JSON Web Token (JWT) based access to the service by updating the security settings. 
Edit `/opt/crayfish/Milliner/config/packages/security.yaml` to set firewalls: main: anonymous to `false` and uncomment the `provider` and `jwt` lines further down in that section. + +Edit `/opt/crayfish/Milliner/config/packages/monolog.yaml` to point to the new logging directory: + +```yml + milliner: + type: rotating_file + path: /var/log/islandora/Milliner.log ``` +Edit the commons config to update it with Fedora's location (if necessary) and enable the apix middleware in `/opt/crayfish/Milliner/config/packages/crayfish_commons.yaml`: + ### Creating Apache Configurations for Crayfish Components Finally, we need appropriate Apache configurations for Crayfish; these will allow other services to connect to Crayfish components via their HTTP endpoints. @@ -304,11 +295,12 @@ These configurations would potentially have collisions with Drupal routes, if an `/etc/apache2/conf-available/Homarus.conf | root:root/644` ``` -Alias "/homarus" "/opt/crayfish/Homarus/src" - +Alias "/homarus" "/opt/crayfish/Homarus/public" + FallbackResource /homarus/index.php Require all granted DirectoryIndex index.php + SetEnv JWT_PUBLIC_KEY /opt/keys/syn_public.key SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1 ``` @@ -320,39 +312,43 @@ Alias "/houdini" "/opt/crayfish/Houdini/public" FallbackResource /houdini/index.php Require all granted DirectoryIndex index.php + SetEnv JWT_PUBLIC_KEY /opt/keys/syn_public.key SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1 ``` `/etc/apache2/conf-available/Hypercube.conf | root:root/644` ``` -Alias "/hypercube" "/opt/crayfish/Hypercube/src" - +Alias "/hypercube" "/opt/crayfish/Hypercube/public" + FallbackResource /hypercube/index.php Require all granted DirectoryIndex index.php + SetEnv JWT_PUBLIC_KEY /opt/keys/syn_public.key SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1 ``` `/etc/apache2/conf-available/Milliner.conf | root:root/644` ``` -Alias "/milliner" "/opt/crayfish/Milliner/src" - +Alias "/milliner" 
"/opt/crayfish/Milliner/public" + 
FallbackResource /milliner/index.php Require all granted DirectoryIndex index.php + SetEnv JWT_PUBLIC_KEY /opt/keys/syn_public.key SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1 ``` -`/etc/apache2/conf-available/Recast.conf | root:root/644` +`/etc/apache2/conf-available/CrayFits.conf | root:root/644` ``` -Alias "/recast" "/opt/crayfish/Recast/src" - - FallbackResource /recast/index.php +Alias "/crayfits" "/opt/crayfish/CrayFits/public" + + FallbackResource /crayfits/index.php Require all granted DirectoryIndex index.php + SetEnv JWT_PUBLIC_KEY /opt/keys/syn_public.key SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1 ``` @@ -362,7 +358,7 @@ Alias "/recast" "/opt/crayfish/Recast/src" Enabling each of these configurations involves creating a symlink to them in the `conf-enabled` directory; the standardized method of doing this in Apache is with `a2enconf`. ```bash -sudo a2enconf Homarus Houdini Hypercube Milliner Recast +sudo a2enconf Homarus Houdini Hypercube Milliner CrayFits ``` ### Restarting the Apache Service From dcc78dc0599b1a90af5526e6a8c966f82483db7a Mon Sep 17 00:00:00 2001 From: Seth Shaw Date: Mon, 8 Jul 2024 15:22:11 -0700 Subject: [PATCH 3/8] =?UTF-8?q?Postgres=E2=86=92MySQL?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- docs/installation/manual/configuring-drupal.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/installation/manual/configuring-drupal.md b/docs/installation/manual/configuring-drupal.md index af4fdd929..640ab9c69 100644 --- a/docs/installation/manual/configuring-drupal.md +++ b/docs/installation/manual/configuring-drupal.md @@ -20,13 +20,13 @@ The below configuration will establish `localhost` as a trusted host pattern, bu **Before** (at around line 789): ``` -'driver' => 'pgsql', +'driver' => 'mysql', ); ``` **After**: ``` -'driver' => 'pgsql', +'driver' => 'mysql', ); $settings['trusted_host_patterns'] = [ From af79e5e2c854b9a998cc61103d0d8969db7538e8 
Mon Sep 17 00:00:00 2001
From: Seth Shaw
Date: Mon, 8 Jul 2024 15:32:30 -0700
Subject: [PATCH 4/8] add missing user to chown

---
 .../installation/manual/installing-composer-drush-and-drupal.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/installation/manual/installing-composer-drush-and-drupal.md b/docs/installation/manual/installing-composer-drush-and-drupal.md
index 0982a0058..0d055d138 100644
--- a/docs/installation/manual/installing-composer-drush-and-drupal.md
+++ b/docs/installation/manual/installing-composer-drush-and-drupal.md
@@ -42,7 +42,7 @@ Islandora modules as you desire.

 On a default Ubuntu install the `/var/www` directory is owned by root, but we want the webserver to control this space, so we'll give it ownership:

 ```bash
-sudo chown -R /var/www
+sudo chown -R www-data: /var/www
 ```


From 05528e6718249bab4bfd6c3987c77c620bdb136b Mon Sep 17 00:00:00 2001
From: Seth Shaw <108362375+seth-shaw-asu@users.noreply.github.com>
Date: Wed, 17 Jul 2024 09:54:36 -0700
Subject: [PATCH 5/8] update note formatting

---
 docs/installation/manual/installing-alpaca.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/docs/installation/manual/installing-alpaca.md b/docs/installation/manual/installing-alpaca.md
index a1be6e736..eec7065cd 100644
--- a/docs/installation/manual/installing-alpaca.md
+++ b/docs/installation/manual/installing-alpaca.md
@@ -72,8 +72,8 @@ Update the WebConsolePort host property settings in `/opt/activemq/conf/jetty.xm

 Optionally, change the dashboard user credentials in `/opt/activemq/conf/users.properties`.

-*Note*
-> Updating the web console port and user properties are potential security holes. It is best to restrict the host setting and create a more secure username/password combination for production.
+!!! note "Security Warning"
+    Updating the web console port and user properties are potential security holes. It is best to restrict the host setting and create a more secure username/password combination for production.

 Set the service to start on machine startup and start it up:
 ```bash
@@ -318,7 +318,8 @@ http.additional_options=authMethod=Basic,authUsername=Jim,authPassword=1234

 These will be added to ALL http endpoint requests.

-**Note**: We are currently running Camel 3.7.6, some configuration parameters on the above linked page might not be supported.
+!!! note "Check Camel Configuration Parameters"
+    We are currently running Camel 3.7.6, some configuration parameters on the above linked page might not be supported.

 ### Deploying/Running


From baf2073302319ac0a72ef87e3c034339e9c6338d Mon Sep 17 00:00:00 2001
From: Seth Shaw <108362375+seth-shaw-asu@users.noreply.github.com>
Date: Wed, 17 Jul 2024 09:56:07 -0700
Subject: [PATCH 6/8] fix note formatting

---
 docs/installation/manual/installing-crayfish.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/installation/manual/installing-crayfish.md b/docs/installation/manual/installing-crayfish.md
index cddf480d4..1b045444e 100644
--- a/docs/installation/manual/installing-crayfish.md
+++ b/docs/installation/manual/installing-crayfish.md
@@ -100,8 +100,8 @@ Each Crayfish component requires one or more `.yaml` file(s) to ensure everythin

 The following configuration files represent somewhat sensible defaults; you should take consideration of the logging levels in use, as this can vary in desirability from installation to installation. Also note that in all cases, `http` URLs are being used, as this guide does not deal with setting up https support. In a production installation, this should not be the case. These files also assume a connection to a PostgreSQL database; use a `pdo_mysql` driver and the appropriate `3306` port if using MySQL.

-*Note:*
-> For Crayfish microservices use the `lexik_jwt_authentication` package. They are configured to use the `JWT_PUBLIC_KEY` environment variable to find the public key we created earlier (`/opt/keys/syn_public.key`). Later on in this guide we will add the environment variable to the Apache configs, but you may alternatively write the path to the key in the `lexik_jwt_authentication.yaml` file that resides along-side the `security.yaml` files we edit in this section.
+!!! note "Using JWT for Crayfish Authentication"
+    For Crayfish microservices use the `lexik_jwt_authentication` package. They are configured to use the `JWT_PUBLIC_KEY` environment variable to find the public key we created earlier (`/opt/keys/syn_public.key`). Later on in this guide we will add the environment variable to the Apache configs, but you may alternatively write the path to the key in the `lexik_jwt_authentication.yaml` file that resides along-side the `security.yaml` files we edit in this section.

 #### Homarus (Audio/Video derivatives)


From 0d846411b841d404f7cdc033b9cf82beceecbe33 Mon Sep 17 00:00:00 2001
From: Seth Shaw <108362375+seth-shaw-asu@users.noreply.github.com>
Date: Wed, 17 Jul 2024 09:59:12 -0700
Subject: [PATCH 7/8] update note formatting

---
 docs/installation/manual/installing-crayfish.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/installation/manual/installing-crayfish.md b/docs/installation/manual/installing-crayfish.md
index 1b045444e..8bd9e532b 100644
--- a/docs/installation/manual/installing-crayfish.md
+++ b/docs/installation/manual/installing-crayfish.md
@@ -96,7 +96,7 @@ sudo chown www-data:www-data /var/log/islandora

 Each Crayfish component requires one or more `.yaml` file(s) to ensure everything is wired up correctly.

-**NOTICE**
+!!! note "Update the defaults to meet your needs"

 The following configuration files represent somewhat sensible defaults; you should take consideration of the logging levels in use, as this can vary in desirability from installation to installation. Also note that in all cases, `http` URLs are being used, as this guide does not deal with setting up https support. In a production installation, this should not be the case. These files also assume a connection to a PostgreSQL database; use a `pdo_mysql` driver and the appropriate `3306` port if using MySQL.


From 1311c90f4706745fc51a635fd030e722b6956163 Mon Sep 17 00:00:00 2001
From: Seth Shaw <108362375+seth-shaw-asu@users.noreply.github.com>
Date: Wed, 17 Jul 2024 10:00:00 -0700
Subject: [PATCH 8/8] change note to warning

---
 docs/installation/manual/installing-alpaca.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/installation/manual/installing-alpaca.md b/docs/installation/manual/installing-alpaca.md
index eec7065cd..4438f9c92 100644
--- a/docs/installation/manual/installing-alpaca.md
+++ b/docs/installation/manual/installing-alpaca.md
@@ -72,7 +72,7 @@ Update the WebConsolePort host property settings in `/opt/activemq/conf/jetty.xm

 Optionally, change the dashboard user credentials in `/opt/activemq/conf/users.properties`.

-!!! note "Security Warning"
+!!! warning "Security Warning"
     Updating the web console port and user properties are potential security holes. It is best to restrict the host setting and create a more secure username/password combination for production.

 Set the service to start on machine startup and start it up: