GaLAHaD

Generating Linguistic Annotations for Historical Dutch

GaLAHaD-related Repositories

Goal

Galahad is developed as part of the CLARIAH "Improved Infrastructure for Historical Dutch" project. The goal is an application that:

- enables linguists:
  - to check which taggers are suitable for tagging their corpus
  - to have their corpus tagged
- enables computational linguists:
  - to provide their models through a unified interface
  - to have their model evaluated

This is provided through a platform that offers:

- statistics for submitted models on existing corpora
- a corpus annotation service
- instructions on how to submit a model

Note that this infrastructure can also be of interest for other languages and eras.

Team

Principal engineer

[email protected]

Scientific advisors

Jesse de Does
Katrien Depuydt

Quick start

If you have Docker and Docker Compose installed and have access to the public Docker Hub organisation instituutnederlandsetaal, you can clone this repository and run:

docker compose up

This requires an external taggers network to exist. You can use the docker-compose.yml from https://github.com/INL/galahad-taggers-dockerized to start a taggers network.

Galahad then runs locally, with the web client available on port 8080.

Setup for development

Clone the code.

git clone https://github.com/INL/Galahad.git

The client

Start the client.

cd galahad/client

npm install

npm run dev

Go to http://localhost:5173/ in the browser to check the client development server is running.

The server

Go to your favourite IDE and open the Gradle project in galahad/server.

For development, add spring.profiles.active=dev to the environment variables. If you are using IntelliJ, simply use the run configuration in server/.run/GalahadApplication.run.xml. The profile tells the server whether it is running in a Docker container (production) or on localhost (development), which in turn determines how it communicates with the taggers (via a Docker network or via localhost).

Run galahad/server/src/main/kotlin/org/ivdnt/galahad/app/GalahadApplication.kt from your IDE. Check http://localhost:8010 to see whether the server is running.

Go back to the client in the browser and try to create a corpus and upload some documents.
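
As a quick sanity check of the steps above, the following minimal Python sketch tests whether the client development server and the server accept TCP connections. The ports 5173 and 8010 are the ones mentioned in this README; the helper name `port_open` and the `SERVICES` mapping are illustrative only and not part of Galahad.

```python
import socket

# Ports taken from this README; the labels are only used for display.
SERVICES = {"client dev server": 5173, "server": 8010}


def port_open(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if port_open(port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```

A reachable port only shows that a process is listening there, so it is still worth opening the URLs in the browser.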

The taggers

In development, the application talks to the taggers through port-forwards. The port-forwards are defined in the docker-compose.yml from https://github.com/INL/galahad-taggers-dockerized. Each forwarded port must also be set as devport in the matching tagger specification at server/data/taggers/*.yaml to enable communication.
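
To compare the devport values with the port-forwards by eye, a small Python sketch such as the one below can list them per specification file. It assumes that each specification contains a line of the form `devport: <number>`; that layout is an assumption, so check the examples in server/data/taggers/ for the real schema. The function name `read_devports` is hypothetical.

```python
import re
from pathlib import Path

# Assumption: each specification holds a line such as "devport: 8080".
# The real layout is defined by the examples in server/data/taggers/.
DEVPORT_LINE = re.compile(r"^\s*devport\s*:\s*(\d+)", re.MULTILINE)


def read_devports(spec_dir):
    """Map each *.yaml file name in spec_dir to its devport, or None."""
    ports = {}
    for spec in sorted(Path(spec_dir).glob("*.yaml")):
        match = DEVPORT_LINE.search(spec.read_text(encoding="utf-8"))
        ports[spec.name] = int(match.group(1)) if match else None
    return ports


if __name__ == "__main__":
    for name, port in read_devports("server/data/taggers").items():
        print(f"{name}: devport {port if port is not None else 'missing'}")
```

Run it from the repository root so that the relative path server/data/taggers resolves.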

Adding a new tagger

This assumes you have already wrapped your tagger in a Docker image.

First, launch your tagger. See https://github.com/INL/galahad-taggers-dockerized.

Now make Galahad aware of the new tagger by creating a tagger metadata YAML file. See server/data/taggers/ in this repo for examples.

Make the specification YAML available to Galahad:

- If you are running the Galahad server from a Docker container, place the specification YAML on the Docker volume at data/taggers/.
- If you are running the Galahad server some other way, e.g. from your IDE, add the specification YAML directly to server/data/taggers/.

Refresh the browser to load the new tagger.

Adding admins

You can configure the admin accounts through a file admins.txt. Add the desired admin users, one per line. To update the file (create it if it does not exist):

docker compose exec server sh
cd data
vi admins.txt # make your edits

The app should reload automatically and pick up the change, but refresh the client just to be sure.
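
The file format described above (one admin user per line) can be illustrated with a short Python sketch that adds a user only if it is not listed yet and creates the file when it is missing. The function name `add_admin` is hypothetical; Galahad itself only reads the resulting admins.txt.

```python
from pathlib import Path


def add_admin(admins_file, user):
    """Append user to admins_file (one user per line) unless already listed.

    Creates the file if it does not exist. Returns True if the file changed.
    """
    path = Path(admins_file)
    existing = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    # Ignore blank lines and surrounding whitespace when comparing names.
    users = [line.strip() for line in existing if line.strip()]
    if user in users:
        return False
    users.append(user)
    path.write_text("\n".join(users) + "\n", encoding="utf-8")
    return True
```

Whether Python is available inside the server container is not stated here, so this is meant for a location where the data directory is directly accessible, such as a mounted volume.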

Supported file formats

Plain text, TSV, CoNLL-U, TEI, NAF, FoLiA. For more details, see the help screen on formats on the GaLAHaD website.

Technical notes

Swagger UI

Once you have launched the application, you can explore the public API at

http://localhost:8010/swagger-ui.html

application BasePath

The INT runs the application behind a portal on the path /galahad. Therefore this is set as the default path for the application. Changing this basePath requires at least rebuilding the client application with a different value in place of vite build --base=/galahad/.
