span-finder FN-Br

This is a fork of the original span-finder with adaptations made by FrameNet Brasil.

The model is dockerized to make it easier to use.

Building the container

Make sure Docker is installed, then run the following at the root of the repo:

docker build . -t "<tag-name>"

Using lome as the <tag-name> keeps things simple, and the commands below assume it.

Training

The built Docker image already has the relevant paths for the training procedure configured. To run the default training, two volumes must be mapped:

  • Data: the folder where the training data is located. It must contain train.jsonl, dev.jsonl, test.jsonl and ontology;
  • Checkpoint: the folder on the host machine where the model checkpoints will be saved. There is no need to create this folder beforehand, only to map it; the docker process creates it automatically.
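
Before starting the container, it can help to confirm that the data folder has the layout listed above. The sketch below is a hypothetical helper (check_data_dir is not part of this repo) that only tests whether the four entries exist by name. It makes no assumption about whether ontology is a file or a folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo: verify the data folder layout.
# Usage: check_data_dir "$(pwd)/data" && echo "data folder looks complete"
check_data_dir() {
    local dir="$1"
    local name
    local missing=0
    for name in train.jsonl dev.jsonl test.jsonl ontology; do
        # -e accepts files and folders alike, since the type of "ontology" is not assumed
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name" >&2
            missing=1
        fi
    done
    # Returns 0 when everything is present, 1 otherwise
    return "$missing"
}
```

The function reports every missing entry on stderr rather than stopping at the first one, so a single run shows everything that still needs to be added.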

Suppose these two folders are data and model-checkpoint in the current folder. The run command should be:

sudo docker run -v $(pwd)/data/:/srv/data -v $(pwd)/model-checkpoint:/srv/checkpoint/ --gpus all -it "lome"

Once inside the container, you can run the following training command:

allennlp train -s $CHECKPOINT_PATH --include-package sftp config/fn.jsonnet

Note that $CHECKPOINT_PATH is already set in the image. The default path configuration is in .env.default. If any of those paths are changed, the volume mappings need to change as well. The most important value is CUDA: by default, the process will try to use cuda:0. To train on CPU, pass the variable when running the container:

sudo docker run -e "CUDA=[-1]" -v $(pwd)/data/:/srv/data -v $(pwd)/model-checkpoint:/srv/checkpoint/ -it "lome"

or set the variable inside the container:

export CUDA=[-1]
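
The two run modes differ only in the --gpus flag and the CUDA variable. As a rough sketch, a hypothetical helper (train_container_cmd is an illustrative name, not something the repo provides) can print the matching command for each mode without running it. It assumes the data and model-checkpoint folders live in the current folder and that the image is tagged lome.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo: print (do not run) the docker
# command for GPU or CPU training, mirroring the two commands in this README.
train_container_cmd() {
    local mode="${1:-gpu}"    # "gpu" (default) or "cpu"
    local image="${2:-lome}"  # image tag, assumed to be lome
    local here
    here="$(pwd)"
    local volumes="-v \"$here/data/:/srv/data\" -v \"$here/model-checkpoint:/srv/checkpoint/\""
    case "$mode" in
        gpu) echo "docker run $volumes --gpus all -it \"$image\"" ;;
        # The value is quoted so shells that treat [..] as a glob leave it alone
        cpu) echo "docker run -e \"CUDA=[-1]\" $volumes -it \"$image\"" ;;
        *)   echo "unknown mode: $mode (expected gpu or cpu)" >&2; return 1 ;;
    esac
}
```

Printing the command instead of executing it leaves the choice of sudo to the user and makes the result easy to inspect before running it.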

Finally, to change training parameters, make a copy of config/fn.jsonnet, edit the parameters in the copy, and map the new file into the container with -v when running it. Then only the last part of the training command needs to change:

allennlp train -s $CHECKPOINT_PATH --include-package sftp <path-to-jsonnet>
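
The copy-and-map workflow above could be sketched as follows. The helper name custom_config_args and the mount point /srv/config are assumptions made for illustration; the image does not define them, and any container path works as long as the same path is passed to allennlp train. The helper prints the extra -v mapping to add to docker run and the matching training command to use inside the container.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repo. /srv/config is an assumed
# mount point chosen for this example, not a path defined by the image.
custom_config_args() {
    local host_file="$1"   # e.g. ./my-fn.jsonnet, an edited copy of config/fn.jsonnet
    local file_name
    file_name="$(basename "$host_file")"
    local container_file="/srv/config/$file_name"
    # Resolve an absolute host path so -v does not treat the value as a named volume
    local abs_dir
    abs_dir="$(cd "$(dirname "$host_file")" && pwd)" || return 1
    # Line 1: extra volume mapping for docker run
    echo "-v $abs_dir/$file_name:$container_file"
    # Line 2: training command for inside the container ($CHECKPOINT_PATH stays unexpanded)
    echo "allennlp train -s \$CHECKPOINT_PATH --include-package sftp $container_file"
}
```

For example, after cp config/fn.jsonnet my-fn.jsonnet and editing the copy, custom_config_args ./my-fn.jsonnet prints the mapping and the command for that file.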

After training finishes, the checkpoints should be available in model-checkpoint. If docker was run with sudo, the files may be owned by root, so change the ownership of the folder to access the results:

sudo chown -R $USER ./model-checkpoint/
