Statistical Parsing for Tree Wrapping Grammars with Transformer-based supertagging and A-star parsing
This is the repository for the experiments accompanying the LREC 2022 submission "RRGparbank: A Parallel Role and Reference Grammar Treebank".
Install ParTAGe-TWG, then install the packages from the requirements.txt file.
The code works with Python 3.9.
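A possible setup, assuming you work from the repository root (the environment name `twg-env` is arbitrary):

```shell
# create an isolated Python environment and install the dependencies
python3 -m venv twg-env
. twg-env/bin/activate
pip install -r requirements.txt
```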
Here is the list of language models described in the LREC paper:
- Multilingual Model: Fine-tuned bert-base-multilingual-cased model download (1.7 GB)
- English Model: Fine-tuned bert-base-cased model download (1.1 GB)
- German Model: Fine-tuned bert-base-german-cased model download (1.1 GB)
- French Model: Fine-tuned camembert-base model download (1.1 GB)
- Russian Model: Fine-tuned rubert-base-cased-sentence model download (1.4 GB)
- Multilingual DistilBERT: Fine-tuned distilbert-base-multilingual-cased model download (1 GB)
Unzip the downloaded model and rename the resulting folder to "best_model".
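Done with Python's standard library, this step could be sketched as follows (`unpack_model` is a hypothetical helper, and the archive name is a placeholder for whichever model you downloaded):

```python
import zipfile
from pathlib import Path

def unpack_model(archive: str, target: str = "best_model") -> Path:
    """Extract a downloaded model archive and rename the unzipped
    top-level folder to `target`, the name the parser expects."""
    with zipfile.ZipFile(archive) as zf:
        # first entry's top-level folder, e.g. "bert-base-cased-finetuned/"
        top = zf.namelist()[0].split("/")[0]
        zf.extractall(".")
    Path(top).rename(target)
    return Path(target)

# e.g. unpack_model("bert-base-cased-finetuned.zip")
```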
Parse a file of sentences with the parse_twg.py script.
It takes two arguments: an input file with plain sentences and an output file.
Please take a look at the example input and output files:
```shell
python parse_twg.py example_input_file.txt example_output_file.txt
```
The output file is in the discbracket format (discontinuous bracket trees). Read more about this format here.
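As an illustration only, assuming each output line has the layout tree<TAB>tokens, with bare integer leaves indexing into the token list, a minimal reader might look like this (`read_discbracket` is a hypothetical helper, not part of the repository):

```python
import re

def read_discbracket(line):
    """Split one discbracket line (assumed layout: tree<TAB>tokens) into
    the leaf indices in tree order and the sentence tokens."""
    tree, sent = line.rstrip("\n").split("\t")
    tokens = sent.split()
    # leaves are bare integers flanked by '(' / ' ' / ')';
    # digits inside category labels are not matched
    indices = [int(d) for d in re.findall(r"(?<=[ (])\d+(?=[ )])", tree)]
    return indices, tokens

# a discontinuous VP covering tokens 0 and 2:
indices, tokens = read_discbracket("(S (VP (VB 0) (PRT 2)) (NP 1))\twake John up")
# indices == [0, 2, 1]; tokens == ["wake", "John", "up"]
```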
Please note that for the French model you need to change the model type from "bert" to "camembert":
```python
language_model = NERModel(
    "bert", "best_model", use_cuda=device  # for French, replace "bert" with "camembert"
)
```
To use the DistilBERT model, change the model type from "bert" to "distilbert":
```python
language_model = NERModel(
    "distilbert", "best_model", use_cuda=device
)
```
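Since the model-type string differs per checkpoint, a small hypothetical helper can keep the mapping from the list above in one place (the dictionary keys are made up for this sketch; "bert" covers the multilingual, English, German, and Russian models):

```python
# hypothetical mapping from the models listed above to the
# model-type string expected by NERModel's first argument
MODEL_TYPES = {
    "multilingual": "bert",
    "english": "bert",
    "german": "bert",
    "russian": "bert",
    "french": "camembert",
    "multilingual-distilbert": "distilbert",
}

def model_type_for(language: str) -> str:
    """Return the model-type string for a given checkpoint name."""
    return MODEL_TYPES[language.lower()]

# e.g. NERModel(model_type_for("French"), "best_model", use_cuda=device)
```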