Grammatical Error Correction for Romanian

This repository contains the code and data for Romanian grammatical error correction (GEC) on the RONACC corpus.

Download Data

Download the RONACC corpus: RONACC

Tokenized RONACC corpus: RONACC extra

Download pre-trained models

Download the language model: 30mil_wiki_lm
Download the synthetic corpus: 10m_synthetic
Download the fine-tuned Transformer-based model: transformer-base-fine-tune

Run Experiment

Install the Python dependencies:
pip3 install -r requirements.txt
If you want to use language model (LM) predictions, install the kenlm library: kenlm
To run decoding with an existing model, run:
python3 transformer.py --checkpoint=path_to_model_checkpoint --lm_path=path_to_lm --d_model=size_of_model --decode_mode=True
(For the fine-tuned model, the model size is 768.)
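The --lm_path option above points the decoder at a KenLM model. How transformer.py combines LM scores with the Transformer output is not described here, so the following is only a sketch of one common approach, assuming the LM is used to rerank candidate corrections by length-normalised fluency. The names load_kenlm_scorer and rerank are hypothetical and do not come from this repository; kenlm.Model and its score method are part of the kenlm Python bindings.

```python
def load_kenlm_scorer(lm_path):
    """Return a function mapping a sentence to its KenLM log10 probability."""
    # Imported lazily so the reranking logic below runs without kenlm installed.
    import kenlm
    model = kenlm.Model(lm_path)
    return lambda sentence: model.score(sentence, bos=True, eos=True)


def rerank(candidates, score_fn):
    """Sort candidate corrections from most to least fluent (hypothetical helper).

    Each score is divided by the token count plus one (for the end-of-sentence
    token) so that longer candidates are not penalised purely for their length.
    """
    def normalised(sentence):
        return score_fn(sentence) / (len(sentence.split()) + 1)

    return sorted(candidates, key=normalised, reverse=True)
```

With the language model downloaded, a call would look like rerank(candidates, load_kenlm_scorer(path_to_lm)), where candidates is a list of whitespace-tokenized sentences.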
To train a model, run:
python3 transformer.py --checkpoint=path_to_model_checkpoint --separate=False --d_model=size_of_model --use_txt=True --dataset_file=path_to_txt_file_wrong_gold --train_mode=True

If you want to run on a TPU, you can use the --use_tpu=True argument, but you first need to generate a TFRecords file.
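For scripted runs, the two invocations above can be assembled programmatically. This is a minimal sketch that uses only the flags documented in this section; build_command is a hypothetical helper, the paths in the usage note are placeholders, and it assumes that --lm_path may be omitted when no LM predictions are wanted.

```python
import subprocess


def build_command(mode, checkpoint, d_model, lm_path=None, dataset_file=None):
    """Return the argv list for transformer.py in "decode" or "train" mode."""
    cmd = [
        "python3",
        "transformer.py",
        f"--checkpoint={checkpoint}",
        f"--d_model={d_model}",
    ]
    if mode == "decode":
        # Assumption: the LM is optional, so the flag is added only when given.
        if lm_path is not None:
            cmd.append(f"--lm_path={lm_path}")
        cmd.append("--decode_mode=True")
    elif mode == "train":
        if dataset_file is None:
            raise ValueError("training needs a dataset_file")
        cmd += [
            "--separate=False",
            "--use_txt=True",
            f"--dataset_file={dataset_file}",
            "--train_mode=True",
        ]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return cmd


def run(cmd):
    """Run the command and raise if transformer.py exits with an error."""
    subprocess.run(cmd, check=True)
```

For example, run(build_command("decode", "path_to_model_checkpoint", 768, lm_path="path_to_lm")) reproduces the decoding command for the fine-tuned model.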

ERRANT

Install ERRANT

You can use ERRANT as usual; just pass the argument -lang ro if you want to use it for Romanian. See the ERRANT README for more details.
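ERRANT writes its annotations in the M2 format: an "S" line with the tokenized source sentence, followed by one "A" line per edit. A minimal sketch for reading such output is shown below, assuming the standard M2 layout. The function parse_m2 is a hypothetical helper, and the sample sentence and its error-type label are made up for illustration; the labels produced for Romanian may differ.

```python
# Illustrative sample only: the sentence and the "R:VERB" label are invented.
SAMPLE_M2 = (
    "S Eu merge la școală .\n"
    "A 1 2|||R:VERB|||merg|||REQUIRED|||-NONE-|||0\n"
)


def parse_m2(text):
    """Parse M2-formatted text into a list of (tokens, edits) pairs."""
    sentences = []
    # Sentence blocks are separated by blank lines.
    for block in text.strip().split("\n\n"):
        tokens, edits = [], []
        for line in block.splitlines():
            if line.startswith("S "):
                tokens = line[2:].split()
            elif line.startswith("A "):
                # Fields: span, error type, correction, then bookkeeping fields.
                fields = line[2:].split("|||")
                start, end = map(int, fields[0].split())
                edits.append({
                    "start": start,
                    "end": end,
                    "type": fields[1],
                    "correction": fields[2],
                })
        if tokens:
            sentences.append((tokens, edits))
    return sentences
```

Each edit's start and end are token offsets into the source sentence, so tokens[start:end] is the span that the correction replaces.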

Citing

@inproceedings{cotet2020neural,
  title={Neural grammatical error correction for {R}omanian},
  author={Cotet, Teodor-Mihai and Ruseti, Stefan and Dascalu, Mihai},
  booktitle={2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI)},
  pages={625--631},
  year={2020},
  organization={IEEE}
}

