The corpus consists of 8,000 automatically annotated sentences. It is organized as follows:
data/ensemble.txt: the collection of plain text sentences; one sentence per line.
data/ensemble.ann: the collection of annotations in brat standoff format (https://brat.nlplab.org/standoff.html).
data/ensemble.scr: the collection of agreement scores between the ensembled systems for the corresponding sentences; one score per line.
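The layout above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names `parse_ann_line` and `load_corpus` are hypothetical, each score is assumed to parse as a float, and only entity (`T`) and binary relation (`R`) lines of the brat standoff format are handled.

```python
from pathlib import Path


def parse_ann_line(line):
    """Parse one brat standoff line; return None for unsupported line types."""
    line = line.rstrip("\n")
    if not line:
        return None
    parts = line.split("\t")
    ident = parts[0]
    if ident.startswith("T") and len(parts) >= 3:
        # Entity line: "T1<TAB>Label start end[;start end ...]<TAB>text"
        label, _, offsets = parts[1].partition(" ")
        spans = [tuple(int(n) for n in frag.split()) for frag in offsets.split(";")]
        return {"id": ident, "kind": "entity", "label": label,
                "spans": spans, "text": parts[2]}
    if ident.startswith("R") and len(parts) >= 2:
        # Relation line: "R1<TAB>label Arg1:T1 Arg2:T2"
        label, *args = parts[1].split()
        roles = dict(arg.split(":", 1) for arg in args)
        return {"id": ident, "kind": "relation", "label": label, "args": roles}
    return None


def read_lines(path):
    """Read a UTF-8 file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_corpus(data_dir="data"):
    """Load sentences, agreement scores, and annotations from data_dir."""
    data_dir = Path(data_dir)
    sentences = read_lines(data_dir / "ensemble.txt")
    # Assumption: every non-empty line of the .scr file is a float.
    scores = [float(s) for s in read_lines(data_dir / "ensemble.scr") if s.strip()]
    if len(sentences) != len(scores):
        raise ValueError("expected one score per sentence")
    annotations = []
    for line in read_lines(data_dir / "ensemble.ann"):
        parsed = parse_ann_line(line)
        if parsed is not None:
            annotations.append(parsed)
    return sentences, scores, annotations
```

Calling `load_corpus("data")` from the repository root should return the sentences and scores as parallel lists. In brat standoff format the entity offsets are character positions within the whole text file, so they refer to positions in data/ensemble.txt rather than to positions within a single sentence.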
Ensembled Corpus from the eHealth-KD 2019 Challenge is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/jpconsuegra/ehealthkd-2019-ensembled-corpus.