The LAPIC2 evaluation base is available to researchers working on semantic video annotation and recommendation systems. The base comprises: a set of automated transcripts, generated using CMU Sphinx with a HUB4 acoustic model and a Gigaword-derived language model; a set of titles and relationships for the BBC programmes; and a download.sh script for downloading the programmes.
The base is freely licensed for academic use.
The documents were originally made available by the following work:
@article{raimond2012automated,
title={Automated interlinking of speech radio archives.},
author={Raimond, Yves and Lowis, Chris},
journal={LDOW},
volume={937},
year={2012}
}
However, if you use this benchmark in your research, please cite:
@mastersthesis{dias_lapic2_20017,
title={Analysis of Automatic Semantic Annotation Approaches for Noisy Texts and Their Impacts on Similarity between Videos},
author={Dias, Laura Lima},
year={2017},
school={Universidade Federal de Juiz de Fora}
}