Repo for tracking resources for the Mezzanine project
Origin | Manual selection |
Description: | 70-10-20 train/dev/test split for ROG-Artur |
Created: | 2024-10-01T |
By: | Darinka |
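
Because the split above was selected manually, it can be useful to check that it still comes out close to 70-10-20 and that no item landed in two subsets. The following is a minimal sketch only: the function names, the dictionary layout, and the item IDs are hypothetical and do not reflect how the split files in this repo are actually stored.

```python
def split_shares(split):
    """Return the fraction of all items that each subset holds."""
    total = sum(len(ids) for ids in split.values())
    return {name: len(ids) / total for name, ids in split.items()}


def overlapping_ids(split):
    """Return the set of IDs that appear in more than one subset."""
    first_seen = {}
    overlaps = set()
    for name, ids in split.items():
        for item_id in ids:
            if item_id in first_seen and first_seen[item_id] != name:
                overlaps.add(item_id)
            first_seen.setdefault(item_id, name)
    return overlaps


# Hypothetical item IDs, for illustration only.
example_split = {
    "train": ["r01", "r02", "r03", "r04", "r05", "r06", "r07"],
    "dev": ["r08"],
    "test": ["r09", "r10"],
}

shares = split_shares(example_split)
overlaps = overlapping_ids(example_split)
print(shares, overlaps)
```

If the real split is defined per recording or per utterance, the same check applies once the IDs are loaded into a mapping of this shape.
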
Origin | Iriss-disfl-conll-pros with manually annotated Dialog acts |
Description: | This folder comprises all current data sources: disfluencies, morpho-syntactic annotations, prosodic units, and dialog acts |
Created: | 2024-10-10T11:54:59 |
By: | Tjaša Vovko, Peter Rupnik |
Origin | Iriss-disfl-anno-phase5-fin-corr + UD-SST-split + iriss-prosodic-units |
Description: | Merged data from EXB, CONLL, textgrid (prosodic units) |
Created: | 2024-07-30T09:20:18 |
By: | Peter, Nikola |
Origin | commit 42738b98a6bcc38458a16a2dcf14d3c134e23539 on UD_Slovenian-SST |
Description: | SST conllu files, grouped on recording level. |
Created: | 2024-07-30T09:18:48 |
By: | Peter |
Origin | Iriss-disfl-anno-phase5-fin |
Description: | Annotated disfluencies in XML format, with manually corrected mismatches. |
Created: | 2024-07-11 |
By: | Peter, Darinka |
Origin | https://dsplab-redmine.ietk.um.si/documents/30 |
Description: | Here are the .textgrid files with the annotations of prosodic units. The prosodic units are annotated in tier 15, which is the only tier relevant to the unified output files for Iriss (Nikola's group and Darinka). Tier 15 contains the token IDs, grouped together to form prosodic units (PUs). Following our test run with Peter in February, I have tried to weed out all the token anomalies, but it is still likely that some tokens are missing or even duplicated. If you find such cases, please let me know so that I can fix them. Tiers 16 and 17 are relevant to the analyses yet to be performed in group A3.1 (Simon and Jerneja). Six of the 57 files, those with the ending "-no-arg", do not contain tiers 16 and 17. An e-mail with further steps in group A3.1 will follow next week (June 17-21). |
Created: | 2024-06-21 |
By: | Simona |
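
The description above warns that some token IDs in tier 15 may be missing or duplicated. Once the prosodic units have been read from a .textgrid file into lists of token IDs, a check along the following lines can flag both cases. This is a sketch under assumptions: the function name, the token ID format, and the idea that the expected IDs come from another source (for example the CONLL-U or EXB files) are all hypothetical, and reading the .textgrid file itself is left out.

```python
from collections import Counter


def check_prosodic_units(units, expected_ids):
    """Compare token IDs grouped into prosodic units with the expected IDs.

    units: list of prosodic units, each a list of token IDs.
    expected_ids: all token IDs that should occur exactly once.
    Returns (missing, duplicated) as two lists of token IDs.
    """
    counts = Counter(tok for unit in units for tok in unit)
    missing = [tok for tok in expected_ids if counts[tok] == 0]
    duplicated = sorted(tok for tok, n in counts.items() if n > 1)
    return missing, duplicated


# Hypothetical token IDs, for illustration only.
units = [["w1", "w2"], ["w3", "w3"], ["w5"]]
expected = ["w1", "w2", "w3", "w4", "w5"]

missing, duplicated = check_prosodic_units(units, expected)
print("missing:", missing)
print("duplicated:", duplicated)
```

Running the check per recording keeps the report short enough to send back for correction.
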
Origin | https://dsplab-redmine.ietk.um.si/attachments/download/276/Iriss-disfl-anno-phase5-fin.zip |
Description: | Annotated disfluencies in XML format. |
Created: | 2024-04-11 |
By: | Darinka |
Origin | https://github.com/UniversalDependencies/UD_Slovenian-SST.git |
Description: | SST, as updated and extended by Kaja et al. |
Created: | 2024-04-10 |
By: | Kaja et al. |
Origin | https://dsplab-redmine.ietk.um.si/attachments/download/211/iriss_with_w_and_pauses-corrected.zip |
Description: | IRISS in TEI and EXB with word alignment, manually corrected for overlapping speech. Also contains TRS with no word alignment. |
Created: | 2024-02-12 |
By: | Peter |
Origin | https://www.clarin.si/repository/xmlui/bitstream/handle/11356/1863/Gos.TEI.zip |
Description: | Original GOS TEI 2.1 |
Created: | 2023-08-28 |
By: | Darinka et al. |
Origin | https://nl.ijs.si/nikola/mezzanine/ |
Description: | Iriss dataset, as created initially. No word timeline. TEI only. |
Created: | 2023-09-06 |
By: | Peter |
Origin | https://nl.ijs.si/nikola/mezzanine/ |
Description: | SPOG dataset in TEI format |
Created: | 2023-09-06 |
By: | Peter |
Origin | https://nl.ijs.si/nikola/mezzanine/ |
Description: | SST, as generated initially. TEI only. |
Created: | 2023-09-06 |
By: | Peter |
Origin | |
Description: | |
Created: | |
By: |