clarinsi/mezzanine_resources


mezzanine_resources

Repo for tracking resources for the Mezzanine project

ROG-Artur-train-dev-test-split.csv

Origin Manual selection
Description: 70-10-20 train dev test split for ROG-Artur
Created: 2024-10-01T
By: Darinka
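
A quick sanity check of the split proportions could look like the sketch below. The column names `file` and `split`, as well as the sample rows, are assumptions made for illustration; the actual header of the CSV is not documented here and may differ.

```python
import csv
import io
from collections import Counter


def split_proportions(csv_text, split_column="split"):
    """Return the share of rows per split label, e.g. {'train': 0.7, ...}.

    `split_column` is a hypothetical column name, not taken from the real file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[split_column] for row in reader)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Made-up sample: 7 train, 1 dev, 2 test rows (the real file ids will differ).
sample = "file,split\n" + "\n".join(
    f"rec{i:02d},{'train' if i < 7 else 'dev' if i < 8 else 'test'}"
    for i in range(10)
)
proportions = split_proportions(sample)
```

For the real file, the text would be read from `ROG-Artur-train-dev-test-split.csv` and the result compared against the intended 70-10-20 shares.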

Iriss-DA-disfl-conll-pros

Origin Iriss-disfl-conll-pros with manually annotated dialog acts
Description: This folder comprises all current data sources: disfluencies, morpho-syntactic annotations, prosodic units, and dialog acts.
Created: 2024-10-10T11:54:59
By: Tjaša Vovko, Peter Rupnik

Iriss-disfl-conll-pros

Origin Iriss-disfl-anno-phase5-fin-corr + UD-SST-split + iriss-prosodic-units
Description: Merged data from EXB, CONLL, textgrid (prosodic units)
Created: 2024-07-30T09:20:18
By: Peter, Nikola

UD-SST-split

Origin commit 42738b98a6bcc38458a16a2dcf14d3c134e23539 on UD_Slovenian-SST
Description: SST conllu files, grouped on recording level.
Created: 2024-07-30T09:18:48
By: Peter
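
Grouping CoNLL-U sentences on recording level can be sketched as follows. This is not the script that produced the folder; it assumes, hypothetically, that each `sent_id` has the form `<recording><sep><sentence suffix>`, which should be checked against the actual SST files.

```python
from collections import defaultdict

SENT_ID_PREFIX = "# sent_id = "


def group_by_recording(conllu_text, sep="."):
    """Map a recording id to the list of its sentence blocks.

    Assumption: the recording id is everything before the last `sep`
    in the sentence's `sent_id` comment.
    """
    groups = defaultdict(list)
    # CoNLL-U sentences are separated by blank lines.
    for block in conllu_text.strip().split("\n\n"):
        sent_id = None
        for line in block.splitlines():
            if line.startswith(SENT_ID_PREFIX):
                sent_id = line[len(SENT_ID_PREFIX):].strip()
                break
        if sent_id is None:
            continue  # skip blocks without a sent_id comment
        recording = sent_id.rsplit(sep, 1)[0]
        groups[recording].append(block)
    return dict(groups)


# Made-up miniature input with two hypothetical recordings.
sample = (
    "# sent_id = recA.s1\n1\tja\tja\tPART\t_\t_\t0\troot\t_\t_\n\n"
    "# sent_id = recA.s2\n1\tne\tne\tPART\t_\t_\t0\troot\t_\t_\n\n"
    "# sent_id = recB.s1\n1\tok\tok\tINTJ\t_\t_\t0\troot\t_\t_\n"
)
grouped = group_by_recording(sample)
```

Each value in `grouped` could then be joined with blank lines and written to one file per recording.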

Iriss-disfl-anno-phase5-fin-corr

Origin Iriss-disfl-anno-phase5-fin
Description: Annotated disfluencies in XML format, with manually corrected mismatches.
Created: 2024-07-11
By: Peter, Darinka

iriss-prosodic-units

Origin https://dsplab-redmine.ietk.um.si/documents/30
Description: Here are the .textgrid files with the annotations of prosodic units. The prosodic units are annotated in the 15th tier, which is the only tier relevant for the unified output files for Iriss (Nikola's group and Darinka). Tier 15 contains the token IDs, grouped together to form prosodic units (PUs). Following our test run with Peter in February, I have tried to weed out all the token anomalies, but it is still likely that some tokens are missing or even duplicated. In such cases, let me know so that I can fix them. Tiers 16 and 17 are relevant to the analyses yet to be performed in group A3.1 (Simon and Jerneja). Six of the 57 files, those with the ending "-no-arg", do not contain tiers 16 and 17. An e-mail with further steps in group A3.1 will follow next week (June 17-21).
Created: 2024-06-21
By: Simona
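
The missing or duplicated tokens mentioned above could be detected with a check along these lines. It is a minimal sketch that assumes each tier-15 interval label is a whitespace-separated list of token IDs; the function name, the label format, and the sample IDs are all hypothetical, and reading the labels out of the .textgrid files is left out.

```python
from collections import Counter


def find_token_anomalies(pu_labels, expected_ids):
    """Compare token IDs found in prosodic-unit labels with the expected IDs.

    pu_labels: tier-15 interval labels (assumed whitespace-separated token IDs).
    expected_ids: token IDs in document order, e.g. taken from the CoNLL-U file.
    """
    seen = Counter(tok for label in pu_labels for tok in label.split())
    expected = set(expected_ids)
    return {
        # expected tokens that appear in no prosodic unit
        "missing": [t for t in expected_ids if t not in seen],
        # tokens that appear more than once across all prosodic units
        "duplicated": sorted(t for t, n in seen.items() if n > 1),
        # tokens in the tier that the reference does not know
        "unknown": sorted(t for t in seen if t not in expected),
    }


# Made-up example: t5 is missing and t4 occurs twice.
labels = ["t1 t2 t3", "t4 t4", "t6"]
reference = ["t1", "t2", "t3", "t4", "t5", "t6"]
anomalies = find_token_anomalies(labels, reference)
```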

Iriss-disfl-anno-phase5-fin

Origin https://dsplab-redmine.ietk.um.si/attachments/download/276/Iriss-disfl-anno-phase5-fin.zip
Description: Annotated disfluencies in XML format.
Created: 2024-04-11
By: Darinka

UD_Slovenian-SST

Origin https://github.com/UniversalDependencies/UD_Slovenian-SST.git
Description: SST, as updated and extended by Kaja et al.
Created: 2024-04-10
By: Kaja et al.

iriss_with_w_and_pauses

Origin https://dsplab-redmine.ietk.um.si/attachments/download/211/iriss_with_w_and_pauses-corrected.zip
Description: IRISS in TEI and EXB with word alignment, manually corrected for overlapping speech. Also contains TRS with no word alignment.
Created: 2024-02-12
By: Peter

GOS.TEI

Origin https://www.clarin.si/repository/xmlui/bitstream/handle/11356/1863/Gos.TEI.zip
Description: Original GOS TEI 2.1
Created: 2023-08-28
By: Darinka et al.

iriss

Origin https://nl.ijs.si/nikola/mezzanine/
Description: Iriss dataset, as created initially. No word timeline. TEI only.
Created: 2023-09-06
By: Peter

SPOG

Origin https://nl.ijs.si/nikola/mezzanine/
Description: SPOG dataset in TEI format
Created: 2023-09-06
By: Peter

SST

Origin https://nl.ijs.si/nikola/mezzanine/
Description: SST, as generated initially. TEI only.
Created: 2023-09-06
By: Peter

Placeholder

Origin
Description:
Created:
By:
