
Action items for benchmarking UCCA #1

Open
omriabnd opened this issue May 15, 2020 · 2 comments

Comments

@omriabnd
Member

omriabnd commented May 15, 2020

  • Build a webpage similar to https://nlpprogress.com/english/semantic_parsing.html#ucca-parsing containing: (1) a detailed description of the official evaluation protocol (per corpus?), including evaluation scripts and their versions, normalization, dataset versions, etc.; (2) a leaderboard of parser outputs, sorted by the official UCCA score, with an additional column where they are evaluated on the MRP metric; (3) a bottom section with links to other (unofficial or legacy) experimental setups and their corresponding leaderboards.
  • Post the description from (1) as a file in the ucca code repo. Evaluation documentation huji-nlp/ucca#92
  • Improve the UCCA score to handle unary expansions / multiple categories over the same edge more sensibly. This will become the new official score. Ask participants of the SemEval and CoNLL shared tasks whether they would like to re-evaluate their systems and post their scores. Evaluation treats multiple categories too leniently huji-nlp/ucca#91
  • Run the new script on the UCCA parses submitted to MRP 2019 and 2020, after converting them from JSON to XML.
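As a minimal sketch of the leniency problem referenced in huji-nlp/ucca#91 (not the official evaluation code), the snippet below compares a "lenient" labeled F1, where an edge counts as matched if any of its categories overlaps with the reference, against a "strict" variant requiring the full category set to match. All names and the toy edge representation (span tuples mapped to category sets) are hypothetical, for illustration only:

```python
# Hypothetical sketch: labeled F1 over edges whose labels are category sets.
# "Lenient" matching credits an edge on any category overlap; "strict"
# matching requires the full category set to agree (cf. huji-nlp/ucca#91).

def f1(matched, guessed, ref):
    precision = matched / guessed if guessed else 0.0
    recall = matched / ref if ref else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def score(guessed, ref, strict=False):
    """guessed/ref: dict mapping an edge span to a frozenset of categories."""
    matched = 0
    for span, cats in guessed.items():
        ref_cats = ref.get(span)
        if ref_cats is None:
            continue  # no reference edge over this span
        if strict:
            matched += cats == ref_cats       # full category set must match
        else:
            matched += bool(cats & ref_cats)  # any shared category suffices
    return f1(matched, len(guessed), len(ref))

# Toy example: the first reference edge carries two categories ("A" and "D"),
# but the parser recovered only one of them.
ref = {(0, 3): frozenset({"A", "D"}), (3, 5): frozenset({"P"})}
guessed = {(0, 3): frozenset({"A"}), (3, 5): frozenset({"P"})}
print(score(guessed, ref))               # lenient: 1.0 despite the missed category
print(score(guessed, ref, strict=True))  # strict: 0.5
```

Under lenient matching the partially correct edge is indistinguishable from a fully correct one, which is exactly the over-crediting the improved official score would need to avoid (e.g. by scoring per category rather than per edge).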
@omriabnd omriabnd assigned omriabnd and unassigned omriabnd May 18, 2020
@danielhers
Member

A leaderboard will require running experiments with the leading parsers on the latest data using native UCCA evaluation (not MRP). Maybe @OfirArviv could help.

@danielhers
Member

Also add the datasets to https://datasets.quantumstat.com/
