a relationship extraction issue #220

karlhugle · 2019-02-01T14:02:57Z

Regarding this list: https://github.com/sebastianruder/NLP-progress/blob/master/english/relationship_extraction.md

I would like to point out a data issue.

A new distantly supervised relationship extraction model trained on the same training dataset (522611) can be compared directly with the results of the models (PCNN+ATT, PCNN+ONE, etc.) reported in Lin's paper (Lin et al., 2016).
(The cleaned dataset was updated by Lin and can be downloaded from https://github.com/thunlp/NRE)

The problem is that some new papers (e.g. two at EMNLP 2018 and one at AAAI 2019) used the unprocessed data (570088), which contains duplicated instances in the test set. The unclean data gives higher but unreliable results.

This issue has already been discussed in:
thunlp/NRE#16
thunlp/OpenNRE#27

The unclean data was tested, and it does affect the results.
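For anyone who wants to check a copy of the data for this kind of duplication, here is a minimal sketch. The instance layout (`head`, `tail`, `sentence` fields) and the toy data are assumptions for illustration only and are not the actual file format used in thunlp/NRE.

```python
from collections import Counter


def instance_key(inst):
    """Build a hashable key for an instance.

    The field names are hypothetical. Whitespace and case are normalized
    so that trivially different copies of a sentence compare equal.
    """
    sentence = " ".join(inst["sentence"].lower().split())
    return (inst["head"], inst["tail"], sentence)


def find_duplicates(instances):
    """Return a dict mapping each repeated key to its occurrence count."""
    counts = Counter(instance_key(inst) for inst in instances)
    return {key: n for key, n in counts.items() if n > 1}


def deduplicate(instances):
    """Keep the first occurrence of each instance, preserving order."""
    seen = set()
    unique = []
    for inst in instances:
        key = instance_key(inst)
        if key not in seen:
            seen.add(key)
            unique.append(inst)
    return unique


# Toy data (made up): the first two entries differ only in case and spacing.
test_set = [
    {"head": "A", "tail": "B", "sentence": "A was born in B ."},
    {"head": "A", "tail": "B", "sentence": "a was  born in b ."},
    {"head": "C", "tail": "D", "sentence": "C works for D ."},
]

duplicates = find_duplicates(test_set)
cleaned = deduplicate(test_set)
print(len(test_set), len(cleaned), len(duplicates))
```

Comparing the instance count before and after `deduplicate` gives a quick indication of whether a given copy of the data is the cleaned or the unprocessed version.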

sebastianruder · 2019-02-05T13:24:54Z

Thanks for this note! Could you add a note to the relevant section and indicate with a symbol the methods that use a different setup?

weilonghu · 2019-03-26T13:51:22Z

I also noticed this problem and emailed some authors, but they have different opinions on this issue.
