a relationship extraction issue #220

Open
karlhugle opened this issue Feb 1, 2019 · 2 comments

Comments

@karlhugle

karlhugle commented Feb 1, 2019

Regarding this list: https://github.com/sebastianruder/NLP-progress/blob/master/english/relationship_extraction.md

I would like to point out a data issue.

A new distantly supervised relationship extraction model that uses the same training dataset (522611) can be compared directly with the results of the models (PCNN+ATT, PCNN+ONE, etc.) reported in Lin's paper (Lin et al., 2016).
(The cleaned dataset was updated by Lin and can be downloaded from https://github.com/thunlp/NRE)

The problem is that some new papers (e.g. two at EMNLP 2018 and one at AAAI 2019) used the unprocessed data (570088), which contains duplicated instances in the test set. The unclean data gives higher but unreliable results.

These issues have already been discussed in:
thunlp/NRE#16
thunlp/OpenNRE#27

The unclean data was tested and does affect the results.
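
One quick way to check whether a given copy of the data has this problem is to count exact duplicate instances in the test split. The following is only a minimal sketch, assuming each instance can be represented as a (head, tail, relation, sentence) record. The `Instance` type, the helper names, and the toy data are hypothetical, and this is not the cleaning procedure used by Lin et al.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Instance(NamedTuple):
    # Hypothetical record layout; the real dataset files may use other fields.
    head: str
    tail: str
    relation: str
    sentence: str


def count_duplicates(instances: Iterable[Instance]) -> int:
    """Return how many instances are exact repeats of an earlier one."""
    counts = Counter(instances)
    return sum(n - 1 for n in counts.values())


def deduplicate(instances: Iterable[Instance]) -> List[Instance]:
    """Keep the first occurrence of each instance, preserving order."""
    seen = set()
    unique = []
    for inst in instances:
        if inst not in seen:
            seen.add(inst)
            unique.append(inst)
    return unique


# Toy data, invented purely for illustration.
test_set = [
    Instance("A", "B", "r1", "A is related to B ."),
    Instance("A", "B", "r1", "A is related to B ."),
    Instance("C", "D", "NA", "C met D ."),
]
print(count_duplicates(test_set))
print(len(deduplicate(test_set)))
```

If the duplicate count is nonzero, results on that copy are probably not comparable with numbers reported on the cleaned data.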

@sebastianruder
Owner

sebastianruder commented Feb 5, 2019

Thanks for this note! Could you add a note to the relevant section and indicate with a symbol the methods that use a different setup?

@weilonghu

I also noticed this problem and emailed some authors, but they have different opinions on this issue.
