Better output for TICCL-unk acronym list #11

Open
kosloot opened this issue Feb 21, 2018 · 0 comments
kosloot commented Feb 21, 2018

Two different schemes were suggested:

  • acronym ~ acronym frequency in unigram list ~ number of hyphenated compounds the acronym appears in ~ sum of frequencies of the hyphenated compounds the acronym appears in
  • how many compounds begin with this acronym? And then, secondly: how many tokens do these compounds yield in total? These counts would then simply be listed after the unigram frequency of the acronym without a hyphen.
    @martinreynaert please clarify a bit...
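
To make the difference between the two suggestions concrete, here is a minimal sketch in Python. It is not TICCL-unk code: the function names, the `prefix_only` flag, the exact-match (case-sensitive) comparison, and the use of `~` as the output separator (borrowed from the notation in the first bullet) are all assumptions for illustration. The frequency list is made-up example data.

```python
def acronym_stats(acronym, unigram_freqs, prefix_only=False):
    """Return (acronym frequency, compound count, summed compound frequency).

    prefix_only=False: count hyphenated compounds containing the acronym
                       as any hyphen-separated part (first scheme).
    prefix_only=True:  count only compounds that begin with the acronym
                       (second scheme).
    """
    n_compounds = 0
    total_freq = 0
    for word, freq in unigram_freqs.items():
        parts = word.split("-")
        if len(parts) < 2:
            continue  # not a hyphenated compound
        if prefix_only:
            hit = parts[0] == acronym
        else:
            hit = acronym in parts
        if hit:
            n_compounds += 1
            total_freq += freq
    return unigram_freqs.get(acronym, 0), n_compounds, total_freq


def format_line(acronym, stats, sep="~"):
    """Join the acronym and its three counts into one output line."""
    return sep.join([acronym, *map(str, stats)])


# Hypothetical unigram frequency list, for illustration only.
freqs = {
    "NATO": 120,
    "NATO-top": 7,
    "NATO-landen": 3,
    "anti-NATO": 2,
    "top": 50,
}

print(format_line("NATO", acronym_stats("NATO", freqs)))
print(format_line("NATO", acronym_stats("NATO", freqs, prefix_only=True)))
```

With this example data, the first scheme counts all three compounds, while the second skips `anti-NATO` because the acronym is not its first part. Whether that distinction is intended is exactly what needs clarifying.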
2 participants