ES-CT: translations and data update #777
Data-main
sync with ParlaMint data branch
@rjzevallos, should we use listPerson, which has been pushed with a pull request for the whole corpus in the release?
@matyaskopp, yes, you can use this new listPerson.
@matyaskopp, I see that we have a lot of form and syntax warnings. How can we fix them?
Well, maybe @matyaskopp has a better idea, but, in short, you should get a better parser, because the one you are using produces a great many invalid UD parses.
The problem is that ES-CT feeds UDPipe input that is already tokenized and already split into sentences, and UDPipe handles this kind of input poorly. This is probably the reason for the multiple roots.
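To see which sentences are affected, a minimal sketch along these lines could count the root attachments per sentence in CoNLL-U output. It is not the ParlaMint or UD validation script; the function name `sentences_with_bad_roots` and the sample sentences are made up for illustration, and it assumes standard 10-column CoNLL-U with `# sent_id = ...` comments.

```python
import re


def sentences_with_bad_roots(conllu_text):
    """Return {sent_id: root_count} for sentences whose root count is not 1.

    Hypothetical helper for illustration only. A token counts as a root
    when its HEAD column (7th field) is "0".
    """
    bad = {}
    blocks = re.split(r"\n\s*\n", conllu_text.strip())
    for position, block in enumerate(blocks, start=1):
        if not block.strip():
            continue
        sent_id = f"#{position}"  # fallback label if no sent_id comment is present
        roots = 0
        for line in block.splitlines():
            if line.startswith("#"):
                match = re.match(r"#\s*sent_id\s*=\s*(\S+)", line)
                if match:
                    sent_id = match.group(1)
                continue
            cols = line.split("\t")
            if len(cols) != 10:
                continue
            # Skip multiword token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
            if not cols[0].isdigit():
                continue
            if cols[6] == "0":
                roots += 1
        if roots != 1:
            bad[sent_id] = roots
    return bad


def row(*cols):
    # Join fields with tabs so the sample does not depend on literal tab characters.
    return "\t".join(cols)


# Invented sample: s1 has one root, s2 has two.
sample = "\n".join([
    "# sent_id = s1",
    row("1", "Bon", "bo", "ADJ", "_", "_", "2", "amod", "_", "_"),
    row("2", "dia", "dia", "NOUN", "_", "_", "0", "root", "_", "_"),
    "",
    "# sent_id = s2",
    row("1", "Gràcies", "gràcia", "NOUN", "_", "_", "0", "root", "_", "_"),
    row("2", "president", "president", "NOUN", "_", "_", "0", "root", "_", "_"),
    "",
])

print(sentences_with_bad_roots(sample))  # {'s2': 2}
```

Running something like this over the parsed files gives a list of sentence IDs to inspect, which helps check whether the multiple roots cluster around particular tokenization or sentence-splitting patterns.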