What to do with documents without a description #17

Open
siccovansas opened this issue Jan 5, 2018 · 0 comments

Some documents don't have a description, e.g.: https://api.poliflw.nl/v0/combined_index/1a466b696c8f6861498faff10897cb9c7c011b7f

Why is this? If you go to the source of that document, you can see that text is actually available. Did the parsing fail? Was there no text when we scraped it?

What should we do when we end up with no text for a document? Keep it for completeness' sake (users can then check the source themselves), or not save the document at all?
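To make the two options concrete, here is a minimal sketch of how the choice could be expressed at indexing time. It assumes the scraped document is a plain dict with a `description` key; the function names, the `keep_empty` switch, and the `missing_description` flag are all hypothetical and not taken from the existing code.

```python
from typing import Optional


def has_description(doc: dict) -> bool:
    """Return True when the document carries non-whitespace description text."""
    text = doc.get("description")
    return isinstance(text, str) and bool(text.strip())


def prepare_for_index(doc: dict, keep_empty: bool = True) -> Optional[dict]:
    """Decide what to do with a document before saving it.

    keep_empty=True  -> keep the document, but flag it so it can be found
                        and re-scraped later (hypothetical flag name).
    keep_empty=False -> drop the document by returning None.
    """
    if has_description(doc):
        return doc
    if keep_empty:
        flagged = dict(doc)  # copy so the caller's dict is not mutated
        flagged["missing_description"] = True
        return flagged
    return None
```

Flagging instead of silently keeping or dropping would also make it easier to count how many documents are affected, which might help answer whether the parsing failed or the text was missing at scrape time.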
