Search for "not (word)" by itself returns no results #938

Open
cellio opened this issue Jan 9, 2023 · 9 comments
Labels
area: backend (Changes to server-side code); complexity: unassessed (Needs further developer investigation before complexity/feasibility can be determined.); priority: medium; type: bug (Something isn't working)

Comments

@cellio
Member

cellio commented Jan 9, 2023

https://meta.codidact.com/posts/287031

If you use the search "-" operator with a word, the search returns no results even when it shouldn't. I just reproduced the case described in the report: on Code Golf, in the sandbox, search for -finalized.

I commented at the time of the report that I thought some other changes in progress would fix that too, but either they didn't land or they landed without fixing it. I no longer remember what I was thinking of (maybe #834?).

@cellio cellio added area: backend Changes to server-side code type: bug Something isn't working priority: medium complexity: unassessed Needs further developer investigation before complexity/feasibility can be determined. labels Jan 9, 2023
@trichoplax
Contributor

I've just checked the other usage of a hyphen (-) in search. The search help mentions that it can be used in the form -tag:tag-name to exclude that tag.

I've tested and confirmed that tags are excluded correctly, so only the exclusion of words is broken.

@trichoplax
Contributor

When I originally raised this, it was only a problem for hyphen-only searches like "-finalized". Using a hyphen worked fine when combined with another search term. This is still the case, so this problem may be slightly smaller than it would otherwise sound.

This suggests that it might be the lack of a positive term, rather than the presence of a negative term, that stops the search returning any results.

@trichoplax
Contributor

In testing in preparation for the previous comments, I've noticed a different but closely related problem with hyphen search terms. Not sure whether to raise it separately or just mention it here:

It appears that not all words are indexed, so a search for a non-indexed word such as "the" will give no results. This is a little confusing since there are plenty of posts containing "the", but at least the lack of results is very visible to the user.

However, a search that excludes a non-indexed word, such as "integer -the", does not make it obvious to the user that "the" has not been excluded. At first glance the search appears to have worked, but it has only half worked: the results contain "integer", but they do not exclude "the".

@cellio cellio changed the title Search for "not (word)" returns no results Search for "not (word)" by itself returns no results Jan 9, 2023
@cellio
Member Author

cellio commented Jan 9, 2023

Thanks for the further investigation. I wonder what search should do (and does elsewhere) if you search only for stop words like "the". Hmm. Ideally, you'd get told something like "no valid search terms", and we probably need to add something about stop words to the search help.
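A minimal sketch of how that kind of feedback could be produced, assuming the query has already been split into positive and excluded terms. The function name `check_terms`, the second warning's wording, and the stop-word list are all hypothetical; the real list depends on how the search backend is configured.

```python
# Illustrative stop-word list only (an assumption for this sketch);
# the real list depends on the search backend's configuration.
STOP_WORDS = {"a", "an", "in", "of", "the", "to"}


def check_terms(positive, negative, stop_words=STOP_WORDS):
    """Return user-facing warnings for an already-parsed search query."""
    # Positive terms that the index can actually match.
    usable = [t for t in positive if t.lower() not in stop_words]
    # Excluded terms that are stop words, so the exclusion has no effect.
    ignored = [t for t in negative if t.lower() in stop_words]

    warnings = []
    if not usable:
        warnings.append("no valid search terms")
    for term in ignored:
        warnings.append(f'"{term}" is too common to exclude and was ignored')
    return warnings
```

With this sketch, a search for just "the" would report "no valid search terms", and "integer -the" would return results for "integer" while warning that the exclusion was ignored.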

@trichoplax
Contributor

If Codidact is going to use Elasticsearch at some point, stop words may cease to be a problem. I have no experience with Elasticsearch, and this article is almost a decade old, but it suggests that Elasticsearch removes the need to manually identify stop words or to exclude them from the index: Stop stopping stop words

@cellio
Member Author

cellio commented Jan 9, 2023

Oh, that'll simplify things. Using Elasticsearch is in progress on #834 (draft, still work to do).

@ArtOfCode-
Member

I believe this behaviour is correct. Without actually looking at the MySQL documentation, I believe fulltext indexes need a positive search term to search for, and can't just search on a negative.

@cellio
Member Author

cellio commented Jan 17, 2023

If this is correct behavior, can we detect the situation and show something like "search requires at least one positive term" on the results page?

@trichoplax
Contributor

If there's a way to reliably detect this on the client side, we could even avoid sending a request to the server.

I'm not sure there's a simple way to detect this that doesn't get confused by spaces inside quotes, though, such as when searching for -term -"two terms".
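One possible approach, sketched here in Python for illustration, is to tokenize with a regular expression that treats a quoted phrase as a single term. The grammar below is an assumption, not Codidact's actual search parser, and the names `split_terms` and `needs_positive_term` are hypothetical. It does not special-case operators such as tag:, and an unbalanced quote is simply treated as part of a word.

```python
import re

# One token is an optional leading "-", followed by either a "quoted phrase"
# or a run of non-space characters. Hypothetical grammar for this sketch.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def split_terms(query):
    """Split a search string into (positive, negative) term lists."""
    positive, negative = [], []
    for minus, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if not term:
            continue  # skip empty quoted phrases such as ""
        (negative if minus else positive).append(term)
    return positive, negative


def needs_positive_term(query):
    """True when the query has exclusions but nothing positive to search for."""
    positive, negative = split_terms(query)
    return bool(negative) and not positive
```

Because the quoted phrase is matched before the whitespace split, -term -"two terms" comes out as two excluded terms and no positive ones, so the check flags it. The same logic would translate to client-side JavaScript if the request is to be skipped entirely.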


3 participants