-
-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RankFind does not use Levenshtein distance to build results #10
Comments
I actually played around with something like that for Say you search for "aaa" in "aaabbbbbbbbbbbbbbbbbbb" and your threshold is 15 it wouldn't match. Perhaps that's exactly what you wanted, but I think it might have to be a bit smarter and take length into consideration as well or something... This is basically when I went with the current solution of Another approach would be to create a |
Then you'd have something like this const threshold = 15
func predicate(s, t string) bool {
distance := LevenshteinDistance(s, t) // 19
return distance < threshold
}
fuzzy.RankFindFunc("aaa", "aaabbbbbbbbbbbbbbbbbbb", predicate) // false |
Interesting, thank you for the explanations! On Wed, Sep 9, 2015, 6:08 AM Peter Renström [email protected]
|
Yeah, so I'm not sure the distance alone is enough to determine a match. Having "a" match "ab" but not "abc" is confusing, it should give at least as many hits as a plain old substring search. Did you have a good use-case for this? Perhaps I'm just missing something here 😄 |
Yeah, so I use it to create a search engine for my GoCourseSort project, and the reason I use Levenshtein instead of simple match, is so a word in the search query matches a keyword from the database of strings. I think it is vaguely similar to the way BigTables work (don't quote me). Imagine book titles: "Gone with the Wind", "Gone Girl", and "The Girls", we create an index of key words with references: gonegirl := &Book{
title: "Gone Girl",
}
gonewiththewind := &Book{
title: "Gone with the Wind",
}
thegirls := &Book{
title: "The Girls",
}
indexByKeywords := map[string][]*Book {
"girls": { thegirls },
"girl": { gonegirl },
"gone": { gonegirl, gonewiththewind },
"with": { gonewiththewind },
"the": { gonewiththewind, thegirls },
"wind": { gonewiththewind },
} Now you enter the search term "the girls" Then I'm Levenshtein ranking "the" against the keys of This is important because in my search engine, I don't want order of words to be important in any way, but I need "CS" to match "CSC" and "131" to match "121", because I'm using it for a college course catalog. |
You might want to look at my implementation and the fuzzy find |
RankFind
still uses plainFind
to determine initial results, then measures Levenshtein distance afterwards. Maybe we can have anotherRankFindLevenshtein
that uses Levenshtein distance to determine results.The text was updated successfully, but these errors were encountered: