scroll-seq requires help when working with a "scan" search type. #72
Thanks for the heads up, I was dealing with this earlier today.
Ran into this today. Do we actually need the seq here? https://github.com/clojurewerkz/elastisch/blob/master/src/clojurewerkz/elastisch/rest/document.clj#L221
@emidln if there are no hits, should we continue scrolling?
Yes. ES's scan/scroll API is a special case: the first call returns no hits. We could probably handle it with a multi-arity function. The lower arity calls the higher arity with true for the first call, the lazy-seq then calls the function with false, and the seq check is modified to ignore an empty result only when that flag is true.
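A minimal sketch of the flag idea, written in Python against an in-memory stand-in so it can run without a cluster. The names `fake_scan`, `scroll_seq` and `first_call`, and the simplified response shape, are assumptions for illustration; this is not elastisch code.

```python
def fake_scan(batches):
    """Hypothetical scan backend: the initial response carries no hits."""
    pending = [list(b) for b in batches]
    total = sum(len(b) for b in pending)

    def scroll(scroll_id):
        # Each scroll call hands back the next batch, or no hits when exhausted.
        hits = pending.pop(0) if pending else []
        return {"_scroll_id": scroll_id, "hits": {"total": total, "hits": hits}}

    initial = {"_scroll_id": "s1", "hits": {"total": total, "hits": []}}
    return initial, scroll


def scroll_seq(response, scroll, first_call=True):
    """Lazily yield hits; an empty response ends the sequence unless it is the first."""
    while True:
        hits = response["hits"]["hits"]
        if not hits and not first_call:
            return
        yield from hits
        response = scroll(response["_scroll_id"])
        first_call = False


initial, scroll = fake_scan([["doc-1", "doc-2"], ["doc-3"]])
docs = list(scroll_seq(initial, scroll))
```

The default `first_call=True` plays the role of the lower arity. One consequence of this design is that a non-scan search with no matches costs one extra scroll request before the sequence ends.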
@emidln feel free to submit a pull request :)
I haven't had time to put together a full pull request, but for anyone who is affected, this gist provides a solution and a higher-level scan/scroll interface: https://gist.github.com/54e3e66715f38befa6da
This takes care of the inconsistency between scroll-seq and ES's scan-and-scroll API, referenced in issue clojurewerkz#72. The new function makes an initial request to the search API with search_type "scan" to obtain the first scroll id, makes a second request to retrieve the first batch of results, and then hands off to scroll-seq to continue retrieving batches lazily. Retrieving one batch up front is necessary because the first request is a special case, as described in issue clojurewerkz#72: it always returns no hits.
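The three steps described above can be sketched as follows. This is a Python model with a hypothetical in-memory backend, not the function from the pull request; `scan_seq`, `search`, `scroll` and the response shape are assumed names used only to show the hand-off.

```python
def scroll_seq(response, scroll):
    """Unmodified behavior: stop at the first response with no hits."""
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = scroll(response["_scroll_id"])


def scan_seq(search, scroll, query):
    """Start a scan, fetch the first batch eagerly, then hand off to scroll_seq."""
    initial = search(query, search_type="scan")   # no hits, only a scroll id
    first_batch = scroll(initial["_scroll_id"])   # first real batch of results
    return scroll_seq(first_batch, scroll)        # remaining batches stay lazy


# Hypothetical backend: two batches waiting behind an empty initial response.
pending = [["doc-1", "doc-2"], ["doc-3"]]


def search(query, search_type):
    return {"_scroll_id": "s1", "hits": {"total": 3, "hits": []}}


def scroll(scroll_id):
    hits = pending.pop(0) if pending else []
    return {"_scroll_id": scroll_id, "hits": {"total": 3, "hits": hits}}


docs = list(scan_seq(search, scroll, {"match_all": {}}))
```

Only the first two requests happen eagerly. If the scan matches nothing, the first scroll returns no hits and the resulting sequence is empty, which is the correct answer in that case.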
I'm using Elasticsearch 0.90.7, so this may be a temporary problem.
A scroll search with a scan search type does not return results with the initial call. A call to the _search/scroll endpoint needs to be made to retrieve the first set of results. This isn't consistent with the behavior of the other search types.
The otherwise very useful scroll-seq function doesn't know about this behavioral inconsistency. When called naively on the response of the initial search call, it assumes that the lack of hits in that response means there are no matches, so scroll-seq returns an empty list.
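To make the failure concrete, here is a small Python model of a stop-on-empty sequence run against a hypothetical scan-style backend. `naive_scroll_seq`, `scroll` and the response shape are illustrative assumptions, not the elastisch implementation.

```python
def naive_scroll_seq(response, scroll):
    """Treat the first response without hits as the end of the results."""
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = scroll(response["_scroll_id"])


# Hypothetical scan backend: three matching documents exist,
# but the initial response carries only a scroll id.
batches = [["doc-1", "doc-2"], ["doc-3"]]
initial = {"_scroll_id": "s1", "hits": {"total": 3, "hits": []}}


def scroll(scroll_id):
    hits = batches.pop(0) if batches else []
    return {"_scroll_id": scroll_id, "hits": {"total": 3, "hits": hits}}


result = list(naive_scroll_seq(initial, scroll))
print(result)  # [] even though the total reports 3 matches
```

The scroll function is never called here, so both batches are left unread.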
There is a workaround: make a single scroll call and pass that result set to scroll-seq. It would be neat, though, if scroll-seq could hide the need to do that for scan search types.