scroll-seq requires help when working with a "scan" search type. #72

Open
tedgin opened this issue Feb 20, 2014 · 6 comments

tedgin commented Feb 20, 2014

I'm using elasticsearch 0.90.7, so this may be a temporary problem.

A scroll search with a scan search type does not return results with the initial call. A call to the _search/scroll endpoint needs to be made to retrieve the first set of results. This isn't consistent with the behavior of the other search types.

The otherwise very useful scroll-seq function doesn't know about this behavioral inconsistency. When called naively, as in the following, it assumes that the lack of hits in the initial search response means there are no matches, and it returns an empty list.

(scroll-seq (search "index" "mapping" :query (match-all) :search_type "scan"))

There is a workaround: make a single scroll call and pass that result set to scroll-seq. It would be neat, though, if scroll-seq could hide the need to do that for scan search types.
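The workaround can be illustrated without a running cluster. The following is a minimal sketch, not Elastisch code: `initial-resp`, `scroll-pages`, `fake-scroll`, and `naive-scroll-seq` are hypothetical stand-ins that only model the behavior described above (an initial scan response with a scroll id but no hits, and a seq function that stops as soon as it sees an empty batch).

```clojure
;; Hypothetical stand-ins that simulate a scan/scroll exchange in memory.
;; None of these names belong to Elastisch or Elasticsearch.

(def initial-resp
  "Models the response of the initial scan search: a scroll id, but no hits."
  {:_scroll_id "s0" :hits {:total 3 :hits []}})

(def scroll-pages
  "Maps a scroll id to the response a follow-up scroll call would return."
  {"s0" {:_scroll_id "s1" :hits {:hits [{:_id 1} {:_id 2}]}}
   "s1" {:_scroll_id "s2" :hits {:hits [{:_id 3}]}}
   "s2" {:_scroll_id "s3" :hits {:hits []}}})

(defn fake-scroll
  "Stands in for a single call to the _search/scroll endpoint."
  [scroll-id]
  (get scroll-pages scroll-id))

(defn naive-scroll-seq
  "Models the behavior described in this issue: an empty batch of hits
  is taken to mean that there is nothing left to fetch."
  [resp]
  (let [hits (get-in resp [:hits :hits])]
    (when (seq hits)
      (lazy-cat hits (naive-scroll-seq (fake-scroll (:_scroll_id resp)))))))

;; Naive call: the initial scan response has no hits, so nothing comes back.
(def naive-result (naive-scroll-seq initial-resp))

;; Workaround: make one scroll call first, then hand that response over.
(def workaround-result
  (naive-scroll-seq (fake-scroll (:_scroll_id initial-resp))))
```

In this simulation `naive-result` is empty, while `workaround-result` yields all three documents. With the real library the same shape would apply, with the library's own search and scroll calls in place of the stand-ins.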

Contributor

lorthos commented Mar 3, 2014

Thanks for the heads up, I was dealing with this earlier today.


emidln commented Oct 1, 2014

Ran into this today. Do we actually need the seq here? https://github.com/clojurewerkz/elastisch/blob/master/src/clojurewerkz/elastisch/rest/document.clj#L221

@michaelklishin
Member

@emidln if there are no hits, should we continue scrolling?


emidln commented Oct 1, 2014

Yes. This is due to ES's scan/scroll API being a special case: the first call returns no hits. We could probably use a multi-arity function in which the lower arity calls the higher arity with true for the first call, the lazy-seq then calls the function with false, and the seq check is modified accordingly.
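One way to read the multi-arity idea is sketched below. This is an illustration under stated assumptions, not the change that was eventually made to the library: `make-fake-scroll` and `scan-scroll-seq` are hypothetical names, and the scroll call is passed in as a plain function so the example runs without a cluster.

```clojure
;; Hypothetical sketch of the multi-arity approach. The names here are
;; illustrative and are not part of Elastisch.

(defn make-fake-scroll
  "Returns a function that simulates scroll calls over the given batches.
  Scroll ids are stringified batch indices (an assumption of this sketch)."
  [batches]
  (fn [scroll-id]
    (let [i (Long/parseLong scroll-id)]
      {:_scroll_id (str (inc i))
       :hits       {:hits (get batches i [])}})))

(defn scan-scroll-seq
  "Lazily walks scroll responses. The two-argument arity marks the call
  as the first one, so an empty initial batch does not end the sequence."
  ([scroll-fn resp]
   (scan-scroll-seq scroll-fn resp true))
  ([scroll-fn resp first-call?]
   (let [hits (get-in resp [:hits :hits])]
     ;; Modified seq check: continue if this is the first call,
     ;; or if the current batch actually contains hits.
     (when (or first-call? (seq hits))
       (lazy-cat hits
                 (scan-scroll-seq scroll-fn
                                  (scroll-fn (:_scroll_id resp))
                                  false))))))

;; Usage: the initial scan response carries a scroll id but no hits.
(def all-hits
  (scan-scroll-seq (make-fake-scroll [[1 2] [3]])
                   {:_scroll_id "0" :hits {:hits []}}))
```

Because `lazy-cat` defers each of its arguments, the next scroll call is only made once the current batch has been consumed. One trade-off of this approach: a search that genuinely has no matches costs one extra scroll call before the sequence ends.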

@michaelklishin
Member

@emidln feel free to submit a pull request :)


emidln commented Mar 25, 2015

I haven't had time to build up a full pull request, but for anyone who is affected, this gist provides a solution and a higher-level scan/scroll interface: https://gist.github.com/54e3e66715f38befa6da

loganmhb pushed a commit to loganmhb/elastisch that referenced this issue Jul 3, 2015
This takes care of the inconsistency between scroll-seq and ES's
scan-and-scroll API, referenced in issue clojurewerkz#72. The new function makes an
initial request of search_type "scan" to the search api in order to
obtain the first scroll id, makes a second request to retrieve the first
batch of results, and then hands off to scroll-seq to continue lazily
retrieving batches. It is necessary to retrieve one batch initially
because the first request is a special case, as described in issue clojurewerkz#72
-- it always returns no hits.