[Feature Request] Concurrent executing query in InnerHitsPhase #16878

Open
kkewwei opened this issue Dec 18, 2024 · 5 comments
Assignees
Labels
enhancement Enhancement or improvement to existing feature or request Search:Performance

Comments

@kkewwei
Contributor

kkewwei commented Dec 18, 2024

Is your feature request related to a problem? Please describe

In our production environment we use nested fields with inner_hits. The source contains a very large number of fields, and the inner_hits query matches about 80 sub-documents within a single parent document. We find that the FetchPhase is costly: it takes more than 3s, but when we remove inner_hits it takes only 700ms.

In FetchPhase, the InnerHitsPhase is executed serially for each document (it can be regarded as a sub-query phase). If we need to fetch values for multiple documents in FetchPhase, the overall fetch phase becomes very slow.

Describe the solution you'd like

Execute inner_hits concurrently in FetchPhase.

In some cases, if the source is very large, fetching the documents themselves in FetchPhase also takes a long time, so it also seems necessary to fetch documents concurrently in some scenarios.
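To make the proposal concrete, here is a minimal sketch of the idea in plain Java. It is not OpenSearch code: `ConcurrentInnerHitsSketch`, `fetchSerially`, `fetchConcurrently`, and the `innerHitsQuery` function are hypothetical names used only for illustration, and the per-hit sub-query is modeled as an arbitrary function. It contrasts running one sub-query per parent hit in a loop with submitting the same sub-queries to an executor and collecting the results in the original hit order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from the OpenSearch code base.
final class ConcurrentInnerHitsSketch {

    // Serial baseline: one sub-query per parent hit, executed one after another.
    static <H, R> List<R> fetchSerially(List<H> parentHits, Function<H, R> innerHitsQuery) {
        List<R> results = new ArrayList<>(parentHits.size());
        for (H hit : parentHits) {
            results.add(innerHitsQuery.apply(hit));
        }
        return results;
    }

    // Concurrent variant: submit one task per parent hit, then collect the
    // results in submission order so the output order matches the hit order.
    static <H, R> List<R> fetchConcurrently(List<H> parentHits,
                                            Function<H, R> innerHitsQuery,
                                            ExecutorService executor)
            throws InterruptedException, ExecutionException {
        List<Future<R>> futures = new ArrayList<>(parentHits.size());
        for (H hit : parentHits) {
            Callable<R> task = () -> innerHitsQuery.apply(hit);
            futures.add(executor.submit(task));
        }
        List<R> results = new ArrayList<>(futures.size());
        for (Future<R> future : futures) {
            results.add(future.get()); // blocks until this hit's sub-query finishes
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<Integer> parentDocIds = List.of(1, 2, 3, 4);
        // Stand-in for the real inner_hits sub-query (assumption for the demo).
        Function<Integer, String> fakeInnerHits = id -> "inner-hits-for-" + id;
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            System.out.println(fetchSerially(parentDocIds, fakeInnerHits));
            System.out.println(fetchConcurrently(parentDocIds, fakeInnerHits, executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

Both methods return results in the same order as the input hits. Whether the concurrent variant is actually faster would depend on the thread pool size, on how expensive each sub-query is, and on any state shared between sub-queries, so it would need to be measured.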

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response

@kkewwei kkewwei added enhancement Enhancement or improvement to existing feature or request untriaged labels Dec 18, 2024
@sandeshkr419
Contributor

@jed326 / @sohami - Do you guys have some thoughts here?

@jed326
Collaborator

jed326 commented Dec 19, 2024

Thanks for the feature request @kkewwei!

Concurrent fetch is something I had briefly discussed with @sohami before, but it's not something we saw any use cases for (until now). When we were looking at concurrent search, the performance on aggregations is what we were focusing on, and in a lot of cases users will do aggregations with either size=0 or just the default size, so in those cases concurrent fetch was not really needed or useful.

I'd be happy to review any designs or PRs for this!

@kkewwei
Contributor Author

kkewwei commented Dec 19, 2024

@jed326, if you haven't started on it, can you assign it to me? I'd be glad to give it a try. I will put together a draft first.

@jed326
Collaborator

jed326 commented Dec 19, 2024

can you assign it to me?

@kkewwei done!

@kkewwei
Contributor Author

kkewwei commented Dec 26, 2024

It seems that serial execution of each document's inner_hits is sped up by the query cache, while concurrent execution results in performance degradation. I'm still testing.

Projects
Status: 🆕 New
Development

No branches or pull requests

4 participants