
Improve subquery timing #721

Closed
tokebe opened this issue Sep 6, 2023 · 5 comments
Assignees
Labels
enhancement New feature or request On Test Related changes are deployed to Test server

Comments

@tokebe
Member

tokebe commented Sep 6, 2023

Currently, BTE queues subqueries by organizing them into buckets. Each bucket may only carry 3 queries per API, with no limit on the number of APIs per bucket. Buckets are then executed sequentially, with each subsequent bucket waiting for the previous to complete all contained queries.

This setup has advantages -- namely, it prevents BTE from making too many queries to the same API at once. However, it also substantially limits BTE's subquery throughput. In most cases, a bucket will contain only 3 queries, all to the same API, so only 3 queries are in flight at a time, each bucket waiting on its slowest query until all buckets are complete. Additionally, there's no theoretical limit on the number of outbound queries, provided they're all to different APIs (though, in practice, this isn't an issue).
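The bucket model described above can be sketched roughly as follows (the `SubQuery` shape and function names here are illustrative, not BTE's actual implementation):

```typescript
// Illustrative sketch of the bucket model described above; types and
// names are hypothetical, not BTE's actual code.
interface SubQuery {
  api: string;
  run: () => Promise<unknown>;
}

const MAX_PER_API_PER_BUCKET = 3;

// Greedily assign each subquery to the first bucket holding fewer than
// 3 queries for its API; there is no cap on distinct APIs per bucket.
function groupIntoBuckets(queries: SubQuery[]): SubQuery[][] {
  const buckets: SubQuery[][] = [];
  for (const q of queries) {
    let placed = false;
    for (const bucket of buckets) {
      const countForApi = bucket.filter((b) => b.api === q.api).length;
      if (countForApi < MAX_PER_API_PER_BUCKET) {
        bucket.push(q);
        placed = true;
        break;
      }
    }
    if (!placed) buckets.push([q]);
  }
  return buckets;
}

// Buckets run sequentially: each bucket waits for every query in the
// previous one, so progress is gated by the slowest query per bucket.
async function runBuckets(buckets: SubQuery[][]): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const bucket of buckets) {
    results.push(...(await Promise.all(bucket.map((q) => q.run()))));
  }
  return results;
}
```

Note how 4 queries to one API plus 1 to another still produce two buckets, and the second cannot start until the first fully drains.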

We should instead use a different model that allows for higher throughput. Some APIs publish rate limits, which BTE already attempts to respect (albeit naively). These should be better taken into account, and for APIs whose limits are unknown, BTE should better handle 429 responses to automatically determine a limit -- one which could be saved both to some local global (for local, non-Redis installations) and in Redis (for live instances). Additionally, BTE should use a number of 'slots' and start new subqueries as slots open up, rather than waiting for the slowest query of a bucket to complete.
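One possible shape for the slot model: a global slot pool plus a per-API concurrency cap that can be lowered when a 429 is observed. This is only a sketch of the idea under discussion; the class, the halve-on-429 policy, and the error shape are all assumptions, not BTE's design:

```typescript
// Hypothetical slot-based scheduler: queries start as soon as a global
// slot AND a per-API slot are free, instead of waiting on a bucket.
interface SubQuery {
  api: string;
  run: () => Promise<unknown>;
}

class SlotScheduler {
  private inFlight = 0;
  private perApi = new Map<string, number>();
  // Learned per-API caps; a live instance could persist these to Redis,
  // a local installation to some in-process global.
  private apiLimit = new Map<string, number>();

  constructor(private totalSlots: number, private defaultApiLimit: number) {}

  private limitFor(api: string): number {
    return this.apiLimit.get(api) ?? this.defaultApiLimit;
  }

  runAll(queries: SubQuery[]): Promise<void> {
    const pending = [...queries];
    return new Promise<void>((resolve) => {
      const pump = () => {
        if (pending.length === 0 && this.inFlight === 0) return resolve();
        for (let i = 0; i < pending.length; ) {
          const q = pending[i];
          const apiCount = this.perApi.get(q.api) ?? 0;
          if (this.inFlight < this.totalSlots && apiCount < this.limitFor(q.api)) {
            pending.splice(i, 1); // claim a slot and start immediately
            this.inFlight++;
            this.perApi.set(q.api, apiCount + 1);
            q.run()
              .catch((err: { status?: number }) => {
                // Assumed policy: on a 429, halve this API's learned
                // limit (floor of 1) so future scheduling backs off.
                if (err?.status === 429) {
                  this.apiLimit.set(
                    q.api,
                    Math.max(1, Math.floor(this.limitFor(q.api) / 2)),
                  );
                }
              })
              .finally(() => {
                this.inFlight--;
                this.perApi.set(q.api, (this.perApi.get(q.api) ?? 1) - 1);
                pump(); // a slot opened up; try to start the next query
              });
          } else {
            i++;
          }
        }
      };
      pump();
    });
  }
}
```

Unlike the bucket model, a slow query here only occupies its own slot; everything else keeps flowing around it.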

Further consideration for the model to implement can be determined in further discussion after some tinkering has been done.

@tokebe tokebe added the enhancement New feature or request label Sep 6, 2023
@tokebe tokebe self-assigned this Sep 6, 2023
@colleenXu
Collaborator

I'm wondering if scrolling BioThings queries need special consideration here (I don't know how they relate to "buckets", and I'm also not sure whether they're included in the logs of how many sub-queries are executed/successful).

@tokebe
Member Author

tokebe commented Sep 8, 2023

Note: we also want to add a (probably configurable) retry system, so if a query errors out for most non-400, non-timeout reasons, it can be recycled back into the queue (probably only once or twice).
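A minimal sketch of that retry idea, assuming a hypothetical error shape with `status` and `timeout` fields (not BTE's actual error type), where 400-class errors and timeouts are not requeued but 429s remain retryable:

```typescript
// Hypothetical retry wrapper: requeue on most non-400, non-timeout
// failures, at most `maxRetries` extra attempts (configurable).
interface QueryError {
  status?: number;
  timeout?: boolean;
}

function isRetryable(err: QueryError): boolean {
  if (err.timeout) return false; // timeouts: don't requeue
  // 400-class errors won't succeed on retry, except 429 (rate limit).
  if (
    err.status !== undefined &&
    err.status >= 400 &&
    err.status < 500 &&
    err.status !== 429
  )
    return false;
  return true;
}

async function runWithRetry<T>(
  run: () => Promise<T>,
  maxRetries = 1, // "probably only once or twice"
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await run();
    } catch (err) {
      lastErr = err;
      if (!isRetryable(err as QueryError)) throw err;
    }
  }
  throw lastErr;
}
```

In the slot model discussed above, the equivalent would be pushing the failed subquery back onto the pending queue rather than looping in place.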

@tokebe tokebe added the On Dev Related changes are deployed to Dev server label Nov 28, 2023
@tokebe tokebe added On CI Related changes are deployed to CI server and removed On Dev Related changes are deployed to Dev server labels Dec 7, 2023
@colleenXu
Collaborator

@tokebe It looks like the related PRs aren't linked. Is that okay? (For example: maybe this one?)

@tokebe
Member Author

tokebe commented Dec 7, 2023

@tokebe tokebe added On Test Related changes are deployed to Test server and removed On CI Related changes are deployed to CI server labels Dec 20, 2023
@tokebe
Member Author

tokebe commented Feb 21, 2024

Relevant changes deployed to Prod.

@tokebe tokebe closed this as completed Feb 21, 2024