Changing Creative mode threshold from results to time #808
Comments
Note that such an implementation may significantly help results/performance from #794
After further discussion, we've settled on logic to start testing by:
Implementation requires a few things:
@rjawesome I'm assigning this issue to you given your familiarity with the inferred mode handler at this point. As always let us know if you have any questions.
I discussed the "testing" aspect with Jackson earlier today. We both imagined an automated testing framework to run templates with a list of input IDs, and record run-time info. Other info could also be helpful like: how many MetaEdges, how many subqueries, how long scoring/the NGD step is taking... For lists of input IDs, we could use:
Note: Much of https://github.com/biothings/bte-auto-demos could be re-purposed for this kind of testing (removing some of the unneeded automated server framework, re-running, caching, etc.)
Basic implementation with
A working version of the automated testing framework has been completed in The creative mode queries that are used have to be placed in the After the script is finished, it will give an output looking like this (for all templates that were run from the creative mode queries supplied).
(I hard-coded it so that a timeout [>5 minutes] is recorded as 10 minutes.)
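To illustrate the timeout convention mentioned above, here is a minimal sketch of how a timed-out run could be recorded as 10 minutes when summarizing per-template timings. The function names, the shape of the `runs` array, and the choice to average are all assumptions for illustration; the actual script's output format is not shown in this thread.

```javascript
// Hypothetical sketch, not the actual test script.
const TIMEOUT_MS = 5 * 60 * 1000; // runs longer than this count as timeouts
const TIMEOUT_RECORDED_MS = 10 * 60 * 1000; // value recorded for a timeout

// Map a measured duration to the value that gets recorded.
function recordedMs(elapsedMs) {
  return elapsedMs > TIMEOUT_MS ? TIMEOUT_RECORDED_MS : elapsedMs;
}

// Average the recorded durations per template name (assumed summary step).
function meanRecordedMsByTemplate(runs) {
  const totals = new Map();
  for (const { template, elapsedMs } of runs) {
    const entry = totals.get(template) ?? { sum: 0, count: 0 };
    entry.sum += recordedMs(elapsedMs);
    entry.count += 1;
    totals.set(template, entry);
  }
  const means = {};
  for (const [template, { sum, count }] of totals) {
    means[template] = sum / count;
  }
  return means;
}

// Illustrative input only; these are not measured results.
console.log(
  meanRecordedMsByTemplate([
    { template: "templateA", elapsedMs: 60 * 1000 },
    { template: "templateA", elapsedMs: 6 * 60 * 1000 }, // timed out
  ])
);
```

Recording a timeout as a value larger than the limit itself means that templates which time out are penalized in any aggregate, rather than looking only slightly slower than the limit.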
@rjawesome If you make a draft PR, it'll be a little easier to comment on code review. On line 532, you have Additionally, you've currently left in the creative results threshold, which should be removed (I imagine you're getting to that). Otherwise, I like the implementation -- skipping a template when there isn't enough time for it rather than just stopping all template execution there is a good call, and means we could hypothetically run shorter-running but lower-priority templates under some circumstances.
Seeing that a lot of the queries that will be used for the "testing" aspect have similar query structures w/ changing IDs, I have added an "id template" feature to the "testing" program (performance-test/template_test.js and performance-test/template_test_threaded.js). For example, first.json has the following contents. It will run the query two times in testing, the first time replacing {ID} with MONDO:0002909 and the second time replacing {ID} with MONDO:0019499:
{
"message": {
"query_graph": {
"nodes": {
"n0": {
"ids": "{ID}"
},
"n1": {
"categories": ["biolink:Drug"]
}
},
"edges": {
"e0": {
"subject": "n1",
"object": "n0",
"predicates": ["biolink:treats"],
"knowledge_type": "inferred"
}
}
}
},
"ids": ["MONDO:0002909", "MONDO:0019499"]
}
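The "id template" expansion described above could be sketched roughly as follows. This is not the code from performance-test/template_test.js; `expandIdTemplate` is a hypothetical helper, and it assumes the file shape shown above (a `message` containing the literal placeholder `{ID}`, plus a top-level `ids` array).

```javascript
// Hypothetical helper: turn one "id template" file into one query per ID.
function expandIdTemplate(template) {
  // Separate the list of IDs from the query itself.
  const { ids = [], ...query } = template;
  // Without an `ids` list, treat the file as a single plain query.
  if (ids.length === 0) return [query];

  const serialized = JSON.stringify(query);
  return ids.map((id) => {
    // JSON-escape the ID (dropping the surrounding quotes) so the
    // substituted text is still valid JSON.
    const escaped = JSON.stringify(id).slice(1, -1);
    return JSON.parse(serialized.split("{ID}").join(escaped));
  });
}

// Usage with the contents of first.json as shown above.
const template = {
  message: {
    query_graph: {
      nodes: {
        n0: { ids: "{ID}" },
        n1: { categories: ["biolink:Drug"] },
      },
      edges: {
        e0: {
          subject: "n1",
          object: "n0",
          predicates: ["biolink:treats"],
          knowledge_type: "inferred",
        },
      },
    },
  },
  ids: ["MONDO:0002909", "MONDO:0019499"],
};

const queries = expandIdTemplate(template);
console.log(queries.map((q) => q.message.query_graph.nodes.n0.ids));
```

Substituting on the serialized JSON keeps the helper independent of where the placeholder sits in the query graph, at the cost of replacing every occurrence of `{ID}` in the file.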
I think I understand the approach here -- you're eventually going to have to make 2 more templates alongside
Superseded by #824
Currently, BTE will run templates until it reaches 500 results. If BTE takes 4 minutes to retrieve 499 results from template 1, it'll run template 2 and go over time. Conversely, if BTE takes 1 minute to retrieve 500 results from template 1, it won't use any of the remaining time to check whether template 2 has better results.
We should instead check whether more than x time is remaining (perhaps 2.5 minutes?), and run the next template if so. We could also include a dry-run step to check how many meta-edges are going to be hit by a given template, as a heuristic for how long we might expect that template to take, and compare that against the time remaining. This should allow BTE to get the best results given the time remaining, even if it sometimes returns <500 results due to time.
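As a rough illustration of the time-based check proposed here, the sketch below runs templates in priority order and skips any template whose required time exceeds the time remaining. Everything in it is an assumption for illustration: `runTemplatesWithinBudget`, the `estimatedMs` field (standing in for a dry-run estimate based on meta-edge count), and the specific budget values are hypothetical, and the loop is synchronous with an injectable clock only to keep the example short. Real template execution would be asynchronous.

```javascript
// Hypothetical sketch, not BTE's actual inferred mode handler.
const MINUTE_MS = 60 * 1000;

function runTemplatesWithinBudget(templates, options) {
  const { budgetMs, minRemainingMs, now = Date.now } = options;
  const start = now();
  const executed = [];
  const skipped = [];

  for (const template of templates) {
    const remainingMs = budgetMs - (now() - start);
    // Require at least the fixed floor, or the dry-run estimate if larger.
    const requiredMs = Math.max(minRemainingMs, template.estimatedMs ?? 0);
    if (remainingMs < requiredMs) {
      // Skip rather than stop: a later, cheaper template may still fit.
      skipped.push(template.name);
      continue;
    }
    template.run();
    executed.push(template.name);
  }
  return { executed, skipped };
}

// Usage with a fake clock; the durations are illustrative, not measured.
let clock = 0;
const advance = (ms) => () => {
  clock += ms;
};

const result = runTemplatesWithinBudget(
  [
    { name: "template1", estimatedMs: 1 * MINUTE_MS, run: advance(3 * MINUTE_MS) },
    { name: "template2", estimatedMs: 4 * MINUTE_MS, run: advance(4 * MINUTE_MS) },
    { name: "template3", estimatedMs: 1 * MINUTE_MS, run: advance(1 * MINUTE_MS) },
  ],
  { budgetMs: 5 * MINUTE_MS, minRemainingMs: 1 * MINUTE_MS, now: () => clock }
);
console.log(result);
```

In this run, template2 is skipped because only 2 minutes remain against its 4-minute estimate, while the shorter template3 still fits. No result-count threshold appears anywhere in the loop; time remaining is the only stopping criterion.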
This will require further investigation and discussion before any sort of implementation work can be done.