TCDC CQS #17

karafecho · 2022-11-17T22:32:12Z

TCDC ~~CARA~~ Curated Query Service Overview

This issue is intended to initiate implementation work on the Translator Clinical Data Committee (TCDC) ~~Curated ARA (CARA)~~ Curated Query Service. The goal is to create a skeletal ARA that initially will support the TCDC's MVP1 workflow on rare pulmonary disease but eventually will support any workflow developed by the committee. ~~CARA~~ CQS also will provide a general model and approach for other teams, committees, working groups, and external users who wish to contribute an ARA to the Translator ecosystem. The development and implementation work is being supported by the SRI, with Jason Reilly serving as lead developer. Plans for long-term maintenance are TBD.

TCDC ~~CARA~~ CQS Implementation Plan

A detailed implementation plan was developed by Jason F., Abrar M., Chris B., Casey T., and Kara F. on 11/15/2022 and finalized by the same group on 11/17/2022. That plan is described below.

TCDC will register within ~~CARA~~ CQS mappings between a template query-graph and one or more TRAPI queries with workflows but without score operations (i.e., a TRAPI message with a query_graph and a workflow element)
- For the ‘treats’ MVP1 question, there will be ~~two such queries, one for Path A and one for Path B~~ one query, Path D, for initial deployment, with the more complex paths implemented after testing the initial deployment [revised 03/22/2023]
At runtime, when the registered template query-graph (without a workflow but with a URL for return response) comes in from the ARS, ~~CARA~~ CQS will submit the associated TRAPI queries with workflows but without score operations to the Workflow Runner (WFR) and get back the results
After all results are returned, ~~CARA~~ CQS will use FastAPI Reasoner Pydantic to merge the N sets of results by the result node
~~CARA~~ CQS will then score results using a composite metric TBD, but derived from one or more of the following edge attributes: log_odds_ratio, total_sample_size, and log_odds_ratio_95_ci
[per discussion on 04/12/2023]
The WFR will generate scores for the merged result from multiple ARAs, but rather than generating multiple results (one score per ARA response), it will put all of the scores into some property on the (one) result, and then generate some kind of half-baked average of the scores from the different ARAs. TRAPI 1.4: Each ARA will score results. All of the separate scores generated by each ARA will be presented individually as analyses of the result when returned to the ARS [revised 03/29/2023, per Abrar]
~~- The WFR sends that scored result back to CARA, who returns it to the ARS using URL for return response~~
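The merge and scoring steps in the plan can be sketched roughly as follows. This is only an illustration under stated assumptions: the result shape is a simplified stand-in for TRAPI results (not the reasoner-pydantic models), the function names are hypothetical, and the scoring formula is a placeholder, since the composite metric is still TBD. It only shows how the three named edge attributes could feed a single number.

```python
import math
from collections import defaultdict


def merge_by_result_node(result_sets, node_key="n0"):
    """Merge N result lists, grouping edges by the CURIE bound to node_key.

    Assumed (simplified) result shape:
    {"node_bindings": {"n0": "CHEBI:1"}, "edges": [ {...edge attributes...} ]}
    """
    merged = defaultdict(list)
    for results in result_sets:
        for result in results:
            merged[result["node_bindings"][node_key]].extend(result["edges"])
    return dict(merged)


def edge_score(edge):
    """Placeholder metric (an assumption, not the committee's metric).

    Returns 0 when the 95% CI of the log odds ratio spans 0; otherwise
    |log_odds_ratio| weighted by log10 of the total sample size.
    """
    low, high = edge["log_odds_ratio_95_ci"]
    if low <= 0.0 <= high:
        return 0.0
    return abs(edge["log_odds_ratio"]) * math.log10(edge["total_sample_size"])


def score_results(merged):
    """Score each merged result by its best edge and rank in descending order."""
    scored = [
        (node_id, max((edge_score(e) for e in edges), default=0.0))
        for node_id, edges in merged.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Taking the best edge per result is one possible aggregation; a mean or a weighted sum would slot into `score_results` in the same place.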

andrewsu · 2022-12-13T21:08:18Z

Just a note that I think this approach is similar to what we're doing for BTE's creative mode implementation. So for example, any incoming creative mode query gets compared to the template definitions in this "templateGroups file", which currently only has one entry for [Drug] - treats - [Disease]. If the input query matches the subject/object/predicate constraints given, then BTE will plug in the input IDs into a series of hand-curated query templates (which for [Drug] - treats - [Disease] would be in this directory). We'd be happy to explore synergies here in syntax, implementation, or both...
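For readers unfamiliar with the pattern, the match-then-fill idea described above can be sketched as below. This is not BTE's code or file format: the dictionary layout, the `input_node` field, and the function names are all hypothetical, chosen only to show an incoming query being checked against subject/predicate/object constraints and its input IDs being plugged into curated templates.

```python
import copy


def matches(query, group):
    """True if the query satisfies the group's subject/predicate/object constraints."""
    return (
        query["subject_category"] in group["subject"]
        and query["predicate"] in group["predicate"]
        and query["object_category"] in group["object"]
    )


def expand(query, groups):
    """Return one filled query graph per curated template of each matching group."""
    filled = []
    for group in groups:
        if not matches(query, group):
            continue
        for template in group["templates"]:
            # Copy so the registered template is never mutated.
            graph = copy.deepcopy(template["query_graph"])
            graph["nodes"][template["input_node"]]["ids"] = list(query["object_ids"])
            filled.append(graph)
    return filled


# Hypothetical registry with a single [Drug] - treats - [Disease] group.
GROUPS = [
    {
        "subject": {"biolink:Drug"},
        "predicate": {"biolink:treats"},
        "object": {"biolink:Disease"},
        "templates": [
            {
                "input_node": "disease",
                "query_graph": {
                    "nodes": {"drug": {}, "gene": {}, "disease": {}},
                    "edges": {"e0": ["drug", "gene"], "e1": ["gene", "disease"]},
                },
            },
        ],
    }
]
```

A query whose categories and predicate fall outside every group yields an empty list, which a caller could treat as "no curated template applies".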

karafecho · 2022-12-14T21:29:58Z

Thanks for alerting me to BTE's creative mode implementation, @andrewsu. This does seem similar to what we're planning for CARA. The main difference may be that the TCDC iteratively refines the TRAPI queries that we develop by reviewing answers and invoking SME input when appropriate.

Yes, let's find a time to discuss the two creative mode implementations. The next TCDC meeting is scheduled for January 4 at 2 pm ET. Any chance you and/or members of your team are free to join that call or the following one on January 18? Alternatively, we can arrange a separate meeting. Just let me know. Thanks!

karafecho · 2022-12-14T21:53:08Z

Actually, the agenda for the January 4 meeting is somewhat full, so the January 18 meeting might be better, or a separate meeting.

andrewsu · 2022-12-15T22:35:12Z

> Thanks for alerting me to BTE's creative mode implementation, @andrewsu. This does seem similar to what we're planning for CARA. The main difference may be that the TCDC iteratively refines the TRAPI queries that we develop by reviewing answers and invoking SME input when appropriate.

Yes, I agree that could be the potential synergy -- the review/refinement process planned by TCDC combined with some technical foundation that we've already built through BTE. I will plan on being at the Jan 18 meeting to discuss more!

karafecho · 2023-10-17T16:52:56Z

Update: The initial dev deployment of the CQS was in place and tested prior to the Fall 2023 relay meeting. The goal is to have a new deployment in CI, one that supports the Path A, B, and E queries, before the Winter 2024 code freeze.

karafecho assigned jdr0887 Nov 17, 2022

karafecho changed the title ~~TCDC CARA~~ TCDC CQS Oct 17, 2023
