Upgrade to Biolink 2.1 (KG2.7.1) - due Aug. 11 #1593

Closed
30 of 31 tasks
amykglen opened this issue Jul 28, 2021 · 47 comments
@amykglen
Member

amykglen commented Jul 28, 2021

associated code changes should go in the kg2integration branch

first update NodeSynonymizer:

  • change SRI NodeNormalizer URL to Biolink 2.1 endpoint
  • specify version=2.1.0 in Biolink Lookup tool URL
  • adjust NodeSynonymizer conflations
  • adjust any other mentions of ChemicalSubstance
  • try doing a synonymizer build

once we have what seems to be a good synonymizer:

rebuild or edit and put copies in /translator/data/orangeboard/databases/KG2.7.1 on arax.ncats.io:

  • configv2.json (should point to the new KG2c/plover)
    * note: saved this as config_local.json, since we want it to be used over configv2.json during testing
  • KG2c meta knowledge graph
  • NodeSynonymizer
  • KG2c sqlite
  • NGD database
  • COHD database @chunyuma
  • refreshed DTD @chunyuma
  • DTD model @chunyuma
  • DTD database @chunyuma
  • 'slim' databases (used for Travis) @chunyuma / @finnagin

note: As databases are rebuilt, the new copy of config_local.json will need to be updated to point to their new paths.

update ARAX codebase:

  • make updates to the ARAX codebase:
    • change usages of biolink:ChemicalSubstance to biolink:ChemicalEntity (lots of tests use this)
    • update Expand's hard-coded conflations
    • specify version 2.1.0 in every place Expand grabs the Biolink model
  • test everything together (entire ARAX pytest suite should pass when using the new config_local.json - must locally set force_local=True in ARAX_expander.py to avoid using the old KG2 API)

other things:

  • update the test triples that go in some NCATS repo @finnagin
  • update Biolink version (2.1.0) and KG2 version (2.7.1) in openapi yaml @edeutsch ?
  • update SmartAPI registration for ARAX
  • after main roll-out is complete, rename config_local.json to config_local.json_FROZEN_DO-NOT-EDIT-FURTHER (any remaining edits to the config file, such as when the DTD build is complete, should be made directly to the master configv2.json on araxconfig.rtx.ai)
@amykglen
Member Author

well the synonymizer build appeared to go well - artifacts are uploaded to arax.ncats.io at /data/orangeboard/databases/KG2.7.1/synonymizer.

only 56 problems listed in Problems.tsv (down from 76 for KG2.6.7.1).

going to do some spot-checking tomorrow to make sure things look ok.

@edeutsch
Collaborator

terrific!

@amykglen
Member Author

things seem good in the synonymizer based on some spot-checking... although wondering about one thing - is it normal for the SRI_normalizer_category and other SRI fields here to be null?

python3 node_synonymizer.py --lookup CHEMBL.COMPOUND:CHEMBL112
...
    "id": {
      "SRI_normalizer_category": null,
      "SRI_normalizer_curie": null,
      "SRI_normalizer_name": null,
      "category": "biolink:SmallMolecule",
      "identifier": "CHEMBL.COMPOUND:CHEMBL112",
      "name": "ACETAMINOPHEN"
    },
...

@edeutsch
Collaborator

It could happen if the concept is not known to the SRI Node Normalizer, but that is not the case here. Looks like a bug. What happens if you run this:

python3 sri_node_normalizer.py -c CHEMBL.COMPOUND:CHEMBL112

in the location where the SRI NN and NodeSyn were built?

@edeutsch edeutsch self-assigned this Jul 29, 2021
@amykglen
Member Author

it returns this:

==========================================================
Native SRI Node Normalizer results:
{
  "CHEMBL.COMPOUND:CHEMBL112": {
    "equivalent_identifiers": [
      {
        "identifier": "PUBCHEM.COMPOUND:1983",
        "label": "Acetaminophen"
      },
      {
        "identifier": "CHEMBL.COMPOUND:CHEMBL112",
        "label": "ACETAMINOPHEN"
      },
      {
        "identifier": "UNII:362O9ITL9D",
        "label": "ACETAMINOPHEN"
      },
      {
        "identifier": "CHEBI:46195",
        "label": "paracetamol"
      },
      {
        "identifier": "DRUGBANK:DB00316"
      },
      {
        "identifier": "MESH:D000082",
        "label": "Acetaminophen"
      },
      {
        "identifier": "CAS:103-90-2"
      },
      {
        "identifier": "CAS:360769-21-7"
      },
      {
        "identifier": "DrugCentral:52",
        "label": "paracetamol"
      },
      {
        "identifier": "GTOPDB:5239",
        "label": "paracetamol"
      },
      {
        "identifier": "HMDB:HMDB0001859",
        "label": "Acetaminophen"
      },
      {
        "identifier": "KEGG.COMPOUND:C06804",
        "label": "Acetaminophen"
      },
      {
        "identifier": "INCHIKEY:RZVAJINKPMORJF-UHFFFAOYSA-N"
      }
    ],
    "id": {
      "identifier": "PUBCHEM.COMPOUND:1983",
      "label": "Acetaminophen"
    },
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:PhysicalEssence",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssenceOrOccurrent"
    ]
  }
}
==========================================================
Local more compact and useful formatting:
{
  "curie": "CHEMBL.COMPOUND:CHEMBL112",
  "equivalent_identifiers": [
    {
      "identifier": "PUBCHEM.COMPOUND:1983",
      "label": "Acetaminophen"
    },
    {
      "identifier": "CHEMBL.COMPOUND:CHEMBL112",
      "label": "ACETAMINOPHEN"
    },
    {
      "identifier": "UNII:362O9ITL9D",
      "label": "ACETAMINOPHEN"
    },
    {
      "identifier": "CHEBI:46195",
      "label": "paracetamol"
    },
    {
      "identifier": "DRUGBANK:DB00316"
    },
    {
      "identifier": "MESH:D000082",
      "label": "Acetaminophen"
    },
    {
      "identifier": "CAS:103-90-2"
    },
    {
      "identifier": "CAS:360769-21-7"
    },
    {
      "identifier": "DrugCentral:52",
      "label": "paracetamol"
    },
    {
      "identifier": "GTOPDB:5239",
      "label": "paracetamol"
    },
    {
      "identifier": "HMDB:HMDB0001859",
      "label": "Acetaminophen"
    },
    {
      "identifier": "KEGG.COMPOUND:C06804",
      "label": "Acetaminophen"
    },
    {
      "identifier": "INCHIKEY:RZVAJINKPMORJF-UHFFFAOYSA-N"
    }
  ],
  "equivalent_names": [
    "Acetaminophen",
    "ACETAMINOPHEN",
    "ACETAMINOPHEN",
    "paracetamol",
    "Acetaminophen",
    "paracetamol",
    "paracetamol",
    "Acetaminophen",
    "Acetaminophen"
  ],
  "preferred_curie": "PUBCHEM.COMPOUND:1983",
  "preferred_curie_name": "Acetaminophen",
  "status": "OK",
  "type": "biolink:SmallMolecule"
}
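
For spot-checking more entries than this one, the comparison between the two outputs could be sketched roughly as below. This is only an illustration: find_null_sri_fields is a made-up helper, not part of node_synonymizer.py, and it assumes the "status" value from the compact normalizer output is passed in by hand.

```python
from typing import Any, Dict, List

SRI_FIELDS = ("SRI_normalizer_category", "SRI_normalizer_curie", "SRI_normalizer_name")


def find_null_sri_fields(synonymizer_id: Dict[str, Any], normalizer_status: str) -> List[str]:
    """Return SRI_* fields that are null even though the normalizer reported status OK."""
    if normalizer_status != "OK":
        # The normalizer did not report the concept as known, so nulls are expected.
        return []
    return [field for field in SRI_FIELDS if synonymizer_id.get(field) is None]


# Values copied from the two outputs shown in this thread.
synonymizer_id = {
    "SRI_normalizer_category": None,
    "SRI_normalizer_curie": None,
    "SRI_normalizer_name": None,
    "category": "biolink:SmallMolecule",
    "identifier": "CHEMBL.COMPOUND:CHEMBL112",
    "name": "ACETAMINOPHEN",
}
suspicious = find_null_sri_fields(synonymizer_id, normalizer_status="OK")
print(suspicious)
```

A non-empty list here means the synonymizer left fields empty that the normalizer could have filled.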

@edeutsch
Collaborator

hmm, okay, definitely seems like a bug. do you use those fields for anything?

@amykglen
Member Author

no, I don't use them. so unless it's indicative of a larger issue, not a problem for the KG2c build.

@amykglen
Member Author

amykglen commented Aug 4, 2021

so I think we're at a point now where @chunyuma can start building COHD/DTD databases from KG2.7.1. all necessary files should be on arax.ncats.io at: /data/orangeboard/databases/KG2.7.1

I'm going to work next on loading KG2.7.1c into Plover (may require a little bit of tweaking to remove mixins from the expanded_categories property)

@edeutsch
Collaborator

edeutsch commented Aug 4, 2021

I suppose this is an important point that we need to resolve before we move forward: what do we do about these pesky mixins?

@edeutsch
Collaborator

edeutsch commented Aug 4, 2021

At the moment the category_manager skips biolink:Entity. But I think it is true that all mixins come after this in the list. So it would be an easy tweak to STOP processing at biolink:Entity. This would exclude biolink:Entity and mixins I think.

Do we want to do that?
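
To make the proposal concrete, here is a minimal sketch of "stop at biolink:Entity". It is not the category_manager code; truncate_at_entity is a hypothetical name, and the input is the "type" list the SRI Node Normalizer returned for CHEMBL.COMPOUND:CHEMBL112 earlier in this thread.

```python
from typing import List


def truncate_at_entity(ancestors: List[str], stop: str = "biolink:Entity") -> List[str]:
    """Keep categories listed before `stop`; drop `stop` and everything after it."""
    if stop not in ancestors:
        return list(ancestors)
    return ancestors[: ancestors.index(stop)]


# The "type" list from the normalizer output for CHEMBL.COMPOUND:CHEMBL112.
types = [
    "biolink:SmallMolecule",
    "biolink:MolecularEntity",
    "biolink:ChemicalEntity",
    "biolink:PhysicalEssence",
    "biolink:NamedThing",
    "biolink:Entity",
    "biolink:PhysicalEssenceOrOccurrent",
]
kept = truncate_at_entity(types)
print(kept)
```

Note that in this particular list biolink:PhysicalEssence sits before biolink:Entity, so it survives the cut. If that class is also a mixin, position in the list alone would not filter every mixin.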

@edeutsch
Collaborator

edeutsch commented Aug 4, 2021

ah, but now Chris has responded that we want mixins in there. From Slack:

Yes, I think we want to include mixins in ancestors / descendents. One of the main reasons to have these mixins is to make querying easy by grouping similar concepts that are not direct ancestors in the model. That relies on applying ancestor logic to mixins....

@chunyuma
Collaborator

chunyuma commented Aug 4, 2021

Hi @finnagin, @dkoslicki and @jaredroach,

To build the COHD database and the DTD model/database for kg2.7.1c based on Biolink Model 2.1, I plan to treat biolink:Drug and biolink:SmallMolecule as drugs, replacing the biolink:Drug, biolink:ChemicalSubstance and biolink:Metabolite categories that were used as drugs for kg2.6.7.1. As a quick check, I compared the total number of drug nodes in kg2.7.1 (2039306) with the total number of drug nodes used in kg2.6.7.1 (2010011). Using only those two classes gives even more drug nodes. So, any objection to treating only biolink:Drug and biolink:SmallMolecule as drugs for COHD and DTD?
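
The count comparison boils down to something like the sketch below. It is illustrative only: the node dicts are toy data (EXAMPLE:drug is a made-up identifier), not rows from KG2, and count_nodes_in_categories is a hypothetical helper rather than code from the COHD/DTD build.

```python
from typing import Dict, Iterable, Set

# The two categories proposed as "drug" for KG2.7.1.
DRUG_CATEGORIES_KG2_7_1 = {"biolink:Drug", "biolink:SmallMolecule"}


def count_nodes_in_categories(nodes: Iterable[Dict[str, str]], categories: Set[str]) -> int:
    """Count nodes whose `category` is one of `categories`."""
    return sum(1 for node in nodes if node.get("category") in categories)


# Toy nodes for illustration only; these are not rows from KG2.
toy_nodes = [
    {"id": "CHEMBL.COMPOUND:CHEMBL112", "category": "biolink:SmallMolecule"},
    {"id": "EXAMPLE:drug", "category": "biolink:Drug"},
    {"id": "MONDO:0007254", "category": "biolink:Disease"},
]
print(count_nodes_in_categories(toy_nodes, DRUG_CATEGORIES_KG2_7_1))
```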

@amykglen
Member Author

amykglen commented Aug 5, 2021

well surprisingly it worked to just load KG2c into Plover as is! (with mixins included as labels.) it did increase baseline memory usage by a decent amount, but not enough for it to cause a problem I think.

down the road I may tweak the code a bit to decrease its memory usage, but it seems to be totally fine for now.

@edeutsch
Collaborator

edeutsch commented Aug 5, 2021

Terrific! It looks like @chunyuma has several to dos in the list, but otherwise, do we just sit tight until you're back and then make the switch?

@chunyuma
Collaborator

chunyuma commented Aug 5, 2021

I've already started building COHD and DTD. Both should be complete next week, except for the precomputed DTD probability database, which might need more time.

@amykglen
Member Author

amykglen commented Aug 5, 2021

yeah, I think that sounds fine to wait to roll it out until I'm back - but in the meantime other codeowners could go in and fix any usages of biolink:ChemicalSubstance in their code/tests (in the kg2integration branch). (I created a couple subtasks for this above.)

and if anyone wants to test their changes, they just need to:

  1. download the config_local.json from arax.ncats.io:/translator/data/orangeboard/databases/KG2.7.1/config_local.json (put it into your local RTX/code/ directory)
  2. locally change this line in Expand to say force_local = True
  3. run the pytest suite (from the kg2integration branch)
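
For codeowners hunting down leftover biolink:ChemicalSubstance usages before running the suite, a small scan along these lines might help. It is just a sketch: find_old_category_usages is a made-up helper, and which directory to scan is up to you.

```python
from pathlib import Path
from typing import List, Tuple

OLD_CATEGORY = "biolink:ChemicalSubstance"


def find_old_category_usages(root: Path, pattern: str = "*.py") -> List[Tuple[Path, int, str]]:
    """Return (file, line number, line text) for every line mentioning OLD_CATEGORY."""
    hits = []
    for path in sorted(root.rglob(pattern)):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if OLD_CATEGORY in line:
                hits.append((path, line_number, line.strip()))
    return hits
```

Calling find_old_category_usages(Path("code")) from the root of a local checkout would list each file and line that still needs attention. Whether a given hit should become biolink:ChemicalEntity or something narrower is still a per-case decision.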

chunyuma added a commit that referenced this issue Aug 11, 2021
@chunyuma
Collaborator

Hi @edeutsch and @amykglen, do we have some synonymizer functions or utility functions that can return all children of a category in kg2.7.1? Thanks!

@amykglen
Member Author

I just messaged Chunyu about this in slack but posting here as well so others are aware it was addressed: the CategoryManager may work to find category ancestors (see this function) or the Biolink Lookup Service could be used to find category descendants (curl -X GET "https://bl-lookup-sri.renci.org/bl/ChemicalEntity/descendants?version=2.1.0" -H "accept: application/json" - although you may want to cache answers in this case).
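
A rough sketch of the "cache answers" idea, using the same URL shape as the curl command above. The function names are made up here, the in-memory dict is just one possible cache, and the sketch assumes the service returns a JSON list of category names.

```python
import json
import urllib.request
from typing import Callable, Dict, List, Tuple

BASE_URL = "https://bl-lookup-sri.renci.org/bl"


def build_descendants_url(category: str, version: str = "2.1.0") -> str:
    """Build the descendants URL in the same shape as the curl command."""
    return f"{BASE_URL}/{category}/descendants?version={version}"


def fetch_json(url: str) -> List[str]:
    """Fetch and decode a JSON response (performs a real network call)."""
    request = urllib.request.Request(url, headers={"accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# In-memory cache keyed by (category, version); assumed sufficient for one process.
_cache: Dict[Tuple[str, str], List[str]] = {}


def get_descendants(category: str, version: str = "2.1.0",
                    fetch: Callable[[str], List[str]] = fetch_json) -> List[str]:
    """Return descendants of `category`, asking the service at most once per key."""
    key = (category, version)
    if key not in _cache:
        _cache[key] = fetch(build_descendants_url(category, version))
    return _cache[key]
```

Passing fetch as a parameter keeps the network call swappable, which makes it possible to exercise the caching logic in tests without hitting the service.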

@finnagin
Member

Do we want to try to implement a local way of getting descendants? I'm asking because we currently cannot run the DTD tests on Travis because of its lack of a cache for when it hits the SRI, and I'm worried that having SRI calls for COHD too might make those tests impossible to run as well.

@amykglen
Member Author

actually, I just remembered I already have a local method for category descendants in the KPSelector, so you could use that if you want, @chunyuma. you could create a KPSelector object and call this method:

def _get_category_descendants(self, categories: Optional[List[str]]) -> Set[str]:
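
For readers without the KPSelector code at hand, the general idea of a local descendants lookup can be sketched as below. This is not the KPSelector implementation: the standalone function and the CHILDREN map are assumptions, and the toy map only contains the chain suggested by the ancestor order in the normalizer output earlier in this thread. Whether the real method includes the input categories in its result is not stated here; this sketch does.

```python
from typing import Dict, List, Optional, Set

# Toy parent -> children map (assumption): only the chain suggested by the
# normalizer "type" list earlier in the thread. A real map would be derived
# from the Biolink model itself.
CHILDREN: Dict[str, List[str]] = {
    "biolink:ChemicalEntity": ["biolink:MolecularEntity"],
    "biolink:MolecularEntity": ["biolink:SmallMolecule"],
}


def get_category_descendants(categories: Optional[List[str]],
                             children: Dict[str, List[str]] = CHILDREN) -> Set[str]:
    """Return the given categories plus everything beneath them in `children`."""
    found: Set[str] = set()
    stack = list(categories or [])
    while stack:
        category = stack.pop()
        if category in found:
            continue
        found.add(category)
        stack.extend(children.get(category, []))
    return found


print(sorted(get_category_descendants(["biolink:ChemicalEntity"])))
```

Because everything is resolved from a local map, nothing here touches the SRI services, which is the property that matters for Travis.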

@chunyuma
Collaborator

I think all tests associated with DTD/CHP/COHD pass now in the kg2integration branch.

@amykglen
Member Author

great. for some reason the CHP tests are still failing for me in the kg2integration branch (after pulling):

test_ARAX_expand.py::test_chp_expand_1 FAILED                                                                                           [ 12%]
test_ARAX_expand.py::test_chp_expand_2 FAILED                                                                                           [ 12%]

the errors are about connection reset:

  - 2021-08-11T08:01:57.344012 DEBUG: [] Prefixes CHP supports for ['biolink:Gene', 'biolink:Protein'] are: {'ENSEMBL'}
  - 2021-08-11T08:01:57.346041 DEBUG: [] CHP: Converted n00's 3 curies to a list of 3 curies with prefixes CHP supports
  - 2021-08-11T08:02:07.708723 ERROR: [UncaughtError] An uncaught error was thrown while trying to Expand using CHP. Error was: Traceback (most recent call last):
  File "/Users/amyglen/.pyenv/versions/3.7.8/envs/arax/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/Users/amyglen/.pyenv/versions/3.7.8/envs/arax/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/amyglen/.pyenv/versions/3.7.8/envs/arax/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/Users/amyglen/.pyenv/versions/3.7.8/lib/python3.7/http/client.py", line 1354, in getresponse
    response.begin()
  File "/Users/amyglen/.pyenv/versions/3.7.8/lib/python3.7/http/client.py", line 306, in begin
    version, status, reason = self._read_status()
  File "/Users/amyglen/.pyenv/versions/3.7.8/lib/python3.7/http/client.py", line 267, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/Users/amyglen/.pyenv/versions/3.7.8/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 54] Connection reset by peer

are you seeing this as well, @chunyuma?

@chunyuma
Collaborator

chunyuma commented Aug 11, 2021

It seems like this is a CHP API problem. I will figure it out today.

@chunyuma
Collaborator

@amykglen, I hit a similar problem when I ran test_chp_expand_1:

Traceback (most recent call last):
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
    chunked=chunked,
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/site-packages/urllib3/connection.py", line 234, in request
    super(HTTPConnection, self).request(method, url, body=body, headers=headers)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/site-packages/urllib3/connection.py", line 200, in connect
    conn = self._new_conn()
  File "/home/cqm5886/anaconda3/envs/RTX_env/lib/python3.7/site-packages/urllib3/connection.py", line 182, in _new_conn
    self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fc6259aac90>: Failed to establish a new connection: [Errno 110] Connection timed out

I think the problem might come from the CHP team's server. Right now, calling their API with r = requests.get('http://chp.thayer.dartmouth.edu/predicates/') fails with Connection timed out.
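
To tell "server unreachable" apart from "server answered with an error" without wading through a traceback, a check along these lines could be used. It is a sketch with a made-up helper name, written against the standard library's urllib rather than requests, and the opener parameter exists only so the logic can be exercised without a network.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple


def check_endpoint(url: str, timeout: float = 10.0,
                   opener: Callable = urllib.request.urlopen) -> Tuple[bool, str]:
    """Return (reachable, detail) instead of raising on connection problems."""
    try:
        with opener(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (e.g. a removed endpoint).
        return True, f"HTTP {error.code}"
    except (urllib.error.URLError, OSError) as error:
        # Covers timeouts, refused connections, and resets by the peer.
        return False, f"{type(error).__name__}: {error}"
```

A result of (False, ...) points at the server or the network, while (True, "HTTP 404") or similar would suggest the endpoint itself is gone.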

@chunyuma
Collaborator

Hi @yakaboskic, could you please help us check whether something is wrong with the CHP APIs? Currently, we can't reach the CHP server with r = requests.get('http://chp.thayer.dartmouth.edu/predicates/'). Thanks!

@yakaboskic

Hi @chunyuma, I am currently working on a server rebuild, but it should be back up tonight. However, we have deprecated the predicates endpoint.

@chunyuma
Collaborator

Thanks for the reply, @yakaboskic. Please let us know when it is back. Thanks again!

@finnagin
Member

I've created a pull request for the new predicate tuples in the NCATS testing repo. @dkoslicki when you get a chance could you approve and merge it?

@yakaboskic

Hi @chunyuma, CHP server is back up as of around 12AM EST last night.

@chunyuma
Collaborator

Thanks @yakaboskic!

@chunyuma
Collaborator

Hi @yakaboskic, I tried calling the CHP APIs via r = requests.post('http://chp.thayer.dartmouth.edu/query/', json=query) for the following query:

{'message': {'query_graph': {'nodes': {'n0': {'ids': ['MONDO:0007254'],
     'categories': ['biolink:Disease'],
     'constraints': []},
    'n1': {'ids': ['ENSEMBL:ENSG00000162419'],
     'categories': ['biolink:Gene'],
     'constraints': []},
    'n2': {'ids': ['CHEMBL.COMPOUND:CHEMBL88'],
     'categories': ['biolink:Drug'],
     'constraints': []},
    'n3': {'ids': ['EFO:0000714'],
     'categories': ['biolink:PhenotypicFeature'],
     'constraints': []}},
   'edges': {'e0': {'predicates': ['biolink:gene_associated_with_condition'],
     'relation': None,
     'subject': 'n1',
     'object': 'n0',
     'constraints': []},
    'e1': {'predicates': ['biolink:treats'],
     'relation': None,
     'subject': 'n2',
     'object': 'n0',
     'constraints': []},
    'e2': {'predicates': ['biolink:has_phenotype'],
     'relation': None,
     'subject': 'n0',
     'object': 'n3',
     'constraints': [{'name': 'survival_time',
       'id': 'EFO:0000714',
       'operator': '>',
       'value': 500.0,
       'unit_id': None,
       'unit_name': None,
       'not': False}]}}},
  'knowledge_graph': {'nodes': {}, 'edges': {}},
  'results': []},
 'max_results': 10,
 'trapi_version': '1.1',
 'biolink_version': None}

But I got a warning message saying 'Passed category for n2: biolink:Drug, did not match our preferred category biolink:ChemicalSubstance for this curie. Going with our preferred category.' The returned status is 'Bad request. See description.' with the description 'Problem during interface setup. No CHP core supported queries where found in passed query.' Could you please help me figure out how to solve this issue? I think CHEMBL.COMPOUND:CHEMBL88 belongs to biolink:SmallMolecule under Biolink Model 2.1, so it should be treated as a drug, right?

I generated the standard query by using the following function:

    def _build_standard_query(
            gene=None,
            drug=None,
            outcome=None,
            outcome_name=None,
            outcome_op=None,
            outcome_value=None,
            disease=None,
            trapi_version='1.1',
            ):

        query = "{'message': {'query_graph': {'nodes': {'n0': {'ids': ['" + disease + "'], 'categories': ['biolink:Disease'], 'constraints': []}, 'n1': {'ids': ['" + gene + "'], 'categories': ['biolink:Gene'], 'constraints': []}, 'n2': {'ids': ['" + drug + "'], 'categories': ['biolink:Drug'], 'constraints': []}, 'n3': {'ids': ['" + outcome + "'], 'categories': ['biolink:PhenotypicFeature'], 'constraints': []}}, 'edges': {'e0': {'predicates': ['biolink:gene_associated_with_condition'], 'relation': None, 'subject': 'n1', 'object': 'n0', 'constraints': []}, 'e1': {'predicates': ['biolink:treats'], 'relation': None, 'subject': 'n2', 'object': 'n0', 'constraints': []}, 'e2': {'predicates': ['biolink:has_phenotype'], 'relation': None, 'subject': 'n0', 'object': 'n3', 'constraints': [{'name': '" + outcome_name + "', 'id': '" + outcome + "', 'operator': '" + outcome_op + "', 'value': " + str(outcome_value) + ", 'unit_id': None, 'unit_name': None, 'not': False}]}}}, 'knowledge_graph': {'nodes': {}, 'edges': {}}, 'results': []}, 'max_results': 10, 'trapi_version': '" + trapi_version + "', 'biolink_version': None}"

        return eval(query)
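
As an aside, the same query structure can be built as a plain dict, which avoids string concatenation and eval. The sketch below is an illustrative rewrite, not the code in the repo; the nested node and edge helpers are introduced here only to cut repetition.

```python
from typing import Any, Dict


def build_standard_query(gene: str, drug: str, outcome: str, outcome_name: str,
                         outcome_op: str, outcome_value: float, disease: str,
                         trapi_version: str = "1.1") -> Dict[str, Any]:
    """Build the same query dict directly, with no string concatenation or eval."""
    def node(curie: str, category: str) -> Dict[str, Any]:
        return {"ids": [curie], "categories": [category], "constraints": []}

    def edge(predicate: str, subject: str, obj: str, constraints=None) -> Dict[str, Any]:
        return {"predicates": [predicate], "relation": None, "subject": subject,
                "object": obj, "constraints": constraints or []}

    # The single constraint attached to the has_phenotype edge.
    survival = {"name": outcome_name, "id": outcome, "operator": outcome_op,
                "value": outcome_value, "unit_id": None, "unit_name": None, "not": False}
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": node(disease, "biolink:Disease"),
                    "n1": node(gene, "biolink:Gene"),
                    "n2": node(drug, "biolink:Drug"),
                    "n3": node(outcome, "biolink:PhenotypicFeature"),
                },
                "edges": {
                    "e0": edge("biolink:gene_associated_with_condition", "n1", "n0"),
                    "e1": edge("biolink:treats", "n2", "n0"),
                    "e2": edge("biolink:has_phenotype", "n0", "n3", [survival]),
                },
            },
            "knowledge_graph": {"nodes": {}, "edges": {}},
            "results": [],
        },
        "max_results": 10,
        "trapi_version": trapi_version,
        "biolink_version": None,
    }
```

Since eval executes whatever text it is given, an argument containing a quote character would break the concatenated version or change its meaning; building the dict directly sidesteps that.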

@yakaboskic

yakaboskic commented Aug 13, 2021

Hi @chunyuma! Oh no! So... we actually turned off support for our weird multi-hop query structure and have transitioned everyone (we thought) to a fully one-hop query structure in order to hopefully help teams better integrate with us. I am sorry that this was not communicated!!

So in light of that function you used to build those standard queries, I have made an equivalent python script to build equivalent one hop queries that will answer the same question as above.

Basically the change is that we have Gene to Disease, Drug to Disease, Gene to Drug, Drug to Gene, and Gene to Gene edges, and you can specify what we call a predicate proxy (EFO for survival) and predicate context (i.e. more gene or drug curies), but you don't have to specify these if you don't want to (we default). The equivalent build script is below (and also attached as a txt file, since I can't upload python here); I have checked that it works.
chp_onehop_simple_build_script.py.txt

Please let me know if you have any questions/concerns/issues! And many apologies for any unexpected work this may have caused!

import requests
import json

def _build_standard_query(
        gene=None,
        drug=None,
        outcome=None,
        outcome_name=None,
        outcome_op=None,
        outcome_value=None,
        disease=None,
        trapi_version='1.1',
        ):

    """Two options, both are equivalent in terms of CHP analysis.
    """
    # Option 1
    query_1 = {
            'message': { 
                'query_graph': {
                    'nodes': {
                        'n0': {
                            'ids': [gene], 
                            'categories': ['biolink:Gene'],
                            'constraints': []}, 
                        'n1': {
                            'ids': [disease],
                            'categories': ['biolink:Disease'],
                            'constraints': []
                            },
                        }, 
                    'edges': {
                        'e0': {
                            'predicates': ['biolink:gene_associated_with_condition'],
                            'relation': None,
                            'subject': 'n0',
                            'object': 'n1',
                            "constraints": [
                                {
                                    "id": "CHP:PredicateProxy",
                                    "not": False,
                                    "name": "predicate_proxy",
                                    "value": [
                                        outcome
                                    ],
                                    "unit_id": None,
                                    "operator": "==",
                                    "unit_name": None
                                    },
                                {
                                    "id": outcome,
                                    "not": False,
                                    "name": outcome_name,
                                    "value": outcome_value,
                                    "unit_id": None,
                                    "operator": outcome_op,
                                    "unit_name": None
                                    },
                                {
                                    "id": "CHP:PredicateContext",
                                    "not": False,
                                    "name": "predicate_context",
                                    "value": [
                                        "drug"
                                    ],
                                    "unit_id": None,
                                    "operator": "==",
                                    "unit_name": None
                                    },
                                {
                                    "id": "drug",
                                    "not": False,
                                    "name": "drug",
                                    "value": [
                                        drug
                                    ],
                                    "unit_id": None,
                                    "operator": "matches",
                                    "unit_name": None
                                    }
                                ]     
                            }
                        },
                    },
                'knowledge_graph': {
                    'nodes': {},
                    'edges': {}
                    }, 
                'results': []
                }, 
                'max_results': 10, 
                'trapi_version': trapi_version, 
                'biolink_version': None
                }

    # Option 2
    query_2 = {
            'message': { 
                'query_graph': {
                    'nodes': {
                        'n0': {
                            'ids': [drug], 
                            'categories': ['biolink:SmallMolecule'],
                            'constraints': []
                            }, 
                        'n1': {
                            'ids': [disease],
                            'categories': ['biolink:Disease'],
                            'constraints': []
                            },
                        },
                    'edges': {
                        'e0': {
                            'predicates': ['biolink:treats'],
                            'relation': None,
                            'subject': 'n0',
                            'object': 'n1',
                            "constraints": [
                                {
                                    "id": "CHP:PredicateProxy",
                                    "not": False,
                                    "name": "predicate_proxy",
                                    "value": [
                                        outcome
                                    ],
                                    "unit_id": None,
                                    "operator": "==",
                                    "unit_name": None
                                    },
                                {
                                    "id": outcome,
                                    "not": False,
                                    "name": outcome_name,
                                    "value": outcome_value,
                                    "unit_id": None,
                                    "operator": outcome_op,
                                    "unit_name": None
                                    },
                                {
                                    "id": "CHP:PredicateContext",
                                    "not": False,
                                    "name": "predicate_context",
                                    "value": [
                                        "gene"
                                    ],
                                    "unit_id": None,
                                    "operator": "==",
                                    "unit_name": None
                                    },
                                {
                                    "id": "gene",
                                    "not": False,
                                    "name": "gene",
                                    "value": [
                                        gene
                                    ],
                                    "unit_id": None,
                                    "operator": "matches",
                                    "unit_name": None
                                    }
                                ]     
                            }
                        },
                    },
                'knowledge_graph': {
                    'nodes': {},
                    'edges': {}
                    }, 
                'results': []
                }, 
                'max_results': 10, 
                'trapi_version': trapi_version, 
                'biolink_version': None
                }
        
    r1 = requests.post('http://chp.thayer.dartmouth.edu/query/', json=query_1)
    r2 = requests.post('http://chp.thayer.dartmouth.edu/query/', json=query_2)
    return r1, r2

if __name__ == '__main__':
    r1, r2 = _build_standard_query(
            gene='ENSEMBL:ENSG00000162419',
            drug='CHEMBL.COMPOUND:CHEMBL88',
            outcome='EFO:0000714',
            outcome_name='EFO:0000714',
            outcome_op=">",
            outcome_value=500,
            disease='MONDO:0007254',
            trapi_version='1.1',
            )
    print(json.dumps(r1.json(), indent=2))
    print(json.dumps(r2.json(), indent=2))

@chunyuma
Collaborator

Thanks @yakaboskic! I really appreciate your help for figuring out the issue! I will have a try later today based on your code and suggestions.

chunyuma added a commit that referenced this issue Aug 13, 2021
@chunyuma
Collaborator

@amykglen, thanks to the help of @yakaboskic, both chp tests in test_ARAX_expand.py pass now.

test_ARAX_expand.py::test_chp_expand_1 PASSED                                                                                                        [ 50%]
test_ARAX_expand.py::test_chp_expand_2 PASSED                                                                                                        [100%]

@amykglen
Member Author

awesome, thanks everyone!

I guess after the full DTD rebuild is done you can go ahead and close this issue, @chunyuma.

@chunyuma
Collaborator

Also, thanks @amykglen for developing the BiolinkHelper module to access the Biolink model information. I replaced the original CategoryManager with BiolinkHelper and it works well.

"I guess after the full DTD rebuild is done you can go ahead and close this issue, @chunyuma."

I will complete it as soon as possible.

@amykglen amykglen closed this as completed Sep 2, 2021