Integrate PICDB into RICU #5

Merged
merged 9 commits into from
Mar 6, 2024

@mlondschien (Member) commented Mar 5, 2024

This PR is a first step towards integrating PICDB into RICU.

What does this PR contribute

This PR adds a picdb.json to the data-sources and adds picdb source entries to the concept-dicts wherever "easy" (see below).

As is, the data can be downloaded (setup_src_data("picdb")) and attached (attach_src("picdb")). load_concepts(dynamic_vars, "picdb") works, although I have not verified that the result makes sense.

How to set up ricu with picdb

  • Clone this repo and change into the ricu folder.
  • Create a conda environment from the environment.yml file: mamba env create -f environment.yml (yes, you can install R & its dependencies via conda). Activate it.
  • Open R, then run devtools::load_all(). This is R's equivalent of pip install -e . (an editable install).
  • Download the data and set it up for usage within ricu:
    • Set Sys.setenv(RICU_DATA_PATH = "/path/to/data") to a location where you can store data.
    • Run setup_src_data("picdb"). This will ask for the username and password of a PhysioNet account with access rights to picdb 1.1.0.
    • You should then be able to run attach_src("picdb").

Intro to ricu

In ricu, "variables" or "features" are called concepts. The concepts are defined as JSON files, sorted by category, in the inst/extdata/config/concept-dict folder. Please read Chapter 4 of the official ricu documentation for more information on concepts. For each concept and each data source (e.g., picdb), one needs to define how to retrieve the concept from the data-source tables. The different approaches are described in Chapter 4.1 of the docs. The most common one is id_tbl.

Take, for example, the hr (heart rate) concept in the vitals category. In picdb, it is stored in "table": "chartevents", in the column "val_var": "valuenum", in the rows where "sub_var": "itemid" matches one of the "ids": ["1003"]. The values are merged via "index_var": "charttime" and "subject_id" onto a list of all patients.

This is the entry vitals["hr"]["sources"]["picdb"]:

            "picdb": [
                {
                    "table": "chartevents",
                    "ids": [
                        "1003"
                    ],
                    "sub_var": "itemid"
                }
            ]

Note that "val_var" and "index_var" are missing here. They are specified as table defaults in inst/extdata/config/data-sources/picdb.json[0]["tables"]["chartevents"]["defaults"]:

            "chartevents": {
                "files": "V1.1.0/CHARTEVENTS.csv",
                "defaults": {
                    "time_vars": [
                        "charttime",
                        "storetime"
                    ],
                    "index_var": "charttime",
                    "val_var": "valuenum",
                    "unit_var": "valueuom"
                },
                "num_rows": 886680,
                "cols": {
                    "row_id": {
                        "name": "ROW_ID",
                        "spec": "col_integer"
                    },
                ...
                }
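
To make the interplay between the concept entry and the table defaults concrete, here is a minimal Python sketch. It is not ricu's implementation (that lives in R); resolve_item, load_item, and the toy rows are hypothetical names and made-up data used only to illustrate the lookup order described above: fields given in the concept entry win, and anything missing falls back to the table defaults.

```python
# Illustrative only: hypothetical helpers, not part of ricu.

def resolve_item(item, table_defaults):
    """Fill in index_var / val_var / unit_var from the table defaults
    unless the concept entry specifies them itself."""
    resolved = {
        key: value
        for key, value in table_defaults.items()
        if key in ("index_var", "val_var", "unit_var")
    }
    resolved.update(item)  # explicit entries override defaults
    return resolved


def load_item(rows, resolved):
    """Select (subject_id, time, value) for rows whose sub_var matches one of the ids."""
    ids = set(resolved["ids"])
    return [
        (row["subject_id"], row[resolved["index_var"]], row[resolved["val_var"]])
        for row in rows
        if str(row[resolved["sub_var"]]) in ids
    ]


# The hr entry and the chartevents defaults, copied from the excerpts above.
hr_item = {"table": "chartevents", "ids": ["1003"], "sub_var": "itemid"}
chartevents_defaults = {
    "time_vars": ["charttime", "storetime"],
    "index_var": "charttime",
    "val_var": "valuenum",
    "unit_var": "valueuom",
}

# Made-up rows standing in for CHARTEVENTS.csv.
toy_rows = [
    {"subject_id": 1, "itemid": 1003, "charttime": "2020-01-01 08:00",
     "valuenum": 120.0, "valueuom": "bpm"},
    {"subject_id": 1, "itemid": 1001, "charttime": "2020-01-01 08:00",
     "valuenum": 36.8, "valueuom": "°C"},
]

resolved = resolve_item(hr_item, chartevents_defaults)
hr_values = load_item(toy_rows, resolved)
print(resolved)
print(hr_values)
```

Only the heart-rate row (itemid 1003) is selected; the temperature row is ignored because its itemid is not listed in "ids".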

"unit_var" names the column containing a text description of the measurement unit. For hr, this is bpm. The unit must be the same across all data sources; if it is not, the values need to be converted. See the concept temperature: the values in miiv (MIMIC-IV) corresponding to ids 223761 and 224027 are in Fahrenheit and need to be converted to Celsius with the convert_unit callback:

            "miiv": [
                {
                    "ids": 223762,
                    "table": "chartevents",
                    "sub_var": "itemid"
                },
                {
                    "ids": [
                        223761,
                        224027
                    ],
                    "table": "chartevents",
                    "sub_var": "itemid",
                    "callback": "convert_unit(fahr_to_cels, 'C', 'f')"
                }
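
The conversion itself is the standard formula C = (F - 32) * 5 / 9. As a minimal Python sketch (the function name mirrors fahr_to_cels from the callback above, but this is an illustration and not ricu's R implementation; the readings are made up):

```python
def fahr_to_cels(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Made-up Fahrenheit readings, rounded to one decimal after conversion.
readings_f = [98.6, 100.4, 212.0]
readings_c = [round(fahr_to_cels(t), 1) for t in readings_f]
print(readings_c)
```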

How to match tables and ids to concepts?

The mappings from tables and ids to concepts are in the D_ICD_DIAGNOSES.csv, D_ITEMS.csv, and D_LABITEMS.csv files. E.g., the first rows of D_ITEMS.csv are:

ROW_ID,ITEMID,LABEL_CN,LABEL,LINKSTO,CATEGORY,UNITNAME
1,1001,体温,Temperature,chartevents,Routine Vital Signs,°C
2,1002,脉搏,Pulse,chartevents,,bpm
3,1003,心率,Heart Rate,chartevents,Routine Vital Signs,bpm

This translates, among other things, into the hr entry for the picdb source above.

Which concepts are integrated with this PR?

I have added source entries (using this script) where either the key of the concept (e.g., hr) or the description value (e.g., "heart rate") matches (case-insensitive) with the column "LABEL" of D_ITEMS.csv or D_LABITEMS.csv. There are still many concepts with an empty entry for picdb, where more understanding of what the variables mean is needed.
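
The matching rule can be sketched in a few lines of Python. This is not the script referenced above, only an illustration of the rule as described; match_concepts and the two-entry concepts dictionary are hypothetical, and the CSV text is the D_ITEMS.csv excerpt shown earlier.

```python
import csv
import io

# The first rows of D_ITEMS.csv, as quoted above.
D_ITEMS_CSV = """ROW_ID,ITEMID,LABEL_CN,LABEL,LINKSTO,CATEGORY,UNITNAME
1,1001,体温,Temperature,chartevents,Routine Vital Signs,°C
2,1002,脉搏,Pulse,chartevents,,bpm
3,1003,心率,Heart Rate,chartevents,Routine Vital Signs,bpm
"""

# Toy stand-in for a concept dictionary: concept key -> description.
concepts = {"hr": "heart rate", "temp": "temperature"}


def match_concepts(concepts, items_csv):
    """Return picdb source entries for concepts whose key or description
    equals an item LABEL, ignoring case."""
    matches = {}
    for row in csv.DictReader(io.StringIO(items_csv)):
        label = row["LABEL"].strip().lower()
        for key, description in concepts.items():
            if label in (key.lower(), description.lower()):
                matches.setdefault(key, []).append(
                    {"table": row["LINKSTO"], "ids": [row["ITEMID"]], "sub_var": "itemid"}
                )
    return matches


matches = match_concepts(concepts, D_ITEMS_CSV)
print(matches["hr"])
```

Exact label matching is deliberately conservative: "Pulse" is not matched to hr here, which is the kind of case that needs manual review.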

There are also entries in D_ITEMS corresponding to the surgery_vital_signs table. Adding these probably requires some more work, so I've removed them in this commit.

How can I contribute?

Pick an existing concept from one of the ricu/inst/extdata/config/concept-dict/*.json files. Go through the D_ITEMS, D_LABITEMS, and D_ICD_DIAGNOSES tables and see whether the concept name is mentioned. From this, create an entry below the "sources" key for picdb. Note that picdb has a format similar to mimic and miiv, so you can use their entries as a guide. Open a PR to this repository, adding the concept (or multiple ones). For each concept you add, please write a short description of why you mapped it this way, even if you think it is obvious.

If you have any questions, please ask!

@mlondschien mlondschien changed the title Add picdb.json Integrate PICDB into RICU Mar 5, 2024
@mlondschien mlondschien marked this pull request as ready for review March 5, 2024 17:22
@manuelburger (Collaborator) left a comment

Looks good to me, testing locally right now.

@manuelburger (Collaborator)

If the two tiny changes I made make sense, then this is approved from my side 👍

@mlondschien mlondschien merged commit 8e5c060 into main Mar 6, 2024
1 of 7 checks passed