This PR is a first step towards integrating PICDB into RICU.
What does this PR contribute?
This PR adds a `picdb.json` to the `data-sources` and adds `picdb` source entries to the `concept-dict`s wherever "easy" (see below).

As is, the data can be downloaded (`setup_src_data("picdb")`) and attached (`attach_src("picdb")`). `load_concepts(dynamic_vars, "picdb")` runs, but I have not yet checked whether the result makes sense.

How to set up ricu with picdb
1. Go to the `ricu` folder.
2. Create an environment from the `environment.yml` file: `mamba env create -f environment.yml` (yes, you can install R & its dependencies via conda). Activate it.
3. Start `R`, then run `devtools::load_all()`. This is R's equivalent of `pip install -e .`.
4. Set `Sys.setenv(RICU_DATA_PATH = "/path/to/data")` to a location where you can store data.
5. Run `setup_src_data("picdb")` (ref). This will ask for the username and password of a PhysioNet account with access rights to picdb 1.1.0 (ref).
6. Run `attach_src("picdb")`.
Intro to ricu
In ricu, "variables" or "features" are called concepts. The concepts are defined in the `inst/extdata/config/concept-dict` folder as JSON files, sorted by category. Please read Chapter 4 of the official ricu documentation for more information on concepts. For each concept and each data source (e.g., picdb), one needs to define how to retrieve the concept from the data-source tables. There are different approaches, which are described in Chapter 4.1 of the docs. The most common one is `id_tbl`.

Take for example the `hr` (heart rate) concept in the vitals category. In picdb, this is stored in the `"table": "chartevents"` in the column `"val_var": "valuenum"` whenever `"sub_var": "itemid"` contains one of the indices `"ids": ["1003"]`. The values are merged via `"index_var": "charttime"` and `"subject_id"` onto a list of all patients.

The entry `vitals["hr"]["sources"]["picdb"]` does not contain `"val_var"` and `"index_var"`. These are specified in `inst/extdata/config/data-sources/picdb.json[0]["tables"]["chartevents"]["defaults"]`.
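As a rough illustration of how the entry and the table defaults fit together, the following Python sketch writes both down as dictionaries and merges them. The keys and values are taken from the description above, but the exact shape of the JSON entry is an assumption, since the original listings are not reproduced here. The helper `resolve_item` is hypothetical and only mimics the lookup; it is not how ricu resolves defaults internally.

```python
import json

# Source entry for the `hr` concept, reconstructed from the text above
# (assumption: the entry holds only ids, table and sub_var).
hr_picdb_entry = {"ids": ["1003"], "table": "chartevents", "sub_var": "itemid"}

# Table-level defaults for `chartevents`; only the two keys named in the text.
chartevents_defaults = {"index_var": "charttime", "val_var": "valuenum"}


def resolve_item(entry, table_defaults):
    """Hypothetical helper: fill keys missing from the entry with table defaults."""
    # Keys present in the entry take precedence over the defaults.
    return {**table_defaults, **entry}


resolved = resolve_item(hr_picdb_entry, chartevents_defaults)
print(json.dumps(resolved, indent=2))
```

The point of the sketch is only that a concept entry can stay short because the column names shared by all items of a table live in one place.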
`"unit_var"` describes the column containing a text description of the unit of the measurement. For `hr`, this is `bpm`. The unit must be the same across all data sources; if it is not, the values need to be converted. See the concept `temperature`: the values in `miiv` (MIMIC-IV) corresponding to ids `223761, 224027` are in Fahrenheit and are converted to Celsius with the `convert_unit` callback.

How to match `table, ids` to concepts?

The mappings from `table` and `ids` to concepts can be found in the `D_ICD_DIAGNOSES.csv`, `D_ITEMS.csv`, and `D_LABITEMS.csv` files. E.g., the first rows of `D_ITEMS.csv` translate, among others, into the `hr` entry for the `picdb` source above.

Which concepts are integrated with this PR?
I have added source entries (using this script) where either the key of the concept (e.g., `hr`) or the `description` value (e.g., `"heart rate"`) matches (case-insensitive) the column `"LABEL"` of `D_ITEMS.csv` or `D_LABITEMS.csv`. There are still many concepts with an empty entry for `picdb`, where more understanding of what the variables mean is needed.

There are also entries in `D_ITEMS` corresponding to the `surgery_vital_signs` table. Adding these probably requires some more work, so I've removed them in this commit.

How can I contribute?
Pick an existing concept from a `ricu/inst/extdata/config/concept-dict/*.json`. Go through the `D_ITEMS`, `D_LABITEMS`, and `D_ICD_DIAGNOSES` tables and see whether the concept name is mentioned. From this, create an entry below the `"sources"` key for `picdb`. Note that `picdb` has a format similar to `mimic` and `miiv`, so you can use their entries for orientation. Open a PR to this repository, adding the concept (or multiple ones). Please, even if you think this is obvious, write a short description of your reasoning for adding the concept this way, for each concept you add.

If you have any questions, please ask!
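For the search step described above, a minimal Python sketch of a case-insensitive lookup might look as follows. It is not the script used in this PR. Only the `"LABEL"` column name comes from the description; the `ITEMID` and `LINKSTO` column names, the sample rows, the exact-match rule, and the function `find_candidates` are assumptions made for illustration.

```python
import csv
import io

# Hypothetical excerpt of D_ITEMS.csv. Only the "LABEL" column is confirmed by
# the text above; "ITEMID" and "LINKSTO" are assumed names, the rows are made up.
D_ITEMS_CSV = """ITEMID,LABEL,LINKSTO
1003,Heart Rate,chartevents
9999,Some Other Item,chartevents
"""


def find_candidates(concept_key, description, csv_text):
    """Return candidate picdb entries whose LABEL equals the concept key or
    its description, ignoring case (exact match is an assumption)."""
    wanted = {concept_key.lower(), description.lower()}
    candidates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["LABEL"].strip().lower() in wanted:
            candidates.append(
                {"ids": [row["ITEMID"]], "table": row["LINKSTO"], "sub_var": "itemid"}
            )
    return candidates


candidates = find_candidates("hr", "heart rate", D_ITEMS_CSV)
print(candidates)
```

Any candidate found this way still needs a manual check and a short written justification before it goes into the `"sources"` key, as requested above.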