Feature Request: hardcode_no_ct algorithm #40

Closed
rammprasad opened this issue Feb 7, 2024 · 3 comments · Fixed by #41
Labels: enhancement (New feature or request)

Comments

@rammprasad
Collaborator

rammprasad commented Feb 7, 2024

Feature Idea

The hardcode_no_ct algorithm will be implemented as a function. As described in the documentation, it will be used to hardcode a value.

Algorithm Description - Mapping a hardcoded value to a target SDTM variable that has no terminology restrictions.

Example mappings -
FA.FASCAT = 'COVID-19 PROBABLE CASE'
CM.CMTRT = 'FLUIDS'
CM.CMCAT = 'GENERAL CONCOMITANT MEDICATIONS'

function call

hardcode_no_ct(raw_dataset,
raw_variable,
target_sdtm_variable, 
target_hardcoded_value,
target_dataset,
merge_to_topic_by )

Input:
raw_dataset - R data frame. Usually the raw (source) dataset.

raw_variable - A character string. Name of the variable in the raw dataset.

target_sdtm_variable - A character string. Name of the SDTM variable to be derived.

target_hardcoded_value - A character string. The hardcoded text.

target_dataset - Optional parameter. The target dataset that was created in the previous step.

merge_to_topic_by - Optional parameter. A character vector with the names of the variables used to merge with target_dataset.

Output:
If target_dataset and merge_to_topic_by are not provided: a data frame with oak_id_vars and target_sdtm_variable.
Otherwise: target_dataset with one additional variable, target_sdtm_variable.
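To make the input/output contract above concrete, here is a minimal base R sketch. It is not the sdtm.oak implementation: the name hardcode_no_ct_sketch and the oak_id_vars argument are hypothetical, and the rule that the value is only assigned where raw_variable is populated is an assumption (one plausible reading of why raw_variable is passed at all), not something this issue states.

```r
# Hypothetical sketch of the behaviour described above; not the package code.
hardcode_no_ct_sketch <- function(raw_dataset,
                                  raw_variable,
                                  target_sdtm_variable,
                                  target_hardcoded_value,
                                  target_dataset = NULL,
                                  merge_to_topic_by = NULL,
                                  # Assumed identifier columns (hypothetical argument).
                                  oak_id_vars = c("oak_id", "raw_source", "patient_number")) {
  stopifnot(
    is.data.frame(raw_dataset),
    is.character(raw_variable), length(raw_variable) == 1,
    raw_variable %in% names(raw_dataset),
    all(oak_id_vars %in% names(raw_dataset))
  )

  # Keep only the identifier columns from the raw dataset.
  derived <- raw_dataset[, oak_id_vars, drop = FALSE]

  # ASSUMPTION: hardcode only on rows where the raw variable is not missing.
  derived[[target_sdtm_variable]] <- ifelse(
    is.na(raw_dataset[[raw_variable]]),
    NA_character_,
    target_hardcoded_value
  )

  # Without a target dataset or merge keys, return the identifiers plus the new variable.
  if (is.null(target_dataset) || is.null(merge_to_topic_by)) {
    return(derived)
  }

  # Otherwise left-join the new variable onto the target dataset by the merge keys.
  merge(
    target_dataset,
    derived[, c(merge_to_topic_by, target_sdtm_variable), drop = FALSE],
    by = merge_to_topic_by,
    all.x = TRUE
  )
}

# Toy data (the missing third value is invented to show the assumption).
md1 <- data.frame(
  oak_id = 1:3,
  raw_source = "MD1",
  patient_number = "PATNUM",
  MDRAW = c("BABY ASPIRIN", "CORTISPORIN", NA),
  stringsAsFactors = FALSE
)

out <- hardcode_no_ct_sketch(
  raw_dataset = md1,
  raw_variable = "MDRAW",
  target_sdtm_variable = "CMCAT",
  target_hardcoded_value = "GENERAL CONCOMITANT MEDICATIONS"
)
# CMCAT is NA on the third row, where MDRAW is missing.
```

Using a left join keeps every row of target_dataset even when a key has no match in the raw dataset, which seems to fit the "one additional variable" output described above.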

Relevant Input

sdtm spec

study_number: lp_study
raw_source_model: e-CRF
raw_dataset: MD1
raw_dataset_ordinal: 27
raw_dataset_label: Concomitant Medications
raw_variable: MDRAW
raw_variable_label: Medication
raw_variable_ordinal: 3
raw_variable_type: LongText
raw_data_format: $200
raw_codelist: NA
study_specific: FALSE
annotation_ordinal: 1
mapping_is_dataset: FALSE
annotation_text: CM.CMTRT
target_sdtm_domain: CM
target_sdtm_variable: CMCAT
target_sdtm_variable_role: Grouping Qualifier
target_sdtm_variable_codelist_code: NA
target_sdtm_variable_controlled_terms_or_format: NA
target_sdtm_variable_ordinal: 10
origin: CRF
mapping_algorithm: HARDCODE_NO_CT
entity_sub_algorithm: NA
target_hardcoded_value: GENERAL CONCOMITANT MEDICATIONS
target_term_value: NA
target_term_code: NA
condition_ordinal: NA
condition_group_ordinal: NA
condition_left_raw_dataset: NA
condition_left_raw_variable: NA
condition_left_sdtm_domain: NA
condition_left_sdtm_variable: NA
condition_operator: NA
condition_right_text_value: NA
condition_right_sdtm_domain: NA
condition_right_sdtm_variable: NA
condition_right_raw_dataset: NA
condition_right_raw_variable: NA
condition_next_logical_operator: NA
merge_type: NA
merge_left: NA
merge_right: NA
merge_condition: NA
unduplicate_keys: NA
groupby_keys: NA
target_resource_raw_dataset: NA
target_resource_raw_variable: NA

raw_dataset = MD1

| oak_id | raw_source | patient_number | MDRAW |
|---|---|---|---|
| 1 | MD1 | PATNUM | BABY ASPIRIN |
| 2 | MD1 | PATNUM | CORTISPORIN |
| 3 | MD1 | PATNUM | ASPIRIN |
| 4 | MD1 | PATNUM | DIPHENHYDRAMINE HCL |
| 5 | MD1 | PATNUM | PARCETEMOL |
| 6 | MD1 | PATNUM | VOMIKIND |
| 7 | MD1 | PATNUM | ZENFLOX OZ |
| 8 | MD1 | PATNUM | AMITRYPTYLINE |
| 9 | MD1 | PATNUM | BENADRYL |
| 10 | MD1 | PATNUM | DIPHENHYDRAMINE HYDROCHLORIDE |
| 11 | MD1 | PATNUM | TETRACYCLINE |
| 12 | MD1 | PATNUM | BENADRYL |
| 13 | MD1 | PATNUM | SOMINEX |
| 14 | MD1 | PATNUM | ZQUILL |

raw_variable = "MDRAW"

target_sdtm_variable = "CMCAT"

target_dataset = cm_inter - Let's assume CMTRT, CMINDC variables are already derived and CMCAT is the third variable being processed

| oak_id | raw_source | patient_number | CMTRT | CMINDC |
|---|---|---|---|---|
| 1 | MD1 | PATNUM | BABY ASPIRIN | NA |
| 2 | MD1 | PATNUM | CORTISPORIN | NAUSEA |
| 3 | MD1 | PATNUM | ASPIRIN | ANEMIA |
| 4 | MD1 | PATNUM | DIPHENHYDRAMINE HCL | NAUSEA |
| 5 | MD1 | PATNUM | PARCETEMOL | PYREXIA |
| 6 | MD1 | PATNUM | VOMIKIND | VOMITINGS |
| 7 | MD1 | PATNUM | ZENFLOX OZ | DIARHHEA |
| 8 | MD1 | PATNUM | AMITRYPTYLINE | COLD |
| 9 | MD1 | PATNUM | BENADRYL | FEVER |
| 10 | MD1 | PATNUM | DIPHENHYDRAMINE HYDROCHLORIDE | LEG PAIN |
| 11 | MD1 | PATNUM | TETRACYCLINE | FEVER |
| 12 | MD1 | PATNUM | BENADRYL | COLD |
| 13 | MD1 | PATNUM | SOMINEX | COLD |
| 14 | MD1 | PATNUM | ZQUILL | PAIN |

merge_to_topic_by = oak_id_vars, i.e. c("oak_id","raw_source","patient_number")
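For anyone who wants to try the example, the two input datasets above can be written out as plain R data frames. This is only a transcription of the tables in this issue; the object names md1 and cm_inter are chosen here for illustration, "PATNUM" is kept as the literal placeholder shown in the tables, and the "NA" in CMINDC is assumed to be a missing value.

```r
# Raw dataset MD1, transcribed from the table above.
md1 <- data.frame(
  oak_id = 1:14,
  raw_source = "MD1",
  patient_number = "PATNUM",  # placeholder value, as in the issue
  MDRAW = c(
    "BABY ASPIRIN", "CORTISPORIN", "ASPIRIN", "DIPHENHYDRAMINE HCL",
    "PARCETEMOL", "VOMIKIND", "ZENFLOX OZ", "AMITRYPTYLINE", "BENADRYL",
    "DIPHENHYDRAMINE HYDROCHLORIDE", "TETRACYCLINE", "BENADRYL",
    "SOMINEX", "ZQUILL"
  ),
  stringsAsFactors = FALSE
)

# Intermediate target dataset with CMTRT and CMINDC already derived.
cm_inter <- data.frame(
  md1[, c("oak_id", "raw_source", "patient_number")],
  CMTRT = md1$MDRAW,
  CMINDC = c(
    NA, "NAUSEA", "ANEMIA", "NAUSEA", "PYREXIA", "VOMITINGS", "DIARHHEA",
    "COLD", "FEVER", "LEG PAIN", "FEVER", "COLD", "COLD", "PAIN"
  ),
  stringsAsFactors = FALSE
)

# Identifier columns used as merge keys.
oak_id_vars <- c("oak_id", "raw_source", "patient_number")
```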

Relevant Output

Option 1 - When the function call is

hardcode_no_ct(
raw_dataset = MD1,
raw_variable = "MDRAW",
target_sdtm_variable = "CMCAT", 
target_hardcoded_value = "GENERAL CONCOMITANT MEDICATIONS",
target_dataset = cm_inter,
merge_to_topic_by = c("oak_id","raw_source","patient_number"))

output dataset from the function

| oak_id | raw_source | patient_number | CMTRT | CMINDC | CMCAT |
|---|---|---|---|---|---|
| 1 | MD1 | PATNUM | BABY ASPIRIN | NA | GENERAL CONCOMITANT MEDICATIONS |
| 2 | MD1 | PATNUM | CORTISPORIN | NAUSEA | GENERAL CONCOMITANT MEDICATIONS |
| 3 | MD1 | PATNUM | ASPIRIN | ANEMIA | GENERAL CONCOMITANT MEDICATIONS |
| 4 | MD1 | PATNUM | DIPHENHYDRAMINE HCL | NAUSEA | GENERAL CONCOMITANT MEDICATIONS |
| 5 | MD1 | PATNUM | PARCETEMOL | PYREXIA | GENERAL CONCOMITANT MEDICATIONS |
| 6 | MD1 | PATNUM | VOMIKIND | VOMITINGS | GENERAL CONCOMITANT MEDICATIONS |
| 7 | MD1 | PATNUM | ZENFLOX OZ | DIARHHEA | GENERAL CONCOMITANT MEDICATIONS |
| 8 | MD1 | PATNUM | AMITRYPTYLINE | COLD | GENERAL CONCOMITANT MEDICATIONS |
| 9 | MD1 | PATNUM | BENADRYL | FEVER | GENERAL CONCOMITANT MEDICATIONS |
| 10 | MD1 | PATNUM | DIPHENHYDRAMINE HYDROCHLORIDE | LEG PAIN | GENERAL CONCOMITANT MEDICATIONS |
| 11 | MD1 | PATNUM | TETRACYCLINE | FEVER | GENERAL CONCOMITANT MEDICATIONS |
| 12 | MD1 | PATNUM | BENADRYL | COLD | GENERAL CONCOMITANT MEDICATIONS |
| 13 | MD1 | PATNUM | SOMINEX | COLD | GENERAL CONCOMITANT MEDICATIONS |
| 14 | MD1 | PATNUM | ZQUILL | PAIN | GENERAL CONCOMITANT MEDICATIONS |

Option 2 - When used without merging

hardcode_no_ct(
raw_dataset = MD1,
raw_variable = "MDRAW",
target_sdtm_variable = "CMCAT", 
target_hardcoded_value = "GENERAL CONCOMITANT MEDICATIONS")

Output dataset

| oak_id | raw_source | patient_number | CMCAT |
|---|---|---|---|
| 1 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 2 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 3 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 4 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 5 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 6 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 7 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 8 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 9 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 10 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 11 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 12 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 13 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
| 14 | MD1 | PATNUM | GENERAL CONCOMITANT MEDICATIONS |
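One way to read the two options together: the Option 1 output looks like the Option 2 output left-joined onto target_dataset by the merge_to_topic_by keys. The base R snippet below illustrates that reading on the first three rows of the example data. It is an interpretation of the expected outputs, not a statement about how the function will be implemented, and the object names are illustrative.

```r
keys <- c("oak_id", "raw_source", "patient_number")

# First three rows of the intermediate target dataset (cm_inter) from the example.
cm_inter <- data.frame(
  oak_id = 1:3,
  raw_source = "MD1",
  patient_number = "PATNUM",
  CMTRT = c("BABY ASPIRIN", "CORTISPORIN", "ASPIRIN"),
  CMINDC = c(NA, "NAUSEA", "ANEMIA"),
  stringsAsFactors = FALSE
)

# Option 2 output for the same rows: identifiers plus the hardcoded CMCAT.
option_2 <- data.frame(
  oak_id = 1:3,
  raw_source = "MD1",
  patient_number = "PATNUM",
  CMCAT = "GENERAL CONCOMITANT MEDICATIONS",
  stringsAsFactors = FALSE
)

# Left join on the keys; every cm_inter row is kept and gains CMCAT.
option_1 <- merge(cm_inter, option_2, by = keys, all.x = TRUE)
option_1 <- option_1[order(option_1$oak_id), ]
```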

Reproducible Example/Pseudo Code

library(sdtm.oak)
library(dplyr)

cm <- cm_daw_data |>
  # Derive topic variable
  assign_no_ct(
    raw_dataset = MD1, 
    raw_variable = MDRAW,
    target_sdtm_var = CMTRT
  )  |>
  assign_no_ct(
    raw_dataset = MD1,
    raw_variable = MDIND,
    target_sdtm_var = CMINDC,
    merge_to_topic_by = c("oak_id","raw_source","patient_number")
  ) |>
  hardcode_no_ct(
    raw_dataset = MD1,
    raw_variable = "MDRAW",
    target_sdtm_variable = "CMCAT",
    target_hardcoded_value = "GENERAL CONCOMITANT MEDICATIONS",
    target_dataset = cm_inter,
    merge_to_topic_by = c("oak_id","raw_source","patient_number")
  )

Option 2 - Just to derive CMCAT

cm <- cm_daw_data |>
  hardcode_no_ct(
    raw_dataset = MD1,
    raw_variable = "MDRAW",
    target_sdtm_variable = "CMCAT",
    target_hardcoded_value = "GENERAL CONCOMITANT MEDICATIONS"
  )

rammprasad added the enhancement (New feature or request) label Feb 7, 2024
github-project-automation bot moved this to Product Backlog in sdtm.oak R package Feb 7, 2024
@rammprasad
Collaborator Author

@ramiromagno - Please take a look. If this is ok, I will create similar requirements for assign_ct, assign_no_ct

@ramiromagno
Collaborator

@rammprasad: Thank you for the examples. I've prepared some draft code for hardcode_no_ct() in PR #41. The idea is to quickly get your feedback on the gist of it, to see whether it aligns with the expected functionality. If I got it right, I can then add assertions for argument checking, polish the code here and there, and make a proper PR.

@ramiromagno
Collaborator

> @ramiromagno - Please take a look. If this is ok, I will create similar requirements for assign_ct, assign_no_ct

Yes, this is perfect. Please do the same for assign_ct() and assign_no_ct().

ramiromagno moved this from Product Backlog to In Progress in sdtm.oak R package Feb 13, 2024
ramiromagno self-assigned this Feb 16, 2024
ramiromagno moved this from In Progress to In review in sdtm.oak R package Apr 9, 2024
ramiromagno moved this from In review to Done in sdtm.oak R package May 15, 2024