Return original subjects IDs in the imputed datasets #382

nociale · 2022-11-14T18:27:47Z

It would be better if the subjid variable in the imputed datasets kept the same subject IDs as the input data.

library(rbmi)

data("antidepressant_data")
dat <- antidepressant_data

dat <- expand_locf(
    dat,
    PATIENT = levels(dat$PATIENT), # expand by PATIENT and VISIT 
    VISIT = levels(dat$VISIT),
    vars = c("BASVAL", "THERAPY"), # fill with LOCF BASVAL and THERAPY
    group = c("PATIENT"),
    order = c("PATIENT", "VISIT")
)
vars <- set_vars(
    outcome = "CHANGE",
    visit = "VISIT",
    subjid = "PATIENT",
    group = "THERAPY",
    covariates = c("BASVAL*VISIT", "THERAPY*VISIT")
)
method <- method_condmean(type = "bootstrap", n_samples = 0)
drawObj <- draws(
    data = dat,
    data_ice = NULL,
    vars = vars,
    method = method,
    quiet = TRUE
)
imputeObj <- impute(drawObj)
d <- extract_imputed_dfs(imputeObj)[[1]]
head(dat$PATIENT) # Original IDs (input data)
head(d$PATIENT) # New IDs (imputed dataset)

Original IDs:

New IDs:

This would be useful if one then wants to run other analyses that need the original IDs (e.g. to join two datasets on the IDs).

Was it necessary to change the IDs?


gowerc · 2022-11-17T11:56:30Z

It was essential to change the patient IDs to ensure they are unique when you specify the unstructured covariance matrix; otherwise you would be grouping observations across multiple patients who were sampled from the same original patient. If memory serves me right, there is an argument to extract_imputed_dfs() that returns an attribute on the data frame which can be used to map the new names -> old names.
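To make the reasoning above concrete, here is a minimal base R sketch of resampling subjects with replacement and relabelling each draw. It is a toy illustration and not rbmi's internal code; the data, the `new_pt_` prefix and all variable names are assumptions made up for the example.

```r
# Toy illustration only (not rbmi internals): resample subjects with
# replacement and give every draw its own unique ID.
set.seed(1)
ids <- c("P1", "P2", "P3")
toy <- data.frame(
    PATIENT = rep(ids, each = 2),
    VISIT = rep(c("V1", "V2"), times = 3),
    CHANGE = 1:6
)

# The same original patient may be drawn more than once
sampled <- sample(ids, size = length(ids), replace = TRUE)

# Hypothetical naming scheme for the relabelled draws
new_ids <- paste0("new_pt_", seq_along(sampled))

boot <- do.call(rbind, lapply(seq_along(sampled), function(i) {
    rows <- toy[toy$PATIENT == sampled[i], ]
    rows$PATIENT <- new_ids[i] # unique ID per draw, even for repeated patients
    rows
}))

# Named vector: names are the new IDs, values are the original IDs
idmap <- setNames(sampled, new_ids)
```

Without the relabelling step, a patient drawn twice would contribute four rows under one ID, and a model with one covariance block per subject would treat them as a single patient with repeated visits.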

nociale · 2022-11-17T12:48:17Z

Indeed, setting the argument idmap = TRUE will return an attribute on the dataframe. This attribute is a named vector that has values equal to the original IDs and names equal to the new IDs.

An easy way to add the original IDs to an imputed dataset:

d <- extract_imputed_dfs(imputeObj, idmap = TRUE)[[1]]
idmap <- attributes(d)$idmap
d$original_id <- idmap[match(d[[vars$subjid]], names(idmap))]
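Once the original IDs are attached, joining to another dataset is a plain `merge()`. The sketch below uses a small stand-in data frame in place of a real imputed dataset so that it runs without rbmi; the ID values and the `demo` table with its `SEX` column are hypothetical.

```r
# Stand-in for one element of extract_imputed_dfs(imputeObj, idmap = TRUE);
# the values here are invented for illustration.
d <- data.frame(
    PATIENT = c("new_pt_1", "new_pt_1", "new_pt_2"),
    CHANGE = c(-1.2, -3.4, 0.5)
)
attr(d, "idmap") <- c(new_pt_1 = "1503", new_pt_2 = "1507")

# Look up the original ID for every row by name
idmap <- attr(d, "idmap")
d$original_id <- unname(idmap[as.character(d$PATIENT)])

# Hypothetical second dataset keyed by the original IDs
demo <- data.frame(original_id = c("1503", "1507"), SEX = c("F", "M"))

# Left join: keep every imputed row, add the matching columns from demo
joined <- merge(d, demo, by = "original_id", all.x = TRUE)
```

Indexing `idmap` by name is equivalent to the `match()` call above as long as the names of `idmap` are unique, which they are by construction.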

Thanks a lot.
I will close this issue.

gowerc · 2022-11-17T13:16:09Z

@nociale, I have re-opened the issue as I think it might be worth adding something more explicit about this to one of the vignettes.

gowerc · 2022-11-17T13:17:10Z

Alternatively, maybe it's worth updating the function to add an "original_id" column instead of just returning the attribute?

nociale · 2022-11-18T15:51:44Z

Yes, good idea! We could either (1) set idmap = TRUE by default, or (2) return the "original_id" instead of the modified IDs. If we go with the latter, we could remove the idmap argument if it is no longer needed. My preference is for (2).
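Until the package changes, either behaviour can be approximated on the user side. The helper below is a hypothetical sketch, not part of rbmi: `add_original_id()` and its `replace` argument are invented names, and it only assumes that the data frame carries the `idmap` attribute described earlier in the thread.

```r
# Hypothetical user-side helper (not an rbmi function).
# replace = FALSE adds an "original_id" column next to the new IDs;
# replace = TRUE overwrites the subject ID column with the original IDs.
add_original_id <- function(df, subjid, replace = FALSE) {
    idmap <- attr(df, "idmap")
    if (is.null(idmap)) {
        stop("No 'idmap' attribute found; extract with idmap = TRUE first.")
    }
    original <- unname(idmap[as.character(df[[subjid]])])
    if (replace) {
        df[[subjid]] <- original
    } else {
        df$original_id <- original
    }
    df
}

# Stand-in for an imputed dataset; the values are invented for illustration
d <- data.frame(
    PATIENT = c("new_pt_1", "new_pt_1", "new_pt_2"),
    CHANGE = c(-1.2, -3.4, 0.5)
)
attr(d, "idmap") <- c(new_pt_1 = "1503", new_pt_2 = "1507")

with_column <- add_original_id(d, "PATIENT")
overwritten <- add_original_id(d, "PATIENT", replace = TRUE)
```

With real output this could be applied to every dataset via something like `lapply(extract_imputed_dfs(imputeObj, idmap = TRUE), add_original_id, subjid = "PATIENT")`. Note that with `replace = TRUE` the IDs may no longer be unique within a dataset if the same patient was sampled more than once.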

nociale added the "bug" label Nov 14, 2022

nociale mentioned this issue Nov 14, 2022

274 Return frequentist MMRM model object #369

Open

nociale closed this as completed Nov 17, 2022

gowerc reopened this Nov 17, 2022
