Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

999 set default datanames to ls(data@env) #1004

Merged
merged 24 commits into from
Dec 18, 2023

Conversation

kartikeyakirar
Copy link
Contributor

@kartikeyakirar kartikeyakirar commented Dec 11, 2023

this fixes #999

Here, I am assigning the default dataset of data types as "data.frame" and "MultiAssayExperiment", which can be altered to any other class type as needed.

Edit: added function update_default_dataname, is designed to retrieve and update default datanames from a teal_data object. Purpose of this function is to extract environment object from a teal_data object. If the object does not already have predefined dataset names, the function sets default names based on the environment.

Edit: Based on the suggestion, the init function has been updated to avoid modifying the teal_data object. Instead, if datanames are not specified, it now passes the app through. Moreover, the teal_data_to_filtered_data function has been adjusted to handle cases where datanames are not pre-defined in the teal_data object. In such case, this function assigns default datanames from the environment of the teal_data object.

Assertion Different data types in FilteredData will be addressed in issue insightsengineering/teal.slice#493

R/utils.R Outdated Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
@kartikeyakirar kartikeyakirar marked this pull request as draft December 11, 2023 12:12
Merge branch '999_default_dataname@main' of https://github.com/insightsengineering/teal into 999_default_dataname@main

# Conflicts:
#	R/utils.R
@kartikeyakirar kartikeyakirar marked this pull request as ready for review December 11, 2023 12:51
@kartikeyakirar
Copy link
Contributor Author

Assertion on FilteredDATA will be addressed in issue insightsengineering/teal.slice#493

@chlebowa
Copy link
Contributor

Update initial comment.

Copy link
Contributor

github-actions bot commented Dec 11, 2023

Unit Tests Summary

    1 files    19 suites   11s ⏱️
202 tests 200 ✔️ 2 💤 0
402 runs  399 ✔️ 3 💤 0

Results for commit f611e55.

♻️ This comment has been updated with latest results.

Copy link
Contributor

github-actions bot commented Dec 11, 2023

badge

Code Coverage Summary

Filename                          Stmts    Miss  Cover    Missing
------------------------------  -------  ------  -------  -------------------------------------------------------------------------------------------------------------------------------
R/dummy_functions.R                  97      88  9.28%    9-71, 93-106, 109-116, 131-146
R/get_rcode_utils.R                  32       1  96.88%   52
R/include_css_js.R                   24       0  100.00%
R/init.R                             80      31  61.25%   114-121, 145, 164-185, 215-217, 219-220
R/landing_popup_module.R             25      25  0.00%    61-87
R/module_filter_manager.R           107      36  66.36%   49-55, 62-70, 79-84, 228, 233-246
R/module_nested_tabs.R              149      58  61.07%   64-137, 153, 205, 227, 253
R/module_snapshot_manager.R         209     157  24.88%   87-99, 127-136, 140-152, 154-161, 168-182, 186-188, 190-195, 198-208, 211-227, 236-251, 265-288, 291-302, 305-311, 325, 346-369
R/module_tabs_with_filters.R         73      33  54.79%   60-95, 127, 140
R/module_teal_with_splash.R          94       4  95.74%   67, 88, 153-154
R/module_teal.R                     141      32  77.30%   68, 71, 158-159, 165, 196, 204-205, 227-259
R/modules_debugging.R                19      19  0.00%    25-45
R/modules.R                         155      26  83.23%   119, 132, 224-227, 241-246, 257-261, 391-434
R/reporter_previewer_module.R        18       2  88.89%   26, 30
R/show_rcode_modal.R                 20      20  0.00%    16-37
R/tdata.R                            53       1  98.11%   153
R/teal_data_module-eval_code.R       27       0  100.00%
R/teal_data_module-within.R           7       0  100.00%
R/teal_data_module.R                  6       0  100.00%
R/teal_reporter.R                    60       5  91.67%   65, 116-117, 120, 137
R/teal_slices-store.R                29       0  100.00%
R/teal_slices.R                      59      12  79.66%   135-148
R/utils.R                           111      27  75.68%   112-139
R/validate_inputs.R                  32       0  100.00%
R/validations.R                      58      37  36.21%   109-371
R/zzz.R                              11       7  36.36%   3-14
TOTAL                              1696     621  63.38%

Diff against main

Filename                       Stmts    Miss  Cover
---------------------------  -------  ------  -------
R/module_teal_with_splash.R       +5       0  +0.24%
R/tdata.R                         -6       0  -0.19%
R/utils.R                         +3       0  +0.68%
TOTAL                             +2       0  +0.04%

Results for commit: f611e55

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

Copy link
Contributor

github-actions bot commented Dec 11, 2023

Unit Test Performance Difference

Additional test case details
Test Suite $Status$ Time on main $±Time$ Test Case
module_teal_with_splash 👶 $+0.03$ srv_teal_with_splash_passes_when_datanames_are_empty_with_warning
module_teal_with_splash 💀 $0.02$ $-0.02$ srv_teal_with_splash_throws_when_datanames_are_empty
utils 👶 $+0.00$ teal_data_datanames_returns_datanames_which_are_set_by_teal.data_datanames
utils 👶 $+0.00$ teal_data_datanames_returns_names_of_the_env_s_objects_when_datanames_not_set
utils 👶 $+0.01$ teal_data_to_filtered_data_return_FilteredData_class

Results for commit 4fd528d

♻️ This comment has been updated with latest results.

@pawelru
Copy link
Contributor

pawelru commented Dec 11, 2023

I personally have some mixed feelings about this - not implementation per se but rather a design outlined in the issue. It's because those are implicit modifications of the data object which are pretty much hidden behind the scenes for the end-user. As a general rule, explicit is better than implicit.

As a direct consequence: data object passed to init() is different to the one being used which will make debugging and maintenance more difficult. Being super realistic now - it's not a big deal right now as the delta is relatively small but it's more about creating such precedence which is very likely to grow in the future.
I think it would be much better not to modify data object once it enters the init.

How about moving this into the constructor? This would address the use case of teal_data(a = ..., b = ..., ...) but not teal_data() |> within(...).

Related to this PR:

@chlebowa
Copy link
Contributor

How about moving this into the constructor? This would address the use case of teal_data(a = ..., b = ..., ...) but not teal_data() |> within(...).

And that is the crux of the issue as we keep recommending using within.

I agree about explicit > implicit. BUT we are changing the philosophy a little here.
CURRENTLY datanames is a statement of which variables are datasets and therefore eligible for filtering.
SOON datanames will be a subset of variables that will be directed to filtering if the module call takes datanames = "all". If some datanames are specified in module(), we trust the app dev to name eligible datasets.

@pawelru
Copy link
Contributor

pawelru commented Dec 11, 2023

How about moving this into the constructor? This would address the use case of teal_data(a = ..., b = ..., ...) but not teal_data() |> within(...).

And that is the crux of the issue as we keep recommending using within.

That's a valid point. Moving this to the constructor won't address the issue then...

CURRENTLY datanames is a statement of which variables are datasets and therefore eligible for filtering. SOON datanames will be a subset of variables that will be directed to filtering if the module call takes datanames = "all". If some datanames are specified in module(), we trust the app dev to name eligible datasets.

Yes. I personally have some problems with this as I used to have a different meaning of datanames but it seems that I need to adapt and/or let's discuss it separately. Sticking with the definition you outlined in this thread for clarity.

So if one does not tell which variables from data object are eligible for filtering I think we shouldn't alter that object. Now I'm thinking that this should be an exception in filter panel init. I'm challenging the place / time we do this.

@chlebowa
Copy link
Contributor

At the moment the filter panel can only handle data.frame and MultiAssayExperiment and will crash if handed anything else. Not for long.

As for the (teal_data) object being modified, note that it is disassembled when creating the filter panel and reassembled after filtering. The modules don't receive the entity that the app dev creates. By the datanames are no longer relevant.

@pawelru
Copy link
Contributor

pawelru commented Dec 11, 2023

note that it is disassembled when creating the filter panel

I'm exactly suggesting doing this during that step. When disassembling we should throw a message and use all objects.

@gogonzo
Copy link
Contributor

gogonzo commented Dec 12, 2023

When disassembling we should throw a message and use all objects.

Consequence of this will be triggering each module on change of any dataset (even if dataset isn't used in a module).

So if one does not tell which variables from data object are eligible for filtering I think we shouldn't alter that object. Now I'm thinking that this should be an exception in filter panel init. I'm challenging the place / time we do this.

<teal_module>$datanames indicates which datasets should be sent to the <teal_module>$server when initializing. Filtering is done automatically on these datanames - there is no way to say "i want these filtered" and "these received by the module$server". What you propose seems to be another argument of the module() function, or extension of the teal_slices api.


I would like to highlight importance of datanames and that teal actually handles situation when datanames are not known when the application starts. When all modules have datanames = "all" and init(data = <teal_data_module>) then datanames are not known. datanames are necessary during initialization of the modules and filter panel.

Another scenario, if we consider everything in the teal_data as a dataname then FilteredData could be initialized with unnecessary objects, like temporary objects or functions

@pawelru
Copy link
Contributor

pawelru commented Dec 12, 2023

@gogonzo I'm getting a little confused now

CURRENTLY datanames is a statement of which variables are datasets and therefore eligible for filtering

there is no way to say "i want these filtered"

These are somewhat conflicting statements.

Another scenario, if we consider everything in the teal_data as a dataname then FilteredData could be initialized with unnecessary objects, like temporary objects or functions

Yes. I am expecting that in the above mentioned issue of filter panel - we would either render empty div of throw warning and exclude objects of classes we don't support currently.

Let me rephrase my point - we should avoid writting to data (and any other received arguments) inside init() as this is an implicit object mutation - we should only read from it. Proposed changes include write actions.
IMHO if we want to support an use case of missing datanames, this should be done as a exception handler of a given functionality (filter panel init, module init and any other).

if (<condition>(data)) {
  foo_special_case(data)
} else {
  foo_standard(data)
}

Let me deep dive into the code base so that I could be much more concrete

R/init.R Outdated Show resolved Hide resolved
@kartikeyakirar
Copy link
Contributor Author

When passing datanames from the environment to some modules in teal.gallery apps, some issues are observed. e.g datanames are displayed in the tm_data_table.
image . I tried to fix this by removing the assertion on validate_has_data (its not solution)

also if have function defined in within it fails the app
image

Do we need to modify modules as well to filter out valid data to be displayed ?

@chlebowa
Copy link
Contributor

When passing datanames from the environment to some modules in teal.gallery apps, some issues are observed. e.g datanames are displayed in the tm_data_table.

You have to modify the app by setting the modules's datanames, I assume it's "all" now.

@kartikeyakirar
Copy link
Contributor Author

kartikeyakirar commented Dec 14, 2023

You have to modify the app by setting the modules's datanames, I assume it's "all" now.

yes, I was experimenting with teal.gallery app by not setting datanames and running app. I realized this is app-developer responsibility to set correct datanames.By default, the data names are set to ALL. However, I assumed that when modules were built, the data names were always data frames or multi-assay experiment classes. But now, it can be a function or variable, so that's why I wondered if we needed to modify the modules considering possibilities that if All is set then instead of failing the app how to handle the different datatypes.

@gogonzo
Copy link
Contributor

gogonzo commented Dec 14, 2023

yes, I was experimenting with teal.gallery app by not setting datanames and running app. I realized this is app-developer responsibility to set correct datanames.By default, the data names are set to ALL. However, I assumed that when modules were built, the data names were always data frames or multi-assay experiment classes. But now, it can be a function or variable, so that's why I wondered if we needed to modify the modules considering possibilities that if All is set then instead of failing the app how to handle the different datatypes.

Yes, it is a responsibility of a module to skip unsupported datasets or app developer to correctly specify module$datanames. Even now, teal.modules.general::tm_data_table will fail with MultiAssayExperiment.

Problem is that at this moment teal.gallery apps-code creates some garbage in teal_data class, which fails filter-panel creation. It should be okey with this PR

@chlebowa
Copy link
Contributor

It should be okey with this PR

This will still happen.

image

@kartikeyakirar
Copy link
Contributor Author

It should be okey with this insightsengineering/teal.slice#497

I am running app by loading teal.slice from insightsengineering/teal.slice#497

R/init.R Outdated Show resolved Hide resolved
R/init.R Outdated Show resolved Hide resolved
kartikeyakirar and others added 2 commits December 14, 2023 17:56
Co-authored-by: Dawid Kałędkowski <[email protected]>
Signed-off-by: kartikeya kirar <[email protected]>
@gogonzo
Copy link
Contributor

gogonzo commented Dec 14, 2023

Not setting datanames in the teal_data object causes problems. get_code(datanames) asserts on datanames. Can we relax get_code to allow any "name"?
What do you think @m7pr ?

data <- teal_data() |>
  within({
    set.seed(1)
    library(scda)
    library(airway)
    library(MultiAssayExperiment)
    data(miniACC, envir = environment())
    data(airway, envir = environment())
    any_variable <- "elo"
    ADSL <- synthetic_cdisc_data("latest")$adsl
    ADSL$categorical <- sample(letters[1:3], size = nrow(ADSL), replace = TRUE, prob = c(.1, .3, .6))
    ADTTE <- synthetic_cdisc_data("latest")$adtte
    ADRS <- synthetic_cdisc_data("latest")$adrs
  })

get_code(data, datanames = "ADRS")
# Error in .local(object, deparse, ...) : 
#   Assertion on 'datanames' failed: Must be a subset of the empty set, i.e. also empty.

@chlebowa
Copy link
Contributor

Is this a problem in the app? By the time we get to get_code, datanames have been set by teal.

Copy link
Contributor

@gogonzo gogonzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a problem in the app? By the time we get to get_code, datanames have been set by teal.

It is not a Today's problem. We use get_code_dependency in .datasets_to_data so we are not limited by the fact that @datanames of data object is not set.

Normally, if we use get_code(datanames=) we wouldn't be able to run get_code as in teal we avoid to set default datanames(data) <- names(data@env) when datanames(data) is empty. If we decide to use get_code() in the future, we will be forced to either:

  1. Relax get_code(datanames=) to accept any name, or
  2. Force-set default datanames to data object in early teal stages.

This is not ideal but I don't request @kartikeyakirar to resolve all of this now. At this moment PR addresses what we need.

@gogonzo
Copy link
Contributor

gogonzo commented Dec 15, 2023

@kartikeyakirar could you please add @details section in init documentation explaining datanames process?

R/utils.R Show resolved Hide resolved
@m7pr
Copy link
Contributor

m7pr commented Dec 15, 2023

About datanames assertion for get_code,teal_data https://github.com/insightsengineering/teal.data/blob/main/R/teal_data-get_code.R#L39-L40 I would prefer to Force-set default datanames to data object in early teal stages. so that get_code works only for teal_data objects with proper specification (non empty @code slot and datanames attribute)

@kartikeyakirar
Copy link
Contributor Author

@kartikeyakirar could you please add @details section in init documentation explaining datanames process?

done f611e55

@kartikeyakirar kartikeyakirar merged commit 0a6222f into main Dec 18, 2023
23 checks passed
@kartikeyakirar kartikeyakirar deleted the 999_default_dataname@main branch December 18, 2023 08:10
m7pr added a commit that referenced this pull request Dec 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Set default datanames to ls(data@env)
6 participants