Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce teal_data class #178

Merged
merged 92 commits into from
Oct 30, 2023
Merged

introduce teal_data class #178

merged 92 commits into from
Oct 30, 2023

Conversation

gogonzo
Copy link
Contributor

@gogonzo gogonzo commented Oct 18, 2023

Part of:

Closes:

Other PRs:

install

staged.dependencies::dependency_table(
  project = "insightsengineering/teal@https://github.com",
  project_type = "repo@host",
  ref = "teal_data@main",
  verbose = 1
) |> staged.dependencies::install_deps()

teal_data class

tdata-qenv, tdata inheritance

  • teal_data class (teal_data-class.R) is en extension of qenv. teal_data contains extra slots datanames and join_keys. Besides, all teal.code methods works with teal_data object (eval_code, print etc.)
  • Old TealData$server returns teal_data now - to have single object type in teal (not in teal_module)
  • cdisc_data() and teal_data() return TealData or teal_data
ADSL <- synthetic_cdisc_data("latest")$adsl
ADTTE <- synthetic_cdisc_data("latest")$adtte
ADRS <- synthetic_cdisc_data("latest")$adrs

data <- cdisc_data(
  ADSL = ADSL,
  ADTTE = ADTTE,
  ADRS = ADRS,
  code = quote({
    ADSL <- synthetic_cdisc_data("latest")$adsl
    ADTTE <- synthetic_cdisc_data("latest")$adtte
    ADRS <- synthetic_cdisc_data("latest")$adrs
  })  
)

data |> eval_code(quote(a <- ADSL))

gogonzo and others added 30 commits August 14, 2023 14:11
`ddl` implementation alternative to #161 .
Complemented by [this
PR](insightsengineering/teal#922).

In order to simplify the user (app dev) experience, I tried to
streamline the logic.

In order to create a `ddl` connector module, one has to:
1. use `input_template` to create the module: enumerate input widgets
2. provide a function, `on_submit`, to be run when `"submit"` button is
clicked; function takes input values wrapped in a list called `input`
and body refers to input values with `input$<value>` or
`input[["<value>"]]`
3. optionally provide mask for input values that will be used in code of
resulting `tdata` object
4. specify names of data sets for compatibility with `teal` (I don't
like it)
5. optionally specify join keys as one would previously, for
compatibility with `teal`; defaults to empty `teal.data::join_keys()`

When inputs are submitted, `on_submit` is passed to a function that
extracts the body, substitutes `input` placeholders with input values
and evaluates the code to obtain data sets in a separate environment.
Then it replaces the input values in the code with ones provided in
`mask` (if any) and uses the environment and the masked code to create
`tdata`.

Much like in the solution proposed on branch `refactor`, the user
provides code to obtain data sets and replacements for input values, and
data is created in separate environment, which is then used to create
`tdata` with masked code.

Unlike that solution, the user specifies everything in one place, rather
than having to define module ui, module server that runs a
post-processing function, the post-processing function itself, etc. This
is easier to understand **for me**.
Another difference is that the user provides code as code with `input$`
references, not text with `glue` syntax (`{ value }`). This is done move
focus to masking rather than have the user think about "online" and
"offline" arguments. It also uses pure base R.


#### MOCK DATABASE CONNECTION
```
pullme <- function(username, password) {
  if (username == "user" && password == "pass") {
    message("connection established")
  } else {
    stop("invalid credentials")
  }
}
closeme <- function() {
  message("connection closed")
}
```
#### MODULE DEFINITION
```
library(shiny)

thefun <- function(input) {
  on.exit(try(closeme()))
  pullme(username = input$user, password = input$pass)
  adsl <- scda::synthetic_cdisc_data('latest')$adsl
  adtte <- scda::synthetic_cdisc_data('latest')$adtte
}
themask <- list(
  user = quote(askpass("who are you?")),
  pass = quote(askpass("password please"))
)
module <- input_template(
  on_submit = thefun,
  mask = themask,
  datanames = c("adsl", "adtte"),
  textInput("user", "username", value = "user", placeholder = "who goes there?"),
  passwordInput("pass", "password", value = "pass", placeholder = "friend or foe?"),
  actionButton("submit", "get it")
)
```
#### AN APP
```
devtools::load_all("../teal.slice")
devtools::load_all("../teal")
devtools::load_all(".")

ui <- fluidPage(
  tagList(
    module$ui("id"),
    uiOutput("val")
  )
)
server <- function(input, output, session) {
  tdata <- module$server("id")
  output[["value"]] <- renderPrint({
    tdata()
  })
  output[["code"]] <- renderPrint({
    cat(teal.code::get_code(tdata()), sep = "\n")
  })
  output[["val"]] <- renderUI({
    req(tdata())
    tagList(
      verbatimTextOutput("value"),
      verbatimTextOutput("code")
    )
  })
}
if (interactive()) shinyApp(ui, server)
```


#### A TEAL APP
```
funny_module <- function (label = "Filter states", datanames = "all") {
  checkmate::assert_string(label)
  module(
    label = label,
    datanames = datanames,
    ui = function(id, ...) {
      ns <- NS(id)
      div(
        h2("The following filter calls are generated:"),
        verbatimTextOutput(ns("filter_states")),
        verbatimTextOutput(ns("filter_calls")),
        actionButton(ns("reset"), "reset_to_default")
      )
    },
    server = function(input, output, session, data, filter_panel_api) {
      checkmate::assert_class(data, "tdata")
      observeEvent(input$reset, set_filter_state(filter_panel_api, default_filters))
      output$filter_states <-  renderPrint({
        logger::log_trace("rendering text1")
        filter_panel_api %>% get_filter_state()
      })
      output$filter_calls <- renderText({
        logger::log_trace("rendering text2")
        attr(data, "code")()
      })
    }
  )
}

devtools::load_all("../teal.slice")
devtools::load_all("../teal")
devtools::load_all(".")

app <- init(
  data = module,
  modules = modules(
    funny_module("funny1"),
    funny_module("funny2", datanames = "adtte") # will limit datanames to ADTTE and ADSL (parent)
  )
)
shinyApp(app$ui, app$server)
```
Fixes the broken tests due to the teal_data refactor

---------

Co-authored-by: go_gonzo <[email protected]>
@chlebowa
Copy link
Contributor

Even though this branch is green, these tests fail on my machine 🤔

Failure (test-TealDatasetConnector.R:567:3): code_dataset_connector - library calls
`data <- cdisc_data(adsl, adtte, adrs, check = TRUE)` did not produce any warnings.
Failure (test-TealDatasetConnector.R:567:3): code_dataset_connector - library calls
`data <- cdisc_data(adsl, adtte, adrs, check = TRUE)` did not produce any warnings.

chlebowa and others added 2 commits October 27, 2023 12:38
Adds function `datanames` to get or set the contents of the `@datanames`
slot in `teal_data` objects.

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@averissimo averissimo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

teal_data looks very good at the current state!

Some minor comments, below.. The test/testhat/test-TealDatasetConnector.R is the only one that seems important to merge.


My only big grip is one layer down with using S4 classes, in part by having exposed slots in qenv and by consequence on teal_data class.

It seems more beneficial to keep those fields as private and using active bindings to "expose" those as read-only.

However, converting qenv to R6 is a whole new big project and may introduce caveats as well (hence the use of S4... this predates my joining of the project and I'm not familiar with all the reasons).

NEWS.md Outdated Show resolved Hide resolved
R/get_join_keys.R Show resolved Hide resolved
R/get_join_keys.R Show resolved Hide resolved
tests/testthat/test-TealDatasetConnector.R Outdated Show resolved Hide resolved
@averissimo suggestions

Co-authored-by: André Veríssimo <[email protected]>
Signed-off-by: Dawid Kałędkowski <[email protected]>
@gogonzo gogonzo enabled auto-merge (squash) October 27, 2023 14:09
NEWS.md Show resolved Hide resolved
R/JoinKeys.R Outdated Show resolved Hide resolved
R/JoinKeys.R Show resolved Hide resolved
R/JoinKeys.R Outdated
Comment on lines 406 to 407
#' The `dataname` is extrapolated from the name (or fallback to the value itself if
#' it's a `character(1)`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely clear.

R/JoinKeys.R Outdated
Comment on lines 434 to 442
if (name %in% names(default_cdisc_keys)) {
# Set default primary keys
keys_list <- default_cdisc_keys[[name]]
join_keys[name] <- keys_list$primary

if (!is.null(keys_list$parent) && !is.null(keys_list$foreign)) {
join_keys[name, keys_list$parent] <- keys_list$foreign
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block should go into the block in line 424. The logic will be reflected better and the return(NULL) statements can be removed. The last case becomes redundant altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not true. If name is not null then previous if-else doesn't exit (don't return NULL)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this?

    if (checkmate::test_class(item, "JoinKeySet")) {
      join_keys$set(item)
    } else {
      if ((is.null(name) || identical(trimws(name), "")) && is.character(item)) {
        name <- item
      } 
      if (name %in% names(default_cdisc_keys)) {
        # Set default primary keys
        keys_list <- default_cdisc_keys[[name]]
        join_keys[name] <- keys_list$primary
        
        if (!is.null(keys_list$parent) && !is.null(keys_list$foreign)) {
          join_keys[name, keys_list$parent] <- keys_list$foreign
        }
      }
    }

On another note, I think using lapply like this is missing the point of lapply. A loop could actually be easier to read and we wouldn't have to consider return(NULL)s.

R/teal_data-class.R Show resolved Hide resolved
R/utils.R Outdated Show resolved Hide resolved
R/teal_data.R Outdated Show resolved Hide resolved
R/teal_data.R Outdated Show resolved Hide resolved
R/teal_data.R Outdated Show resolved Hide resolved
R/JoinKeys.R Outdated Show resolved Hide resolved
#' @rdname teal_data-class
#'
#' @slot env (`environment`) environment containing data sets and possibly auxiliary variables
#' Access variables with [get_var()] or [`[[`].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' Access variables with [get_var()] or [`[[`].
#' Access variables with [`get_var`] or [`[[`].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is wrong with [get_var()]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a function call, usually docs mention function names, and in code format.

#' Access with [get_warnings()].
#' @slot messages (`character`) messages raised when evaluating code.
#' @slot join_keys (`JoinKeys`) object specifying joining keys for data sets in `@env`.
#' Access or modify with [get_join_keys()].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' Access or modify with [get_join_keys()].
#' Access or modify with [`get_join_keys`].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not according to roxygen2 docs.

image

See, in the rendered docs - Every [fun()] is rendered as fun() in the form of link. Topics have no () so it is better to distinguish what is a topic and what is a function.

Skärmavbild 2023-10-30 kl  10 39 19

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for using () in function names

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinction between topic and function seems unclear to me. Is a function NOT a topic? What is a topic then?

Here's my view:
The docs should read "Change datanames slot with datanames function."
One changes the slot with datanames<object> <- <new_datanames> not with datanames(). If you insist on the latter, you lose consistency with [[.

Look at help pages for environment, get, assign, etc. Hardly a foo() in sight.

Copy link
Contributor

@chlebowa chlebowa Oct 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus I always thought that table is an example, not a directive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so a topic would be a help page and one can be dedicated to a concept or a function or a group of functions or anything in between. Correct?

#' Access or modify with [get_join_keys()].
#' @slot datanames (`character`) vector of names of data sets in `@env`.
#' Used internally to distinguish them from auxiliary variables.
#' Access or modify with [datanames()].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#' Access or modify with [datanames()].
#' Access or modify with [`datanames`].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, datanames is a topic

Copy link
Contributor

@chlebowa chlebowa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go!

@gogonzo gogonzo merged commit e14426f into main Oct 30, 2023
23 checks passed
@gogonzo gogonzo deleted the teal_data@main branch October 30, 2023 11:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[summary] data refactor
6 participants