Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

init Archiver module #177

Closed
wants to merge 14 commits into from
Closed

init Archiver module #177

wants to merge 14 commits into from

Conversation

Polkas
Copy link
Contributor

@Polkas Polkas commented Dec 13, 2022

closes #81

Simply run the Previewer app from the vignette and test it yourself.

TODO

  • test with different bootstrap versions (3,4, and 5)
  • input validation
  • separate shiny app - a new vignette
  • load back, a way to validate the archiver zip
  • UI design
  • docs
  • test the module, testServer
  • Should we record the Archiver creator?
  • ...
teal_report_archiver_demo.mov

@lcd2yyz
Copy link

lcd2yyz commented Dec 14, 2022

Pretty cool feature, very nice work!

Question: When the JSON file is downloaded, what's the actual content? Is it just the code/setting or the actual visualization? Asking because what if the input datasets changed between the download/upload?
Comment 1: This feature make more sense to live in the Previewer tab, as it's restoring the contents to be shown in the Previewer.
Comment 2: It's not clear from just looking at the name/label "Archiver" what the functionality actually do. Perhaps should consider a more self-explanatory naming or adding description label to indicate it's saving/restoring the analysis cards?

@Polkas
Copy link
Contributor Author

Polkas commented Dec 14, 2022

TL;DR; There are many possible improvements to this feature.

  1. Question. That is a good comment. We already talked about the validation of datasets. Without additional protection, we will risk using different data in the same report (END USER responsibility). The idea of a dataset hash for each report card was proposed, which is automatic but adds complexity. Then Reporter (not card as then will be added once) (singleton instance - unique one for the whole session) requires an additional method like set_hash which will be optionally specified by the app developer when the reporter instance is created (the main unique element shared across the session). This hash have to be transferred to each Card when saved.
    I have an even more straightforward idea like Archiver ID_ARCHIVER (server function argument) set by the shiny app developer, and later each archiver zip file name will have this ID_ARCHIVER. It is simple but not perfect, as the responsibility is entirely on the developer of the shiny app (update the ID when data in the app changed). This functionality should be optional for developer UX when the END USER responsibility is enough for the assumptions. PS. the downloaded file is the zip file with one JSON file (Reporter representation) and static files for images and tables.
  2. This is for sure to be done. I already added it to the list too. However, for TEAL I recommend the save button on the direct level, as the save functionality should be at hand for the end users (We can think of a notification each 20 minutes to save the Archive). We can have four buttons on the level of each TEAL module, so then we will add only the save button to the three that already exist and change the title. In the general case, the functionality is modular so that somebody can build any setup.
  3. @insightsengineering/nest-core-dev, your naming ideas are wanted here. I appreciate your help. In the past, we name it Storager/Storage; currently, it is Archiver/Archive.

@Polkas
Copy link
Contributor Author

Polkas commented Dec 16, 2022

After some talks with @mhallal1 I want to leave a small comment.
@pawelru and the team decided we want teal.reporter tool to be as general as possible, and this way it was designed.
The priority is to make it working for any shiny app, NOT a teal one.
Because of its generality later can be easily implemented in the teal app.
This generality profits in bigger target group as can be used by the whole shiny community.
More than that code is more clear and easier to maintain, because of this generality.
@lcd2yyz for sure can change this strategy.

@Polkas
Copy link
Contributor Author

Polkas commented Dec 16, 2022

format name ?
do we need anything more than JSON?

@Polkas
Copy link
Contributor Author

Polkas commented Dec 16, 2022

I remove the archiver class after our discussions. We think the YAML implementation where we use yaml file instead of json is not needed. Moreover we see a little probability of implementing Data Base (e.g. remote MySQL) solution with login and password so end users do not have to manage the zip files. @nikolas-burkoff pointed a security issues for such too.

Main difference in the code is only in 2 functions.

Without Archiver:

#' @keywords internal
load_json_archiver <- function(reporter, zip_path, filename) {
  tmp_dir <- tempdir()
  output_dir <- file.path(tmp_dir, sprintf("archiver_load_%s", gsub("[.]", "", format(Sys.time(), "%Y%m%d%H%M%OS4"))))
  dir.create(path = output_dir)
  if (!is.null(zip_path) && grepl("archiver_", filename)) {
    tryCatch(
      expr = zip::unzip(zip_path, exdir = output_dir, junkpaths = TRUE),
 ...
    )
    reporter$from_jsondir(output_dir)
  } else {
    shiny::showNotification("Failed to load the archiver file.", type = "error")
  }
}

#' @keywords internal
archiver_download_handler_engine <- function(reporter, type, file) {
  switch(type,
    JSON = {
      tmp_dir <- tempdir()
      output_dir <- file.path(
        tmp_dir,
        sprintf("archiver_%s", gsub("[.]", "", format(Sys.time(), "%Y%m%d%H%M%OS4")))
      )
      dir.create(path = output_dir)
      archiver_dir <- reporter$to_jsondir(output_dir)
      temp_zip_file <- tempfile(fileext = ".zip")
      tryCatch(
        expr = zip::zipr(temp_zip_file, archiver_dir),
       ...
      )

      tryCatch(
        expr = file.copy(temp_zip_file, file),
        ...
      )
    }
  )
}

With Archiver:

#' @keywords internal
load_json_archiver <- function(reporter, zip_path, filename) {
  archiver <- JSONArchiver$new()
  tmp_dir <- tempdir()
  output_dir <- file.path(tmp_dir, sprintf("archive_%s", gsub("[.]", "", format(Sys.time(), "%Y%m%d%H%M%OS4"))))
  dir.create(output_dir)
  tryCatch(
    expr = zip::unzip(zip_path, exdir = output_dir, junkpaths = TRUE),
...
  )
  reporter_new <- archiver$read(output_dir)
  reporter$from_reporter(reporter_new)
}

#' @keywords internal
archiver_download_handler_engine <- function(reporter, type, file) {
  switch(type,
    JSON = {
      archiver_dir <- JSONArchiver$new()$write(reporter)
      temp_zip_file <- tempfile(fileext = ".zip")
      tryCatch(
        expr = zip::zipr(temp_zip_file, archiver_dir$get_output_dir()),
...
      )

      tryCatch(
        expr = file.copy(temp_zip_file, file),
...
      )
    }
  )
}

@Polkas
Copy link
Contributor Author

Polkas commented Dec 20, 2022

I can recall essential things now.

The from_jsondir and to_jsondir seems to be suited for Archiver (removed in this PR) class, not Reporter. There were two reasons It was decided this way. As the whole process assumes a singleton design for the Reporter, the method on the level of the Reporter makes it very easy to update it. On the other hand, when these two methods were in Archiver, we needed the additional from_reporter method, and the process was less optimal.
Another thing is that I was thinking of the Data Base solution based on a solution like AWS S3, then we could use the from_jsondir and to_jsondir methods for it too. We can send a directory directly to AWS S3 buckets. What class will manage such DB functionality? We need some additional encapsulation, then.

To sum up, the whole problem of perceiving this item is how we think about other solutions for saving the report/previewer. Why the current could be better as the end user could end up with hundreds of files in his local Downloads folder. More than that, the end user is responsible for managing the files himself.

@nikolas-burkoff I think you will be interested.

@gogonzo
Copy link
Contributor

gogonzo commented Apr 10, 2024

Closing along with #81

@gogonzo gogonzo closed this Apr 10, 2024
@insights-engineering-bot insights-engineering-bot deleted the 81_archiver@main branch April 14, 2024 03:58
gogonzo added a commit that referenced this pull request Apr 24, 2024
closes #81 
continuation of
#177
linked to insightsengineering/teal#1120 Please
install this teal branch when testing the code

I created a new PR from the fork as I am no longer part of the
insightengineering group.
My work is done as a collaboration of UCB company with Roche.
insightengineering developers can edit this PR.

I followed a simple design, which was evaluated positively in [the
discussion](#81).

DONE:

- New modules `report_load_srv` and `report_load_ui`; similar direct
update for Previewer.
- REMOVE the `Archiver` Class as we not need it for this simplified
scenario.
- Improve `to_list` and `from_list` `Reporter` methods
- Add `set_id` and `get_id` `Reporter` methods. Optionally add id to a
Report which will be compared when it is rebuilt from a list. To test it
in the teal example app please download a report and then add a new
module or dataset to the app and try to load it back. The report can be
loaded back to teal app only with the same datasets and modules. The id
is added to the downloaded file name if exists.
- Improve `to_list` and `from_list` `ReportCard` methods (linked with
insightsengineering/teal#1120)
- Both already existing vignette apps are updated automatically.
- `warning(cond)` everywhere to be consistent. We should send the
error/warning to the R console when STH fails.
- Add `testServer` tests for report_load_srv/report_load_ui modules.
- UI tested with all 3 bootstrap versions.

Points to consider:

- The JSON format Report representation seems to be enough, so an
Archiver is unnecessary. The DB solution to save/load seems overcomplex
for the project.
- No update will be required to introduce it into teal modules. Simple
reporter is updated automatically and can be customized with a new
teal.reporter option.
- When reloading, the Report is validated by the "report_" file name
prefix later by the slot name "teal Report" and optionally by ID if it
is non-empty.

Example Teal App (play with bootstrap versions, simple reporter modules,
and add new data/module to confirm the report can not be then reloaded):

```r
library(teal.modules.general)
# one of c("3", "4", "5")
options("teal.bs_theme" = bslib::bs_theme(version = "4"))

data <- teal_data()
data <- within(data, {
  library(nestcolor)
  ADSL <- teal.modules.general::rADSL
})
datanames <- c("ADSL")
datanames(data) <- datanames
join_keys(data) <- default_cdisc_join_keys[datanames]

app <- teal::init(
  data = data,
  modules = teal::modules(
    teal.modules.general::tm_a_regression(
      label = "Regression",
      response = teal.transform::data_extract_spec(
        dataname = "ADSL",
        select = teal.transform::select_spec(
          label = "Select variable:",
          choices = "BMRKR1",
          selected = "BMRKR1",
          multiple = FALSE,
          fixed = TRUE
        )
      ),
      regressor = teal.transform::data_extract_spec(
        dataname = "ADSL",
        select = teal.transform::select_spec(
          label = "Select variables:",
          choices = teal.transform::variable_choices(data[["ADSL"]], c("AGE", "SEX", "RACE")),
          selected = "AGE",
          multiple = TRUE,
          fixed = FALSE
        )
      ),
      ggplot2_args = teal.widgets::ggplot2_args(
        labs = list(subtitle = "Plot generated by Regression Module")
      )
    )
  )
)
runApp(app, launch.browser = TRUE)
```

Example general shiny app (play with bootstrap versions, simple reporter
modules):

```r
library(shiny)
library(teal.reporter)
library(ggplot2)
library(rtables)
library(DT)
library(bslib)

ui <- fluidPage(
    # please, specify specific bootstrap version and theme
    theme = bs_theme(version = "4"),
    titlePanel(""),
    tabsetPanel(
        tabPanel(
            "main App",
            tags$br(),
            sidebarLayout(
                sidebarPanel(
                    uiOutput("encoding")
                ),
                mainPanel(
                    tabsetPanel(
                        id = "tabs",
                        tabPanel("Plot", plotOutput("dist_plot")),
                        tabPanel("Table", verbatimTextOutput("table")),
                        tabPanel("Table DataFrame", verbatimTextOutput("table2")),
                        tabPanel("Table DataTable", dataTableOutput("table3"))
                    )
                )
            )
        ),
        ### REPORTER
        tabPanel(
            "Previewer",
            reporter_previewer_ui("prev")
        )
        ###
    )
)
server <- function(input, output, session) {
    output$encoding <- renderUI({
        tagList(
            ### REPORTER
            teal.reporter::simple_reporter_ui("simple_reporter"),
            ###
            if (input$tabs == "Plot") {
                sliderInput(
                    "binwidth",
                    "binwidth",
                    min = 2,
                    max = 10,
                    value = 8
                )
            } else if (input$tabs %in% c("Table", "Table DataFrame", "Table DataTable")) {
                selectInput(
                    "stat",
                    label = "Statistic",
                    choices = c("mean", "median", "sd"),
                    "mean"
                )
            } else {
                NULL
            }
        )
    })
    plot <- reactive({
        req(input$binwidth)
        x <- mtcars$mpg
        ggplot(data = mtcars, aes(x = mpg)) +
            geom_histogram(binwidth = input$binwidth)
    })
    output$dist_plot <- renderPlot(plot())
    
    table <- reactive({
        req(input$stat)
        lyt <- basic_table() %>%
            split_rows_by("Month", label_pos = "visible") %>%
            analyze("Ozone", afun = eval(str2expression(input$stat)))
        build_table(lyt, airquality)
    })
    output$table <- renderPrint(table())
    
    table2 <- reactive({
        req(input$stat)
        data <- aggregate(
            airquality[, c("Ozone"), drop = FALSE], list(Month = airquality$Month), get(input$stat),
            na.rm = TRUE
        )
        colnames(data) <- c("Month", input$stat)
        data
    })
    output$table2 <- renderPrint(print.data.frame(table2()))
    output$table3 <- renderDataTable(table2())
    
    ### REPORTER
    reporter <- Reporter$new()
    card_fun <- function(card = ReportCard$new(), comment) {
        if (input$tabs == "Plot") {
            card$set_name("Plot Module")
            card$append_text("My plot", "header2")
            card$append_plot(plot())
            card$append_rcode(
                paste(
                    c(
                        "x <- mtcars$mpg",
                        "ggplot2::ggplot(data = mtcars, ggplot2::aes(x = mpg)) +",
                        paste0("ggplot2::geom_histogram(binwidth = ", input$binwidth, ")")
                    ),
                    collapse = "\n"
                ),
                echo = TRUE,
                eval = FALSE
            )
        } else if (input$tabs == "Table") {
            card$set_name("Table Module rtables")
            card$append_text("My rtables", "header2")
            card$append_table(table())
            card$append_rcode(
                paste(
                    c(
                        "lyt <- rtables::basic_table() %>%",
                        'rtables::split_rows_by("Month", label_pos = "visible") %>%',
                        paste0('rtables::analyze("Ozone", afun = ', input$stat, ")"),
                        "rtables::build_table(lyt, airquality)"
                    ),
                    collapse = "\n"
                ),
                echo = TRUE,
                eval = FALSE
            )
        } else if (input$tabs %in% c("Table DataFrame", "Table DataTable")) {
            card$set_name("Table Module DF")
            card$append_text("My Table DF", "header2")
            card$append_table(table2())
            # Here r code added as a regular verbatim text
            card$append_text(
                paste0(
                    c(
                        'data <- aggregate(airquality[, c("Ozone"), drop = FALSE], list(Month = airquality$Month), ',
                        input$stat,
                        ", na.rm = TRUE)\n",
                        'colnames(data) <- c("Month", ', paste0('"', input$stat, '"'), ")\n",
                        "data"
                    ),
                    collapse = ""
                ), "verbatim"
            )
        }
        if (!comment == "") {
            card$append_text("Comment", "header3")
            card$append_text(comment)
        }
        card
    }
    teal.reporter::simple_reporter_srv("simple_reporter", reporter = reporter, card_fun = card_fun)
    teal.reporter::reporter_previewer_srv("prev", reporter)
    ###
}

if (interactive()) shinyApp(ui = ui, server = server)
```

---------

Signed-off-by: Maciej Nasinski <[email protected]>
Co-authored-by: Dawid Kałędkowski <[email protected]>
Co-authored-by: Dawid Kałędkowski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create and implement a shiny module for Archiver class
4 participants