Split Machinery and Tabulation - Dev Guide (#614)
* init

* init 2

* first example

* more code

* more

* modifications after discussions

* corrections from Gabe and rephrasing

* into inner

* temporary update

* going further

* some fixes

* update

* small fix

* recent update

* few mods

* most recent update

* finishing up

* comment updates

* more details

* constructing separate vignette for debugging

* adding to docs

* init

* init 2

* up

* update

* few things

* update

* up

* completed dev guide for tabulation

* update

* Move files to inst/dev-guide

* Update output format, render location

* kind of completing splits

* fix debugging

* Remove from pkgdown config - fix checks

* Split machinery - grammar fixes, adding links, applying styler

* Change default spaces in rtables project

* Tabulation - grammar fixes, rewording, adding links, applying styler

* Change word

* Debugging - grammar fixes

---------

Signed-off-by: Davide Garolini <[email protected]>
Co-authored-by: Emily de la Rua <[email protected]>
Melkiades and edelarua authored Nov 1, 2023
1 parent afc1ddf commit 90da366
Showing 4 changed files with 1,281 additions and 1 deletion.
91 changes: 91 additions & 0 deletions inst/dev-guide/dg_debug_rtables.Rmd
@@ -0,0 +1,91 @@
---
title: "Debugging in `rtables` and Beyond"
author: "Davide Garolini"
date: '`r Sys.Date()`'
output:
  html_document:
    theme: spacelab
editor_options:
  chunk_output_type: console
knit: (function(inputFile, encoding) {
  rmarkdown::render(inputFile, encoding = encoding, output_dir = ".")})
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Debugging

This is a short, non-comprehensive guide to debugging `rtables`. Most of the advice is general and also applies to R code beyond `rtables`; use it at your discretion.

#### Coding in Practice

* Good code is easy to read, which makes problems easy to find
* Code is not clever if it is impossible to debug

#### Some Definitions

* __Coding Error__ - The code does not do what you intended -> a bug in the punch card
* __Unexpected Input__ - Handle it with defensive programming: FAIL FAST, FAIL LOUD (FFFL) -> useful and not too time-consuming
* __Bug in Dependency__ -> avoid dependencies whenever you can!

#### Considerations About FFFL

Errors should be raised as close as possible to their source. For example, bad inputs should be detected very early. The worst case is software that silently gives incorrect results. Common problems that can be caught early are missing values and columns of unexpected length (`length == 0` or `length > 1`).
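As a minimal sketch of the fail-fast idea, the helper below validates a column before any real work is done. The function `check_column()` and the example data are hypothetical and are not part of `rtables`.

```r
# Hypothetical helper (not an rtables function): fail fast and loud on bad input.
check_column <- function(df, col) {
  if (!is.character(col) || length(col) != 1L) {
    stop("`col` must be a single column name, got length ", length(col))
  }
  if (!col %in% names(df)) {
    stop("column `", col, "` not found in data")
  }
  if (length(df[[col]]) == 0L) {
    stop("column `", col, "` has length 0")
  }
  if (anyNA(df[[col]])) {
    stop("column `", col, "` contains missing values")
  }
  invisible(TRUE)
}

# Illustrative data with a missing value: the check stops immediately
# with an informative message instead of letting NA propagate silently.
df <- data.frame(age = c(31, NA, 47))
res <- tryCatch(check_column(df, "age"), error = function(e) conditionMessage(e))
res
```

The error message names the offending column, so the failure points directly at its source.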

#### General Suggestions

* A robust code base does not attempt possibly problematic operations.
* Read the error messages.
* `debugcall()`: you can pass the full call, so the signature (formals) determines what is debugged.
* `trace()` is powerful because you can specify the reaction.
* `tracer` is very good and precise for finding where a problem happens.

`options(error = recover)` is one of the best debugging tools. It is a core tool during development, as it lets you step into any point of the function call sequence after an error.

`dump.frames()` and `debugger()`: `dump.frames()` saves the call frames to a file or an object, and you then call `debugger()` on that dump to step into it, as you would with `recover`.
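The following sketch shows one way to capture the frames of a failing call for later inspection. The functions `inner()` and `outer()` and the object name `my_dump` are made up for illustration, and calling `dump.frames()` from a calling handler is just one possible setup (setting it through `options(error = ...)` is another).

```r
# Hypothetical functions used only to produce an error a few frames deep.
inner <- function(x) stop("something went wrong in inner()")
outer <- function(x) inner(x)

# Save the call frames at the moment the error is signalled.
# dump.frames() stores them in the workspace under the given name.
res <- try(
  withCallingHandlers(
    outer(1),
    error = function(e) dump.frames("my_dump")
  ),
  silent = TRUE
)

class(my_dump)

# In an interactive session, pick a frame and browse it post mortem:
if (interactive()) debugger(my_dump)

# Alternatively, during development, enter the frame browser on any error:
# options(error = recover)
```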

#### `warn` Global Option

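- `<0`: warnings are ignored
- `0`: warnings are stored and printed after the top-level function call returns (the default)
- `1`: warnings are printed immediately as they occur
- `>=2`: warnings are turned into errors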
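As a small illustration of the last setting, the sketch below temporarily sets `warn = 2` so that a warning stops execution where it occurs. The function `risky()` is a hypothetical stand-in for any code that warns.

```r
# Hypothetical function: as.integer() warns when coercion introduces NA.
risky <- function(x) as.integer(x)

old <- options(warn = 2) # promote warnings to errors
msg <- tryCatch(
  risky("not a number"),
  error = function(e) conditionMessage(e)
)
options(old) # always restore the previous setting

msg
```

Combined with `options(error = recover)`, this is a convenient way to land in the frame that produced a warning.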

While inside `recover` or `debugger`, use `<<-` to assign an object to the global environment, so that it is still available after you leave the debugging session.

#### Lo-Fi Debugging

* `print()` / `cat()` is a low-level debugging technique that is always available. It is helpful for server jobs where only terminal or console output may be available and `browser()` cannot be used. For example, you can print the position or state of a function at certain points until you find the breaking point.
* Commenting out blocks -> does not work well with pipes (you can use `identity()`, a step that does nothing but does not break the pipe)
* `browser()` bombing: scatter `browser()` calls through the code
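The first two points can be sketched as follows. The example uses the base pipe `|>`, which assumes R >= 4.1; the data and the steps are invented for illustration.

```r
x <- c(4, 9, 16)

# print/cat debugging: report the state at a known position.
cat("before pipeline, length(x) =", length(x), "\n")

# Ending the chain with identity() means any step, including the last
# real one, can be commented out without leaving a dangling pipe.
res <- x |>
  sqrt() |>
  # rev() |>   # step temporarily disabled while debugging
  identity()

cat("after pipeline, res =", res, "\n")
```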

#### Regression Tests

Almost every bug should become a regression test.
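A minimal sketch of what this can look like, using only base R. The function `pct()` and the bug it once had are hypothetical; in a package, the same check would normally live in the test suite.

```r
# Hypothetical function that once returned NaN when `total` was 0.
pct <- function(n, total) {
  if (total == 0) {
    return(0)
  }
  100 * n / total
}

# Regression test: pins down the fixed behavior so the bug cannot silently return.
stopifnot(identical(pct(0, 0), 0))
# Ordinary behavior is kept as well.
stopifnot(identical(pct(1, 4), 25))
```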

#### Debugging with Pipes

* Pipes make code nicer to write but horrible to debug
* The tee pipe `%T>%` lets you print an intermediate value midway through the chain
* `debug_pipe()` -> it is like the tee pipe, but it drops you into `browser()`
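A short sketch of both tools, assuming the `magrittr` package is installed (the example is skipped otherwise). The input vector is invented for illustration.

```r
if (requireNamespace("magrittr", quietly = TRUE)) {
  library(magrittr)

  # %T>% runs print() for its side effect only and passes the
  # original left-hand side on to the next step.
  res <- c(1, 4, 9) %T>%
    print() %>%
    sqrt()

  # Interactive only: debug_pipe() opens browser() with the intermediate
  # value, then passes that value on unchanged.
  if (interactive()) {
    c(1, 4, 9) %>%
      debug_pipe() %>%
      sqrt()
  }
}
```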

#### Shiny Debugging

More difficult due to reactivity.

#### General Suggestion

**Do not be clever with code unless you have to.** Cleverness is also subjective, and what counts as clever will change with time.

## Debugging in `rtables`

We invite the smart developer to use the provided examples as a way to get an "interactive" and dynamic view of the internal algorithms as they are routinely executed when constructing tables with `rtables`. This is achieved by using `browser()` and `debugonce()` on internal and exported functions (`rtables:::` or `rtables::`), as we will see in a moment. We invite you to continuously and autonomously explore the multiple `S3` and `S4` objects that constitute the complexity and power of `rtables`. To do so, we will use the following functions:

* `methods(generic_function)`: This function lists the methods that are available for a generic function. Specifically for `S4` generic functions, `showMethods(generic_function)` gives more detailed information about each method (e.g. inheritance).
* `class(object)`: This function returns the class of an object. If the class is not one of the built-in classes in R, you can use this information to search for its documentation and examples. `help(class)` may be informative as it will call the documentation of the specific class. Similarly, the `?` operator will bring up the documentation page for different `S4` methods. For `S3` methods it is necessary to append the class name to the generic name with a dot (e.g. `?summary.lm`).
* `getClass(class)`: This describes the type of class in a compact way, the slots that it has, and the relationships that it may have with the other classes that may inherit from or be inherited by it. With `getClass(object)` we can see to which values the slots of the object are assigned. It is possible to use `str(object, max.level = 2)` to see less formal and more compact descriptions of the slots, but the output may become unwieldy when the slots themselves contain other objects. Hence, the maximum number of levels should always be limited to 2 or 3 (`max.level = 2`). Similarly, `attributes()` can be used to retrieve some information, but we need to remember that storing important variables in this way is not encouraged. Information regarding the type of class can be retrieved with `mode()` and indirectly by `summary()` and `isS4()`.
* `getAnywhere(function)` is very useful to get the source code of internal functions and specific generics. It works very well with `S3` methods, and will display the relevant namespace for each of the methods found. Similarly, `getMethod(S4_generic, S4_class)` can retrieve the source code of class-specific `S4` methods.
* `eval(debugcall(generic_function(obj)))`: this is a very useful way to browse an `S4` method, specifically for a defined object, without having to manually insert `browser()` into the code. Something similar is possible with R > 3.4.0, where `debug*()` calls can have the triggering signature (class) specified. Both of these are modern and simplified wrappers of the tracing function `trace()`.
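The functions listed above can be tried on a toy example before applying them to `rtables` objects. The class `ToyTable` and the generic `describe()` below are hypothetical and exist only for this sketch; they are not part of `rtables`.

```r
library(methods)

# Hypothetical toy S4 class, generic, and method, for illustration only.
setClass("ToyTable", representation(nrow = "integer", title = "character"))
setGeneric("describe", function(obj, ...) standardGeneric("describe"))
setMethod("describe", "ToyTable", function(obj, ...) {
  paste0(obj@title, " (", obj@nrow, " rows)")
})

tt <- new("ToyTable", nrow = 3L, title = "Demo")

class(tt) # class of the object
isS4(tt) # is it an S4 instance?
getClass("ToyTable") # slots and inheritance of the class
str(tt, max.level = 2) # compact view of the slot values
showMethods("describe") # methods available for the generic
getMethod("describe", "ToyTable") # source code of the class-specific method

# Interactive only: browse the method that is dispatched for this object.
if (interactive()) eval(debugcall(describe(tt)))
```

Replacing `describe(tt)` with a call to an `rtables` generic on a real table object follows the same pattern.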