DESeq2 preprocessing (#372)
* Get rid of "MainFactor" in DESeq2 analysis

* Update Questionmark helper
PaulJonasJost authored Nov 8, 2024
1 parent 464afd6 commit db2cd09
Showing 4 changed files with 26 additions and 63 deletions.
1 change: 0 additions & 1 deletion program/shinyApp/R/pre_processing/ui.R
@@ -20,7 +20,6 @@ pre_processing_sidebar_panel <- sidebarPanel(
),
selected = "none"
) %>% helper(type = "markdown", content = "PreProcessing_Procedures"),
uiOutput(outputId = "DESeq_formula_main_ui"),
uiOutput(outputId = "DESeq_formula_sub_ui"),
uiOutput(outputId = "batch_effect_ui"),
actionButton(
33 changes: 10 additions & 23 deletions program/shinyApp/R/pre_processing/util.R
@@ -94,39 +94,26 @@ ln_normalisation <- function(data, omic_type, logarithm_procedure){


deseq_processing <- function(
data, omic_type, formula_main, formula_sub, session_token, batch_correct
data, omic_type, formula_sub, session_token, batch_correct
){
# Center and scale the data
# prefilter the data
data <- prefiltering(data, omic_type)
# DESeq2
if(omic_type == "Transcriptomics"){
design_formula <- paste("~", formula_main)
# only do this locally
colData(data)[,formula_main] <- as.factor(
colData(data)[,formula_main]
)
if(length(formula_sub) > 0){
design_formula <- paste(
design_formula, " + ",
paste(formula_sub, collapse = " + ")
)
# turn each factor into a factor
for(i in formula_sub){
colData(data)[,i] <- as.factor(
colData(data)[,i]
)
}
par_tmp[[session_token]][["DESeq_factors"]] <<- c(
formula_main,formula_sub
if(length(formula_sub) <= 0){
stop(
"Please select at least one factor for the DESeq2 analysis.",
class = "InvalidInputError"
)
}
else{
par_tmp[[session_token]][["DESeq_factors"]] <<- c(formula_main)
design_formula <- paste("~", paste(formula_sub, collapse = " + "))
# turn each factor into a factor
for(i in formula_sub){
colData(data)[,i] <- as.factor(colData(data)[,i])
}
par_tmp[[session_token]][["DESeq_factors"]] <<- c(formula_sub)
print(design_formula)
# on purpose local
print(colData(data)[,formula_main])

dds <- DESeq2::DESeqDataSetFromMatrix(
countData = assay(data),
@@ -13,25 +13,17 @@ A design matrix is a mathematical representation that describes how the experime

#### Choosing Factors for the Design Matrix

In DESeq2, you typically need to specify two types of factors:
In DESeq2, you specify the factors (sample annotations) that explain the variation in your data.

1. **Main Factor**: This is the primary variable of interest. For example, if you are interested in the effect of a treatment, the treatment condition would be the main factor.
2. **Other Factors**: These are additional variables that might affect the outcome but are not the primary focus of the study. These could include batch effects, sequencing depth, or other covariates.

#### How to Create the Design Formula

The design formula in DESeq2 is created by combining the main factor with other factors. The formula is typically written in the form:
The design formula in DESeq2 is created by combining the factors. The formula is typically written in the form:
```R
~ main_factor + other_factors
~ factor_1 + factor_2 + etc
```
This formula states that each factor contributes to the model fit independently of
the others (additively, with no interaction terms).

Here’s a step-by-step guide on how the design matrix is formed from the selected factors:

1. **Select Main Factor**: Choose the primary experimental condition you are interested in.
2. **Select Other Factors**: Choose any additional factors that need to be accounted for in the analysis.
3. **Combine Factors**: Combine the main factor and other factors into a formula.

For example, if your main factor is `treatment` and you have two additional factors `batch` and `sequencing_depth`, the design formula would be:
For example, if your factors are `treatment`, `batch`, and `sequencing_depth`, the
design formula would be:

```R
~ treatment + batch + sequencing_depth
@@ -49,9 +41,8 @@ The design matrix allows DESeq2 to model the counts data while considering the s

Let's say you have an RNA-seq experiment with two conditions (Control and Treatment) and two additional factors (Batch and Sequencing Depth). You want to analyze the effect of the treatment while accounting for batch effects and sequencing depth. Here's how you can set it up:

1. **Main Factor**: Treatment
2. **Other Factors**: Batch, Sequencing Depth
3. **Design Formula**: `~ treatment + batch + sequencing_depth`
1. **Factors**: Treatment, Batch, Sequencing Depth
2. **Design Formula**: `~ treatment + batch + sequencing_depth`
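
To make the example concrete, here is a minimal sketch that uses base R's `model.matrix()` to expand the formula into a design matrix. The `sample_info` table and all of its values are invented for illustration and are not part of the app. Sequencing depth is kept as a numeric covariate here, which is an assumption of this sketch.

```R
# Hypothetical sample annotation: 8 samples, two conditions, two batches.
sample_info <- data.frame(
  treatment = factor(rep(c("Control", "Treatment"), each = 4)),
  batch = factor(rep(c("A", "B"), times = 4)),
  sequencing_depth = c(21, 25, 19, 30, 22, 27, 20, 28)  # invented values
)

# Expand the additive formula into the design matrix.
design <- model.matrix(
  ~ treatment + batch + sequencing_depth,
  data = sample_info
)

# One row per sample; the columns are the intercept, one indicator per
# non-reference factor level, and the numeric covariate.
print(colnames(design))
print(dim(design))
```

Each factor contributes one column per non-reference level, so a factor with only a single level, or a factor that is fully confounded with another, leaves the model unable to separate the effects.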

The design matrix will help DESeq2 to:

28 changes: 7 additions & 21 deletions program/shinyApp/server.R
@@ -993,7 +993,7 @@ server <- function(input,output,session){
length(unique(colData(res_tmp[[session$token]]$data_original)[[col]])) < nrow(colData(res_tmp[[session$token]]$data_original))
})]
if (input$PreProcessing_Procedure == "vst_DESeq") {
filtered_column_names <- filtered_column_names[!filtered_column_names %in% c(input$DESeq_formula_main, input$DESeq_formula_sub)]
filtered_column_names <- filtered_column_names[!filtered_column_names %in% c(input$DESeq_formula_sub)]
}
selectInput(
inputId = "BatchEffect_Column",
@@ -1002,33 +1002,19 @@ server <- function(input,output,session){
selected = "NULL"
)
})
output$DESeq_formula_main_ui <- renderUI({
req(data_input_shiny())
req(input$PreProcessing_Procedure == "vst_DESeq")
selectInput(
inputId = "DESeq_formula_main",
label = paste0(
"Choose main factor for desing formula in DESeq pipeline ",
"(App might crash if your factor as only 1 sample per level)"
),
choices = c(colnames(colData(res_tmp[[session$token]]$data))),
multiple = F,
selected = "condition"
) %>% helper(type = "markdown", content = "PreProcessing_DESeqMain")
})
output$DESeq_formula_sub_ui <- renderUI({
req(data_input_shiny())
req(input$PreProcessing_Procedure == "vst_DESeq")
selectInput(
inputId = "DESeq_formula_sub",
label = paste0(
"Choose other factors to account for",
"(App might crash if your factor as only 1 sample per level)"
"Choose factors to account for ",
"(App might crash if your factor has only 1 sample per level)"
),
choices = c(colnames(colData(res_tmp[[session$token]]$data))),
multiple = T,
selected = "condition"
) %>% helper(type = "markdown", content = "PreProcessing_DESeqSub")
) %>% helper(type = "markdown", content = "PreProcessing_DESeq")
})

## Do preprocessing ----
@@ -1070,7 +1056,6 @@ server <- function(input,output,session){
res_tmp[[session$token]]$data <<- deseq_processing(
data = res_tmp[[session$token]]$data,
omic_type = par_tmp[[session$token]]$omic_type,
formula_main = input$DESeq_formula_main,
formula_sub = input$DESeq_formula_sub,
session_token = session$token,
batch_correct = F
@@ -1106,7 +1091,6 @@ server <- function(input,output,session){
res_tmp[[session$token]]$data_batch_corrected <<- deseq_processing(
data = tmp_data_selected,
omic_type = par_tmp[[session$token]]$omic_type,
formula_main = input$DESeq_formula_main,
formula_sub = c(input$DESeq_formula_sub, input$BatchEffect_Column),
session_token = session$token,
batch_correct = T
@@ -1218,7 +1202,9 @@ server <- function(input,output,session){
message = paste0(
"**PreProcessing** - Preprocessing procedure -specific (user-chosen): ",
ifelse(input$PreProcessing_Procedure == "vst_DESeq",
paste0(input$PreProcessing_Procedure, "~",input$DESeq_formula_main),
paste0(
input$PreProcessing_Procedure,
" ~ ",paste(input$DESeq_formula_sub, collapse=" + ")),
input$PreProcessing_Procedure)
)
)
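
The core of this change, building the design formula from a single vector of factors and rejecting an empty selection, can be sketched in isolation as follows. This is a minimal illustration and not the code from the commit: `build_design_formula` is a hypothetical helper, and it uses base R's `errorCondition()` (available from R 3.6.0) and `stats::reformulate()` instead of the string pasting shown in the diff.

```R
# Hypothetical helper, not part of the commit: builds a one-sided design
# formula from a character vector of factor names.
build_design_formula <- function(factors) {
  if (length(factors) == 0) {
    # errorCondition() attaches a custom class to the error object,
    # so callers can catch this failure specifically.
    stop(errorCondition(
      "Please select at least one factor for the DESeq2 analysis.",
      class = "InvalidInputError"
    ))
  }
  # reformulate() returns a formula object rather than a string.
  stats::reformulate(factors)
}

design_formula <- build_design_formula(c("condition", "batch"))
print(design_formula)

# An empty selection raises the classed error, handled here by class name.
result <- tryCatch(
  build_design_formula(character(0)),
  InvalidInputError = function(e) conditionMessage(e)
)
print(result)
```

Returning a formula object avoids a later string-to-formula conversion, and the classed condition lets the calling Shiny code show a targeted message for this one case.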