-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Short term change #94
base: devel
Are you sure you want to change the base?
Conversation
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
In holiday currently. I will check this early next week. |
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is probably also not the most urgent ones
The calcultion part / core functionality is rather simple. The function should:
- Get data with mia::meltSE (transformations sould be applied in prior to this function)
- Calculate short term change based on
group
andtime.co
l - Return the data.frame
After this, user could plot the results if wanted
Good to finalize this as proposed. |
Example shows the following:
Is this mean to be an analysis or visualization function? The example could explain a bit more what this aims to show. I am not sure if I understand the motivation of this function from the manpage description. It would be good to have a separate function, like getShortTermChange / addShortTermChange and then a separate function for visualization (unless it is easy to do with existing functions, like miaViz::plotSeries). Is this linked to some miaTime issue? Issue #7 ? |
Signed-off-by: Daena Rys <[email protected]>
original reference had plotting as part of the function. It is now removed and the function returns a dataframe with short term changes
yes, syncomR consumed phyloseq object. |
Signed-off-by: Daena Rys <[email protected]>
Moreover given the nature of the data frame results, I am not sure plotSeries is the right visualization method for it. Suppose we somehow and of course we can also discuss how we want to represent our own plots. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice
There is no that kind of restriction that miaViz cannot have functions that take df as input. In fact, loadings plotting function can take df as input, This outputs currently data.frame and I think it makes sense. reducedDims is not correct because this does not reduce dimensionality. This could be stored to metadata and then we could have a plotting function for TreeSE and df. |
Or can it be part of colData? Could be simpler. This is a measure that is available on a per-sample basis. Except that some samples may have NAs if they are missing the comparison. |
That could be possible but then columns would denote features. It might be confusing and there might be lots of features to add in the columns |
Yep, not so optimal. |
"This function essentially calculates the change in abundance over time for a microbe by abundancet2/ abundancet1." We already have getStepwiseDivergence?
|
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
Signed-off-by: Daena Rys <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, some suggestions
# Reshape data and calculate growth metrics | ||
assay_data <- meltSE(x, assay.type = assay.type, | ||
add.col = time.col, row.name = "Feature_ID") | ||
assay_data <- assay_data %>% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use |> operators. They are used in elsewhere
#' @export | ||
setMethod("addShortTermChange", signature = c(x = "SummarizedExperiment"), | ||
function(x, assay.type = "counts", name = "short_term_change", ...){ | ||
# Calculate short term change | ||
res <- getShortTermChange(x, ...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check BiocHCeck::BiocHeck() at least indentation is off
#' \url{https://github.com/nwisnoski/ul-seedbank}. Their approach is based on | ||
#' the following calculation log(present abundance/past abundance). | ||
#' Also a compositional version using relative abundance similar to | ||
#' Brian WJi, Sheth R et al | ||
#' \url{https://www.nature.com/articles/s41564-020-0685-1} can be used. | ||
#' This approach is useful for identifying short term growth behaviors of taxa. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And also from above
#' | ||
#' short_time_labels <- c("74.5h", "173h", "438h", "434h", "390h") | ||
#' | ||
#' # Subset samples by Time_label and StudyIdentifier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this comment, say what it is the goal: get specified time points from certain bioreactor
#' \code{\link[SummarizedExperiment:SummarizedExperiment-class]{SummarizedExperiment}} | ||
#' object with these results in its \code{metadata}. | ||
#' | ||
#' @details This approach is used by Wisnoski NI and colleagues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could use math notation, it would be easier to read. Also the calculated values include other measurements than this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could describe them also
# Should still return a dataframe | ||
short_time_labels <- c("74.5h", "173h", "438h", "434h", "390h") | ||
# Subset samples by Time_label and StudyIdentifier | ||
tse_filtered <- tse[, !(tse$Time_label %in% short_time_labels)] | ||
tse_filtered <- tse_filtered[, (tse_filtered$StudyIdentifier == "Bioreactor A")] | ||
|
||
expect_true(all(!(tse_filtered$Time_label %in% short_time_labels))) | ||
|
||
result <- getShortTermChange(tse_filtered, time.col = "Time.hr") | ||
# Expected output is a dataframe | ||
expect_true(is.data.frame(result)) | ||
expect_true("growth_diff" %in% colnames(result)) | ||
# Test some expected properties (e.g., that growth_diff isn't all NAs) | ||
expect_false(all(is.na(result$growth_diff))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The values should be also checked
setMethod("getShortTermChange", signature = c(x = "SummarizedExperiment"), | ||
function(x, assay.type = "counts", ...){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the function should also take into account the grouping, i.e., from which patient/bioreactor the data is coming
arrange( !!sym(time.col) ) %>% | ||
group_by(Feature_ID) %>% | ||
mutate( | ||
time_lag = !!sym(time.col) - lag( !!sym(time.col) ), | ||
growth_diff =!!sym(assay.type) - lag(!!sym(assay.type)), | ||
growth_rate = (!!sym(assay.type) - lag(!!sym(assay.type))) / lag(!!sym(assay.type)), | ||
var_abund = (!!sym(assay.type) - lag(!!sym(assay.type))) / time_lag | ||
) | ||
return(assay_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should take into account duplicated values. I tink you could give warning first and then try something like this
df <- meltSE(tse, add.col = TRUE)
df |>
group_by(StudyIdentifier, FeatureID, Time.hr) |>
mutate(mean_abundance_for_timepoint = mean(counts, na.rm = TRUE)) |>
distinct(StudyIdentifier, FeatureID, Time.hr, .keep_all = TRUE) |>
ungroup() |>
arrange(StudyIdentifier, FeatureID, Time.hr) |>
group_by(StudyIdentifier, FeatureID) |>
mutate(growth_diff = mean_abundance_for_timepoint - lag(mean_abundance_for_timepoint)) |>
ungroup()
grs.all$Feature_IDabb <- toupper(abbreviate(grs.all$Feature_ID, | ||
minlength = 3, | ||
method = "both.sides")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These feature names come directly from rownames()? Then I think they should be kept unchanged
########################### Growth Metrics ############################ | ||
grwt <- .calculate_growth_metrics(x, assay.type = assay.type, ...) | ||
# Clean and format growth metrics | ||
grs.all <- .clean_growth_metrics(grwt, ...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if these are needed, as these are quite basic metric that can be easily calculated
Function to return short term change in a dataframe