Access medians per conditions #6

Open
JudithBernett opened this issue Aug 10, 2022 · 1 comment
Labels
good first issue Good for newcomers

Comments

@JudithBernett

If you want to access the exact median expressions per condition (or other statistics), we have prepared a script for you. First, download your sce object using the button on the upper right. Then adjust the two paths and, if needed, the names of the condition and sample ID columns in the colData of your sce object:

path_to_sce <- "~/Downloads/sce.rds"
path_to_output <- "~/Downloads/medians_per_condition.csv"
condition <- "condition"
sample_id <- "sample_id"

Install and load packages

install.packages("data.table")
install.packages("ggplot2")
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

BiocManager::install("SingleCellExperiment")

library(SingleCellExperiment)
library(data.table)
library(ggplot2)

Read in your SCE object

sce <- readRDS(path_to_sce)

Extract the expression data and add metadata information

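# one row per cell, one column per marker
exprs_dt <- data.table(t(assays(sce)$exprs))
# add the metadata columns; set() keeps this safe to re-run even when
# the variables and the columns they name are spelled the same
set(exprs_dt, j = condition, value = colData(sce)[[condition]])
set(exprs_dt, j = sample_id, value = colData(sce)[[sample_id]])
# reformat to long format (one row per cell and marker) for easier calculation
exprs_dt <- melt(exprs_dt, id.vars = c(condition, sample_id), variable.name = "marker", value.name = "exprs")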

Calculate medians per condition

median_dt <- exprs_dt[, .(median = median(exprs)), by = c(condition, "marker")]
print(median_dt)
# export
fwrite(median_dt, path_to_output)
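
The same grouping pattern should extend to the "other stats" mentioned at the top. Below is a minimal sketch on a made-up toy table that stands in for the long-format exprs_dt; the marker names, values, and the choice of statistics are assumptions for illustration only.

```r
library(data.table)

# Toy stand-in for the long-format exprs_dt (values are invented)
toy_dt <- data.table(
  condition = rep(c("ctrl", "stim"), each = 4),
  marker    = rep(c("CD3", "CD4"), times = 4),
  exprs     = c(1, 2, 3, 4, 5, 6, 7, 8)
)
condition <- "condition"  # name of the condition column, as in the script

# Several summary statistics per condition and marker in one pass
stats_dt <- toy_dt[, .(
  median = median(exprs),
  mean   = mean(exprs),
  q25    = quantile(exprs, 0.25),
  q75    = quantile(exprs, 0.75),
  n      = .N
), by = c(condition, "marker")]
print(stats_dt)
```

With the real data, replacing toy_dt by exprs_dt should give one row per condition and marker, which can be exported with fwrite in the same way as the medians.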

Sanity check: reproduce boxplot from CYANUS

# calculate medians per condition per sample id
median_per_sample_dt <- exprs_dt[, .(median = median(exprs)), by = c(condition, sample_id, "marker")]

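# !!as.name(condition) inserts the column named by `condition` into aes();
# get(condition) can fail when the variable and the column share a name,
# as they do with the default settings above
ggplot()+
  geom_boxplot(data = median_per_sample_dt, aes(x = !!as.name(condition), y = median, color = !!as.name(condition)))+
  geom_point(data = median_dt, aes(x = !!as.name(condition), y = median))+
  facet_wrap(~marker, scales = "free_y")+
  theme_bw()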

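Attention: the median per marker per condition, computed over all cells, is generally not the same as the median of the per-sample medians for that marker and condition, because samples with more cells carry more weight in the former. This is why the black points do not match the boxplot median lines exactly.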
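
A small worked example may make this concrete. The numbers below are invented: two samples from the same condition with very different cell counts.

```r
# Toy expression values for one marker in one condition (made-up numbers)
sample_a <- c(1, 2, 3)                       # 3 cells
sample_b <- c(10, 10, 10, 10, 10, 10, 10)    # 7 cells

# Median over all cells of the condition (what median_dt contains)
pooled_median <- median(c(sample_a, sample_b))

# Median of the per-sample medians (the line inside the boxplot)
median_of_medians <- median(c(median(sample_a), median(sample_b)))

print(pooled_median)      # 10
print(median_of_medians)  # 6
```

The larger sample dominates the pooled median, while each sample counts once in the median of medians.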

@quirinmanz quirinmanz added the good first issue Good for newcomers label Aug 10, 2022