Commit

Finished basic vignette :)
jonathan-columbiau committed Feb 3, 2024
1 parent 5cfab42 commit 75c11e9
Showing 3 changed files with 46 additions and 6 deletions.
4 changes: 2 additions & 2 deletions R/Classify.R
@@ -41,7 +41,7 @@ Classify <- function(bpcells_query, models, tree_struc, prop_max_threshold = .66
first_lev_avg_counts <- models[[node]][[i]]$avg_log_exp # scale
first_lev_std_counts <- models[[node]][[i]]$stdev
first_lev_markers <- models[[node]][[i]]$pc_loadings %>% colnames()
first_lev_bpcells <- query_cells[first_lev_markers, cells] %>% t() # select markers and cells of this internal node level
first_lev_bpcells <- query_cells[first_lev_markers, cells] %>% BPCells::t() # select markers and cells of this internal node level
first_lev_bpcells <- first_lev_bpcells %>%
BPCells::add_cols(-first_lev_avg_counts) %>%
BPCells::multiply_cols(1 / first_lev_std_counts)
@@ -68,7 +68,7 @@ Classify <- function(bpcells_query, models, tree_struc, prop_max_threshold = .66
dplyr::group_by(obs) %>%
dplyr::filter(count == max(count))
#filter obs with mult max classes
tied_obs <- obs_above_threshold %>% dplyr::group_by(obs) %>% dplyr::summarise(n = n()) %>% dplyr::filter(n > 1) %>% dplyr::pull(obs)
tied_obs <- obs_above_threshold %>% dplyr::group_by(obs) %>% dplyr::summarise(n = dplyr::n()) %>% dplyr::filter(n > 1) %>% dplyr::pull(obs)
obs_above_threshold <- obs_above_threshold %>%
dplyr::filter(!obs %in% tied_obs) %>%
dplyr::filter(count >= count_threshold) %>%
2 changes: 1 addition & 1 deletion man/CreateEqualTree.Rd

46 changes: 43 additions & 3 deletions vignettes/Basic-Tutorial.Rmd
@@ -22,11 +22,12 @@ using it just because it has accessible datasets with classifications
(stored in the Seurat metadata field!). Let us know if you have any
issues accessing the data needed for this tutorial.

```{r setup}
```{r}
library(lionmap)
library(SeuratData)
library(Seurat)
library(BPCells)
library(magrittr)
library(caret)
InstallData("pbmc3k")
UpdateSeuratObject(pbmc3k) #just Seurat updating things to get it to work
@@ -127,7 +128,7 @@ test_ge_bpcells = ge_bpcells[,test_ids]
#and of the GE matrix match, so a subset to one would give data corresponding to
#the same cells for the subset to the other
train_metadata = metadata[train_ids,]
test_ge_bpcells = metadata[test_ids,]
test_ge_metadata = metadata[test_ids,]
```
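
The comment above relies on the metadata rownames matching the expression matrix colnames. As a minimal sketch of that assumption, the toy objects below (all names and values are made up, and a plain base R matrix stands in for the BPCells matrix) check the alignment before and after subsetting to the same ids.

```r
# Toy expression matrix: genes in rows, cells in columns (hypothetical data).
toy_ge <- matrix(
  1:12, nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Toy metadata: one row per cell, rownames equal to the matrix colnames.
toy_metadata <- data.frame(
  celltype = c("B", "T", "B", "T"),
  row.names = colnames(toy_ge)
)

# The subsetting trick is only valid if the two orderings agree.
stopifnot(identical(rownames(toy_metadata), colnames(toy_ge)))

# Subset both objects with the same ids; drop = FALSE keeps the
# matrix/data.frame structure even when a single row or column is left.
toy_test_ids <- c("cell2", "cell4")
toy_test_ge <- toy_ge[, toy_test_ids, drop = FALSE]
toy_test_metadata <- toy_metadata[toy_test_ids, , drop = FALSE]

# After subsetting, cells still line up one-to-one.
stopifnot(identical(rownames(toy_test_metadata), colnames(toy_test_ge)))
```

If the first check fails on real data, reordering the metadata by the matrix colnames before splitting would be one way to restore the correspondence.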

@@ -266,6 +267,45 @@ that didn't mean anything to you, don't worry, it's just something about
properly formatting the data). We'll use all these models in the next
step, to classify cells to specific celltypes.

To classify cells to previously defined celltypes, we'll use the
Classify function and will need to enter the following:

1. Our dataset with raw (not normalized/processed) GE values of cells
we want to classify (**bpcells_query**)

2. The list of models produced by GetModels (**models**)

3. The tree we created (**tree_struc**)

We can also optionally set the parameter prop_max_threshold, which is
the confidence threshold a cell must meet in order to be classified.

The output is a vector giving the celltype classifications for each cell
in the query dataset.
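
To give a rough feel for what a threshold like this does, here is a toy sketch in base R. It is not the package's implementation: the vote table, the `max_votes` value, and the way `count_threshold` is derived from `prop_max_threshold` are all assumptions made for illustration. Only the general pattern (keep the top-voted class, drop ties, require a minimum count) mirrors the filtering visible in R/Classify.R.

```r
# Toy vote table: one row per (cell, candidate class). Values are made up.
votes <- data.frame(
  obs   = c("cell1", "cell1", "cell2", "cell2", "cell3", "cell3"),
  class = c("B", "T", "B", "T", "B", "T"),
  count = c(9, 1, 5, 5, 6, 4)
)

max_votes <- 10             # hypothetical maximum number of votes per class
prop_max_threshold <- 0.66  # value taken from the (truncated) function signature
count_threshold <- prop_max_threshold * max_votes  # assumed relationship

# Keep the top-voted class(es) for each cell.
is_top <- votes$count == ave(votes$count, votes$obs, FUN = max)
top <- votes[is_top, ]

# Cells with more than one top class are ties.
n_top <- table(top$obs)
tied_obs <- names(n_top)[n_top > 1]

# Accept a cell only if it is untied and its top count clears the threshold.
accepted <- top[!(top$obs %in% tied_obs) & top$count >= count_threshold, ]
print(accepted)
```

In this toy data, cell2 is tied and cell3 falls short of the threshold, so only cell1 is accepted. Raising the threshold trades fewer classified cells for more confident calls.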

```{r}
query_classifications = Classify(bpcells_query = test_)
query_classifications = Classify(bpcells_query = test_ge_bpcells,models = models,tree_struc = equal_tree)
```

Awesome! Now we have a vector of classifications, formatted so the name
of each element is the cell label and the value is the classification of
that cell. Let's view the first few elements.

```{r}
head(query_classifications)
```

We can view the number of classifications to each type using the table
function.

```{r}
table(query_classifications)
```

We'll see that one cell has the classification "Rootnode". This
indicates that there wasn't enough evidence to properly classify the
cell. With a custom hierarchy, classification occurs at each level
of the hierarchy and stops at a given level of the tree if there isn't
enough evidence to continue.
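
As a minimal sketch of working with that output, the toy vector below mimics the format described earlier (names are cell labels, values are classifications). The cell labels and celltype names are made up for illustration; only the "Rootnode" label comes from the tutorial.

```r
# Toy stand-in for the output of Classify(): a named character vector.
toy_classifications <- c(
  cell_a = "B",
  cell_b = "Rootnode",
  cell_c = "CD8 T",
  cell_d = "B"
)

# Cells left at the root of the tree, i.e. not assigned to any celltype.
unclassified_cells <- names(toy_classifications)[toy_classifications == "Rootnode"]

# Fraction of cells that received a classification below the root.
prop_classified <- mean(toy_classifications != "Rootnode")

print(unclassified_cells)
print(prop_classified)
```

The same two lines could be applied to `query_classifications` to list the cells that were left unclassified.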
