Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deGate() does not return a gate between the peaks found by getPeaks() when using twin.factor #8

Open
saheedb opened this issue Dec 27, 2022 · 1 comment

Comments

@saheedb
Copy link

saheedb commented Dec 27, 2022

I'm wondering if this is expected behavior. Example:

library(flowDensity)

set.seed(123)
dat <- matrix(c(rnorm(3000, mean = 5, sd = 0.5), 
                rnorm(3000, mean = 7, sd = 0.25),
                rnorm(1000, mean = 8, sd = 0.25)))
colnames(dat) = "d"
ff <- flowFrame(dat)

peaks <- getPeaks(ff, "d", twin.factor = 0.5)
gate <- deGate(ff, "d", twin.factor = 0.5)

peaks

The first peak is ignored as expected

$Peaks
[1] 7.013470 7.961586

$P.ind
[1] 325 400

$P.h
[1] 0.5534271 0.1867415

But the gate from deGate() is not between the two peaks from getPeaks() as if twin.factor did not have an effect

gate

[1] 6.153844

Plot

library(ggplot2)
library(ggcyto)

autoplot(ff, "d") + 
  geom_vline(xintercept = peaks$Peaks, color = "red") +
  geom_vline(xintercept = gate, color = "blue")

degate_getpeaks

Is this expected behavior? Thanks.

sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.7 (Maipo)

Matrix products: default
BLAS:   /data/apps/R/4.0.3/lib64/R/lib/libRblas.so
LAPACK: /data/apps/R/4.0.3/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggcyto_1.18.0            flowWorkspace_4.5.3      ncdfFlow_2.36.0         
[4] BH_1.75.0-0              RcppArmadillo_0.10.1.2.0 ggplot2_3.3.3           
[7] flowDensity_1.24.0       flowCore_2.5.0          

loaded via a namespace (and not attached):
 [1] bitops_1.0-6            matrixStats_0.61.0-9001 RColorBrewer_1.1-2     
 [4] httr_1.4.2              Rgraphviz_2.34.0        Rwave_2.4-8            
 [7] RFOC_3.4-6              tools_4.0.3             R6_2.5.0               
[10] KernSmooth_2.23-17      rgeos_0.5-5             BiocGenerics_0.36.1    
[13] colorspace_2.0-0        withr_2.3.0             sp_1.4-4               
[16] gridExtra_2.3           tidyselect_1.1.0        splancs_2.01-40        
[19] curl_4.3.2              compiler_4.0.3          RPMG_2.2-3             
[22] graph_1.68.0            cli_2.2.0               Biobase_2.50.0         
[25] xml2_1.3.2              labeling_0.4.2          caTools_1.18.0         
[28] scales_1.1.1            hexbin_1.28.1           digest_0.6.28          
[31] foreign_0.8-80          rio_0.5.16              base64enc_0.1-3        
[34] jpeg_0.1-8.1            pkgconfig_2.0.3         maps_3.3.0             
[37] rlang_0.4.10            RSEIS_3.9-3             readxl_1.3.1           
[40] rstudioapi_0.13         farver_2.0.3            generics_0.1.0         
[43] gtools_3.8.2            dplyr_1.0.2             zip_2.1.1              
[46] car_3.0-10              magrittr_2.0.1          dotCall64_1.0-0        
[49] RProtoBufLib_2.5.1      Rcpp_1.0.5              munsell_0.5.0          
[52] S4Vectors_0.28.1        fansi_0.4.1             abind_1.4-5            
[55] lifecycle_0.2.0         GEOmap_2.4-4            stringi_1.5.3          
[58] carData_3.0-4           MASS_7.3-53             zlibbioc_1.36.0        
[61] plyr_1.8.6              gplots_3.1.1            grid_4.0.3             
[64] parallel_4.0.3          forcats_0.5.0           crayon_1.3.4           
[67] lattice_0.20-41         haven_2.3.1             hms_0.5.3              
[70] pillar_1.4.7            stats4_4.0.3            XML_3.99-0.5           
[73] glue_1.4.2              latticeExtra_0.6-29     data.table_1.13.6      
[76] MBA_0.0-9               RcppParallel_5.0.2      png_0.1-7              
[79] vctrs_0.3.6             spam_2.6-0              cellranger_1.1.0       
[82] gtable_0.3.0            aws.s3_0.3.21           purrr_0.3.4            
[85] assertthat_0.2.1        openxlsx_4.2.3          IDPmisc_1.1.20         
[88] tibble_3.0.4            cytolib_2.5.3           aws.signature_0.6.0    
[91] flowViz_1.54.0          fields_11.6             ellipsis_0.3.1         
@jmeskas
Copy link

jmeskas commented Sep 1, 2023

Hi @saheedb ,

I know it has been a while and maybe you already figured out your issue, but I saw post and figured I could help.

The twin.factor parameter is designed to disallow a valley result being returned if the valley's y value is very close to the y value of the peaks. This usually occurs when two peaks are about the same height and are next to each other. For example

dat <- matrix(c(rnorm(3000, mean = 7, sd = 0.4),
                rnorm(3000, mean = 8, sd = 0.4)))
colnames(dat) = "d"
ff <- flowFrame(dat)

gate1 <- deGate(ff, "d", twin.factor = 0.9)
gate2 <- deGate(ff, "d", twin.factor = 0.85)


library(ggplot2)
library(ggcyto)

autoplot(ff, "d") + 
  geom_vline(xintercept = peaks$Peaks, color = "red") +
  geom_vline(xintercept = gate1, color = "blue") + 
  geom_vline(xintercept = gate2, color = "green")

Screenshot from 2023-09-01 11-00-34

Please note that when twin.factor is reduced to 0.85 the valley is no longer accepted and a value related to the inflection point of the distribution is returned.

Regarding getPeaks, the twin.factor parameter is not commonly (possibly never) used for calculating peaks since we are looking for peaks and not valleys. And looking at the code, the part using twin.factor is implemented differently than the way it is implemented in deGate. I would advise not to use twin.factor in getPeaks. So your comment of "The first peak is ignored as expected", is actually not an expected result.

One tip is if you want to get both valleys in your example you can use the all.cuts parameter in deGate. For example,

library(flowDensity)
library(flowCore)

set.seed(123)
dat <- matrix(c(rnorm(3000, mean = 5, sd = 0.5), 
                rnorm(3000, mean = 7, sd = 0.25),
                rnorm(1000, mean = 8, sd = 0.25)))
colnames(dat) = "d"
ff <- flowFrame(dat)

peaks <- getPeaks(ff, "d", twin.factor = 0.5)
gates <- deGate(ff, "d", all.cuts=T)

gates
[1] 6.153844 7.670830

Hope this helps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants