diff --git a/examples/10X_P50/README.md b/examples/10X_P50/README.md
index fa959b4..4fbe1fd 100644
--- a/examples/10X_P50/README.md
+++ b/examples/10X_P50/README.md
@@ -2,7 +2,8 @@
In this example, we will be analyzing a dataset of 5K adult mouse brain cells freely available from 10X. The raw data can be downloaded from [here](https://support.10xgenomics.com/single-cell-atac/datasets/1.1.0/atac_v1_adult_brain_fresh_5k).
-**Step 1. Download the data**.
+**Step 0. Download the data**.
+In this example, we start from the `fragments.tsv.gz` file created by Cell Ranger ATAC.
```bash
$ wget http://cf.10xgenomics.com/samples/cell-atac/1.1.0/atac_v1_adult_brain_fresh_5k/atac_v1_adult_brain_fresh_5k_fragments.tsv.gz
@@ -45,16 +46,17 @@ CM - Total number of chrM fragments: 0
```
**Step 2. Create cell-by-bin matrix (snaptools)**
-Using snap file, we next create the cell-by-bin matrix. Snap file allows for storing cell-by-bin matrices of different resolutions. In the below example, as a demonstration, we create two cell-by-bin matrices with bin size of 5,000. But we find 5,000 is usually a good bin size, recommand to only generate cell-by-bin matrix of 5,000 in the future. (**Note that this does not create a new file, cell-by-bin matrix is stored in `atac_v1_adult_brain_fresh_5k.snap`**)
+Using the snap file, we next create the cell-by-bin matrix. A snap file can store cell-by-bin matrices at several resolutions. In the example below, as a demonstration, we create two cell-by-bin matrices with bin sizes of 1,000 and 5,000. We find that 5,000 is usually a good bin size, so in practice we recommend generating only the 5,000 cell-by-bin matrix. (**Note that this does not create a new file; the cell-by-bin matrices are stored in `atac_v1_adult_brain_fresh_5k.snap`**)
```bash
$ snaptools snap-add-bmat \
--snap-file=atac_v1_adult_brain_fresh_5k.snap \
- --bin-size-lis 5000 \
+ --bin-size-lis 1000 5000 \
--verbose=True
```
-**Step 3. Barcode selection (SnapATAC)**
+**Step 3. Barcode selection**
+We select high-quality barcodes based on two criteria: 1) the number of filtered fragments; 2) the fragments-in-promoter ratio (FRiP).
```R
> library(SnapATAC);
@@ -80,13 +82,13 @@ number of peaks: 0
-**Step 4. Bin size selection (SnapATAC)**
+**Step 4. Add cell-by-bin matrix to existing snap object**
Here we use cell-by-bin matrix of 5kb resolution as input for clustering. See [How to choose the bin size?](https://github.com/r3fang/SnapATAC/wiki/FAQs#bin_size)
```R
# show what bin sizes exist in atac_v1_adult_brain_fresh_5k.snap file
> showBinSizes("atac_v1_adult_brain_fresh_5k.snap");
-[1] 5000
+[1] 1000 5000
> x.sp = addBmatToSnap(x.sp, bin.size=5000, num.cores=1);
```
@@ -98,7 +100,7 @@ We next convert the cell-by-bin count matrix to a binary matrix. We found some i
```
**Step 6. Bin filtration (SnapATAC)**
-We next filter out any bins overlapping with the [ENCODE blacklist](http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/) and bins belonging to chrM or random chromsomes to prevent from any potential artifacts.
+We next filter out any bins overlapping with the [ENCODE blacklist](http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/) and bins belonging to unwanted chromosomes, such as chrM, random chromosomes, or sex chromosomes, to prevent potential artifacts.
```R
> system("wget http://mitra.stanford.edu/kundaje/akundaje/release/blacklists/mm10-mouse/mm10.blacklist.bed.gz");
@@ -184,7 +186,6 @@ Using selected significant PCs, we next construct a K Nearest Neighbor (KNN) Gra
```
**Step 10. Clustering**
-
Next, we use leiden for clustering which allows for choosing different resolution resulting different clustering results. It requires R package `leiden` to be pre-installed [instruction](https://cran.r-project.org/web/packages/leiden/vignettes/run_leiden.html).
```R
@@ -219,7 +220,6 @@ SnapATAC visualize the datausing tSNE, UMAP and FIt-sne. In the following examp
```
**Step 12. Visulization**
-SnapATAC provides flexible visualization.
```R
> plotViz(
@@ -245,7 +245,7 @@ SnapATAC provides flexible visualization.
-**Step 12. Gene-body based annotation for expected cell types (SnapATAC)**
+**Step 13. Gene-body based annotation for expected cell types (SnapATAC)**
To help annotate identified cell clusters, SnapATAC next creates the cell-by-gene matrix and visualize the enrichment of marker genes.
```R
@@ -305,7 +305,7 @@ To help annotate identified cell clusters, SnapATAC next creates the cell-by-gen
-**Step 13. Heretical clustering of the clusters (SnapATAC)**
+**Step 14. Hierarchical clustering of the clusters (SnapATAC)**
```R
# calculate the ensemble signals for each cluster
@@ -319,7 +319,7 @@ To help annotate identified cell clusters, SnapATAC next creates the cell-by-gen
-**Step 16. Gene-body based annotation for excitatory neurons**
+**Step 15. Gene-body based annotation for excitatory neurons**
We next extracted the clusters belonging to excitatory neurons based on the gene accessibility level for Slc17a7 and plot layer-specific marker genes enrichment.
```R
@@ -372,7 +372,7 @@ We next extracted the clusters belonging to excitatory neurons based on the gene
-**Step 17. Identify cis-elements for each cluster seperately**
+**Step 16. Identify cis-elements for each cluster separately**
This will create `nrrowPeak` and `.bedGraph` file that contains the peak and track for the given cluster. In the below example, SnapATAC creates `atac_v1_adult_brain_fresh_5k.1_peaks.narrowPeak` and `atac_v1_adult_brain_fresh_5k_treat_pileup.bdg`. `atac_v1_adult_brain_fresh_5k_treat_pileup.bdg` can later be converted to `bigWig` file for visulization using (`bedGraphToBigWig`)(https://anaconda.org/bioconda/ucsc-bedgraphtobigwig).
```R
@@ -397,7 +397,7 @@ After converting the `bedGraph` file to `bigWig` file, we next visulize the cell
-**Step 18. Create a cell-by-peak matrix**
+**Step 17. Create a cell-by-peak matrix**
Using merged peaks as a reference, we next create a cell-by-peak matrix using the original snap file.
```R
@@ -416,7 +416,7 @@ Using merged peaks as a reference, we next create a cell-by-peak matrix using th
```
-**Step 19. Identify Differentially Accessible Regions (DARs)**
+**Step 18. Identify Differentially Accessible Regions (DARs)**
SnapATAC can help find differentially accessible regions (DARs) that define clusters via differential analysis. By default, it identifes positive peaks of a single cluster (specified in `cluster.pos`), compared to a group of negative control cells.
```R