Typos (theislab#237)

* first typo test commit
* further typos
* chapter 4 some typos
* chapter 7 some typos
* chapter 8 some typos
* chapter 9 some typos
* chapter 10 some typos
* chapter 11 some typos
* chapter 12 some typos
* chapter 13 some typos
* some typos in compositional analysis chapter
* some typos in gsea
* typos gsea pert domains neighborhood spatialvar
* some typos spatial deconvolution
* some typos in imputation, including corrected chapter numbering
* Remove unfinished sentence

---------

Co-authored-by: Eljas <[email protected]>
Co-authored-by: zethson <[email protected]>
wisdomadingo · Sep 13, 2023 · a2f3eb8
1 parent 076a9cf

Showing 20 changed files with 166 additions and 180 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,7 +1,6 @@
 # Contributing
 
 We highly welcome community contributions and encourage contributions.
-There are several
 
 ## Book architecture
 

diff --git a/jupyter-book/cellular_structure/annotation.bib b/jupyter-book/cellular_structure/annotation.bib
@@ -60,7 +60,7 @@ @article{anno:Conde2022
 eprint = {https://www.science.org/doi/pdf/10.1126/science.abl5197},
 abstract = {Despite their crucial role in health and disease, our knowledge of immune cells within human tissues remains limited. We surveyed the immune compartment of 16 tissues from 12 adult donors by single-cell RNA sequencing and VDJ sequencing generating a dataset of ~360,000 cells. To systematically resolve immune cell heterogeneity across tissues, we developed CellTypist, a machine learning tool for rapid and precise cell type annotation. Using this approach, combined with detailed curation, we determined the tissue distribution of finely phenotyped immune cell types, revealing hitherto unappreciated tissue-specific features and clonal architecture of T and B cells. Our multitissue approach lays the foundation for identifying highly resolved immune cell types by leveraging a common reference dataset, tissue-integrated expression analysis, and antigen receptor sequencing. The human immune system is composed of many different cell types spread across the entire body, but little is currently known about the fine-grained variations in these cell types across organs. Using single-cell genomics, Domínguez Conde et al. examined the gene expression profile of more than 300,000 individual immune cells extracted from 16 different tissues in 12 deceased adult organ donors (see the Perspective by Liu and Zhang). Cell identity was assigned using CellTypist, an automated cell classification tool designed by the authors. In-depth data analysis revealed insights into how the immune system adapts to function effectively in different organ contexts. —LZ and DJ An immune cell atlas of human innate and adaptive immune cells across lymphoid, mucosal, and exocrine sites reveals tissue-specific compositions and features.}}
 
-@article {Pullin2022.05.09.490241,
+@article{anno:Pullin2022.05.09.490241,
 	author = {Pullin, Jeffrey M. and McCarthy, Davis J.},
 	title = {A comparison of marker gene selection methods for single-cell RNA sequencing data},
 	elocation-id = {2022.05.09.490241},
@@ -205,7 +205,7 @@ @article{anno:SHI20222234
 keywords = {multiple sclerosis, autoreactive T cells, bone marrow, myelopoiesis, neuroinflammation},
 }
 
-@ARTICLE{Wang1998-rx,
+@ARTICLE{anno:Wang1998-rx,
   title     = "The {TEL/ETV6} gene is required specifically for hematopoiesis
                in the bone marrow",
   author    = "Wang, L C and Swat, W and Fujiwara, Y and Davidson, L and
@@ -344,7 +344,7 @@ @Article{anno:Zhang2019
 url={https://doi.org/10.1038/s41592-019-0529-1}
 }
 
-@ARTICLE{Lopez2018-zc,
+@ARTICLE{anno:Lopez2018-zc,
   title     = "Deep generative modeling for single-cell transcriptomics",
   author    = "Lopez, Romain and Regier, Jeffrey and Cole, Michael B and
                Jordan, Michael I and Yosef, Nir",

diff --git a/jupyter-book/cellular_structure/annotation.ipynb b/jupyter-book/cellular_structure/annotation.ipynb
@@ -144,7 +144,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Load data:"
+    "## Load data"
    ]
   },
   {
@@ -329,7 +329,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To start we store our raw counts in .layers['counts'], so that we will still have access to them later if needed. We then set our adata.X to the SCRAN-normalized, log-transformed counts."
+    "To start we store our raw counts in `.layers['counts']`, so that we will still have access to them later if needed. We then set our `adata.X` to the scran-normalized, log-transformed counts."
    ]
   },
   {
@@ -1179,7 +1179,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Classifiers based on a wider set of genes. "
+    "### Classifiers based on a wider set of genes"
    ]
   },
   {
@@ -1794,7 +1794,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "### Annotation by mapping to a reference."
+    "### Annotation by mapping to a reference"
    ]
   },
   {
@@ -2349,15 +2349,15 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "As you can see it has only 10 dimensions (in .X) which together represent the latent space embedding of the reference cells. Our query embedding that we calculated for our own data also has 10 dimensions. The 10 dimensions of the reference and query are the same and can be combined!<br>\n",
-    "Moreover, it has cell type labels in .obs['cell_type']. We will use these labels to annotate our own data."
+    "As you can see it has only 10 dimensions (in `.X`) which together represent the latent space embedding of the reference cells. Our query embedding that we calculated for our own data also has 10 dimensions. The 10 dimensions of the reference and query are the same and can be combined!<br>\n",
+    "Moreover, it has cell type labels in `.obs['cell_type']`. We will use these labels to annotate our own data."
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To perform the label transfer, we will first concatenate the reference and query data using the 10-dimensional embedding. To get there, we will create the same type of AnnData object from our query data as we have from the reference (with the embedding under .X) and concatenate the two. With that, we can jointly analyze reference and query including doing transfer from one to the other."
+    "To perform the label transfer, we will first concatenate the reference and query data using the 10-dimensional embedding. To get there, we will create the same type of AnnData object from our query data as we have from the reference (with the embedding under `.X`) and concatenate the two. With that, we can jointly analyze reference and query including doing transfer from one to the other."
    ]
   },
   {
@@ -2528,7 +2528,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Let's perform the knn-based label transfer. "
+    "Let's perform the KNN-based label transfer. "
    ]
   },
   {
@@ -2972,7 +2972,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The uncertainty not only helps us identify regions where the algorithm is uncertain about which cell type a cell belongs to (e.g. because it falls in between two annotated phenotypes), but can also highlight unseen cell types or new cell states. For example, your reference might consist of heathly cells while your query could be from a diseased sample. The uncertainty score can then highlight disease-specific cell states, as they migh not have neighbors from the reference that consistently come from a single cell type. Especially when your reference is based on a large set of datasets, the uncertainty score is useful to flag parts of the query data that could be interesting to look into. Reference-based label transfer thus not only helps you annotate your data, but can also speed up exploration and interpretation of your data. However, like any metric, these uncertainty scores are often not perfect and in some cases fail to highlight new cell types or states. For a more extensive discussion of uncertainty metrics, see e.g. {cite}`anno:Engelmann2019`."
+    "The uncertainty not only helps us identify regions where the algorithm is uncertain about which cell type a cell belongs to (e.g. because it falls in between two annotated phenotypes), but can also highlight unseen cell types or new cell states. For example, your reference might consist of healthy cells while your query could be from a diseased sample. The uncertainty score can then highlight disease-specific cell states, as they might not have neighbors from the reference that consistently come from a single cell type. Especially when your reference is based on a large set of datasets, the uncertainty score is useful to flag parts of the query data that could be interesting to look into. Reference-based label transfer thus not only helps you annotate your data, but can also speed up exploration and interpretation of your data. However, like any metric, these uncertainty scores are often not perfect and in some cases fail to highlight new cell types or states. For a more extensive discussion of uncertainty metrics, see e.g. {cite}`anno:Engelmann2019`."
    ]
   },
   {

diff --git a/jupyter-book/cellular_structure/clustering.ipynb b/jupyter-book/cellular_structure/clustering.ipynb
diff --git a/jupyter-book/cellular_structure/integration.ipynb b/jupyter-book/cellular_structure/integration.ipynb
@@ -71,7 +71,7 @@
    "source": [
     "### Batch removal complexity\n",
     "\n",
-    "The removal of batch effects in scRNA-seq data has previously been divided into two subtasks: batch correction and data integration {cite}`Luecken2019-og`. These subtasks differ in the complexity of the batch effect that must be removed. Batch correction methods deal with batch effects between samples in the same experiment where cell identity compositions are consistent, and the effect is often quasi-linear. In contrast, data integration methods deal with complex, often nested, batch effects between datasets that may be generated with different protocols and where cell identities may not be shared across batches. While we use this distinction here we should not that these terms are often used interchangeably in general use. Given the differences in complexity, it is not surprising that different methods have been benchmarked as being optimal for these two subtasks."
+    "The removal of batch effects in scRNA-seq data has previously been divided into two subtasks: batch correction and data integration {cite}`Luecken2019-og`. These subtasks differ in the complexity of the batch effect that must be removed. Batch correction methods deal with batch effects between samples in the same experiment where cell identity compositions are consistent, and the effect is often quasi-linear. In contrast, data integration methods deal with complex, often nested, batch effects between datasets that may be generated with different protocols and where cell identities may not be shared across batches. While we use this distinction here we should note that these terms are often used interchangeably in general use. Given the differences in complexity, it is not surprising that different methods have been benchmarked as being optimal for these two subtasks."
    ]
   },
   {
@@ -572,14 +572,6 @@
     "```"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "8ee0a84f-eb5e-4e0e-b30d-799c0db6d809",
-   "metadata": {},
-   "source": [
-    "## Unintegrated data"
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "e0528714-7701-4548-b1a7-c5d27bdde00f",
@@ -2234,7 +2226,7 @@
    "id": "a1a51a52",
    "metadata": {},
    "source": [
-    "The prepared AnnDAta is now available in R as a SingleCellExperiment object thanks to **anndata2ri**. Note that this is transposed compared to an AnnData object so our observations (cells) are now the columns and our variables (genes) are now the rows."
+    "The prepared AnnData is now available in R as a SingleCellExperiment object thanks to **anndata2ri**. Note that this is transposed compared to an AnnData object so our observations (cells) are now the columns and our variables (genes) are now the rows."
    ]
   },
   {