From 116fe6d06dc8d27f0ef91a511127f2206489a2c5 Mon Sep 17 00:00:00 2001
From: Robert Petryszak <info@datasome.co.uk>
Date: Wed, 3 May 2023 10:55:25 +0100
Subject: [PATCH 1/9] Re-instating latest changes by Luz

---
 README.md                     | 47 ++++++++++++++-----------
 docs/RESULTS-DOCUMENTATION.md | 66 +++++++++++++++++++----------------
 2 files changed, 62 insertions(+), 51 deletions(-)

diff --git a/README.md b/README.md
index f527fa3c..b28507ec 100755
--- a/README.md
+++ b/README.md
@@ -3,42 +3,49 @@
 
 ## What is CellPhoneDB?
 
-CellPhoneDB is a publicly available repository of curated receptors, ligands and their interactions in **HUMAN**. CellPhoneDB can be used to search for a particular ligand/receptor, or interrogate your own single-cell transcriptomics data (or even bulk transcriptomics data if your samples represent pure populations!). 
+CellPhoneDB is a publicly available repository of **HUMAN** curated receptors, ligands and their interactions paired with a tool to interrogate your own single-cell transcriptomics data (or even bulk transcriptomics data if your samples represent pure populations!). 
 
-Subunit architecture is included for both ligands and receptors, representing heteromeric complexes accurately. This is crucial, as cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. CellPhoneDB also incorporates biosynthetic pathways in which we use the last representative enzyme as a proxy of ligand abundance, by doing so, we include interactions involving non-peptidic CellPhoneDB includes only manually curated & reviewd molecular interactions with evidenced role in cellular communication.
+> A distictive feature of CellPhoneDB is that the subunit architecture of either ligands and receptors is taken into account, representing heteromeric complexes accurately. This is crucial, as cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. CellPhoneDB also incorporates biosynthetic pathways in which we use the last representative enzyme as a proxy of ligand abundance, by doing so, we include interactions involving non-peptidic CellPhoneDB includes only manually curated & reviewd molecular interactions with evidenced role in cellular communication.
 
-For more details on the analysis check the [DOCUMENTATION](https://cellphonedb.readthedocs.io/en/latest/#). 
+For more details on using CellPhoneDB for scRNA-seq data analysis, check the [DOCUMENTATION](https://cellphonedb.readthedocs.io/en/latest/#). 
 
-Please cite our papers [Vento-Tormo R, Efremova M, et al., 2018](https://www.nature.com/articles/s41596-020-0292-x) (original CellphoneDB) or [Garcia-Alonso et al., 2021](https://www.nature.com/articles/s41586-018-0698-6) (for CellphoneDB method 3).
 
-
-## New in CellPhoneDB-data v4.1.0
-
-This release of CellphoneDB database integrates new manually reviewed interactions with evidenced roles in cell-cell communication together with existing datasets that pertain to cellular communication (such as Shilts *et al.* 2022 and Kanemura *et al.* 2023). Recently, the database expanded to include non-protein molecules acting as ligands.
-
-1. CellPhoneDB has been implemented as a python package, improving its efficiency and adding new methods, such as the CellPhoneDB results query function.
-2. Manually curated more protein-protein interactions involved in cell-cell communication, with a special focus on proteins acting as heteromeric complexes [cellphonedb-data v4.1.0](https://github.com/ventolab/cellphonedb-data). The new database includes more than [2,900 high-confidence interactions](https://www.cellphonedb.org/database.html), including heteromeric complexes. In this version we haved added new G-protein-coupled receptors interactions from Kanemura *et al.* 2023 and  Shilts *et al.* 2022.
-3. Interactions retrieved from external resources have been removed from this release to include high-confidence interactions only.
-4. [Tutorials](notebooks) for the new CellPhoneDB implementation.
+## Novel features in v4
+1) New Python package that can be easily executed in Jupyter Notebook and Google Colab.
+2) A new method to ease the query of CellPhoneDB results.
+3) Tutorials to run CellPhoneDB (available [here](https://github.com/ventolab/CellphoneDB/tree/master/notebooks))
+4) Improved computational efficiency of method 2 `cpdb_statistical_analysis_method`.
+5) A new database ([cellphonedb-data v4.1.0](https://github.com/ventolab/cellphonedb-data)) with more manually curated interactions, bringing the total to ~3,000 interactions. This release of the CellphoneDB database has three main changes:
+    - Integrates new manually reviewed interactions with evidenced roles in cell-cell communication. 
+    - Includes non-protein molecules acting as ligands.
+    - CellphoneDB no longer imports interactions from external resources. This is to avoid the inclusion of low-confidence interactions.
 
 See updates from [previous releases here](https://github.com/ventolab/CellphoneDB/blob/master/docs/RESULTS-DOCUMENTATION.md#release-notes).
 
 
 ## Installing CellPhoneDB 
-NOTE: Works with Python v3.8 or greater. If your default Python interpreter is for `v2.x` (you can check it with `python --version`), calls to `python`/`pip` should be substituted by `python3`/`pip3`.
 
 We highly recommend using an isolated python environment (as described in steps 1 and 2) using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or [virtualenv](https://docs.python.org/3/library/venv.html) but you could of course omit these steps and install via `pip` immediately.
 
 1. Create python=>3.8 environment
-- Using conda: `conda create -n cpdb python=3.8`
-- Using virtualenv: `python -m venv cpdb`
+   - Using conda: `conda create -n cpdb python=3.8`
+   - Using virtualenv: `python -m venv cpdb`
 
 2. Activate environment
-- Using conda: `source activate cpdb`
-- Using virtualenv: `source cpdb/bin/activate`
+   - Using conda: `source activate cpdb`
+   - Using virtualenv: `source cpdb/bin/activate`
 
 3. Install CellPhoneDB `pip install cellphonedb`
 
+4. Set up the kernel for the Jupyter notebooks.
+   - Install the ipython kernel: `pip install -U ipykernel`.
+   - Add the environment as a jupyter kernel: `python -m ipykernel install --user --name 'cpdb'`.
+   - Open/Start Jupyter and select the created kernel.
+
+5. Download the database.
+   - Follow this [tutorial](https://github.com/ventolab/CellphoneDB/blob/master/notebooks/T00_DownloadDB.ipynb).
+
+NOTE: Works with Python v3.8 or greater. If your default Python interpreter is for `v2.x` (you can check it with `python --version`), calls to `python`/`pip` should be substituted by `python3`/`pip3`.
 
 ## Running CellPhoneDB Methods
 
@@ -191,8 +198,8 @@ Currently CellPhoneDB relies on external plotting implementations to represent t
 
 Currently we recommend using tools such as: seaborn, ggplot or a more specific and tailored implementation as the ktplots:
 [@zktuong](https://github.com/zktuong):
-- [ktplots](https://www.github.com/zktuong/ktplots/) (R)
-- [ktplotspy](https://www.github.com/zktuong/ktplotspy/) (python)
+- [ktplots](https://www.github.com/zktuong/ktplots/) (R; preferred)
+- [ktplotspy](https://www.github.com/zktuong/ktplotspy/) (python; under development)
 
 
 ## Using different database versions
diff --git a/docs/RESULTS-DOCUMENTATION.md b/docs/RESULTS-DOCUMENTATION.md
index 5c815614..5362a9a8 100644
--- a/docs/RESULTS-DOCUMENTATION.md
+++ b/docs/RESULTS-DOCUMENTATION.md
@@ -20,15 +20,23 @@ CellPhoneDB tool provides different methods to assess cellular crosstalk between
 We highly recommend using an isolated python environment (as described in steps 1 and 2) using [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or [virtualenv](https://docs.python.org/3/library/venv.html) but you could of course omit these steps and install via `pip` immediately.
 
 1. Create python=>3.8 environment
-- Using conda: `conda create -n cpdb python=3.8`
-- Using virtualenv: `python -m venv cpdb`
+   - Using conda: `conda create -n cpdb python=3.8`
+   - Using virtualenv: `python -m venv cpdb`
 
 2. Activate environment
-- Using conda: `source activate cpdb`
-- Using virtualenv: `source cpdb/bin/activate`
+   - Using conda: `source activate cpdb`
+   - Using virtualenv: `source cpdb/bin/activate`
 
 3. Install CellPhoneDB `pip install cellphonedb`
 
+4. Set up the kernel for the Jupyter notebooks.
+   - Install the ipython kernel: `pip install -U ipykernel`.
+   - Add the environment as a jupyter kernel: `python -m ipykernel install --user --name 'cpdb'`.
+   - Open/Start Jupyter and select the created kernel.
+
+5. Download the database.
+   - Follow this [tutorial](https://github.com/ventolab/CellphoneDB/blob/master/notebooks/T00_DownloadDB.ipynb).
+
 > NOTE: Works with Python v3.8 or greater. If your default Python interpreter is for `v2.x` (you can check it with `python --version`), calls to `python`/`pip` should be substituted by `python3`/`pip3`.
 
 
@@ -247,7 +255,20 @@ CellphoneDB output is high-throughput. CellphoneDB provides all cell-cell intera
 
 It may be that not all of the cell-types of your input dataset co-appear in time and space. Cell types that do not co-appear in time and space will not interact. For example, you might have cells coming from different in vitro systems, different developmental stages or disease and control conditions. Use this prior information to restrict and ignore infeasible cell-type combinations from the outputs (i.e., columns) as well as their associated interactions (i.e. rows). You can restrict the analysis to feasible cell-type combinations using the option `microenvs`. Here you can input a two columns file indicating which cell type is in which spatiotemporal microenvironment (see [example](https://github.com/ventolab/CellphoneDB/blob/master/README.md#preparing-your-microenviroments-file-optional-if---microenvs) ). CellphoneDB will use this information to define possible pairs of interacting cells (i.e. pairs of clusters co-existing in a microenvironment) ignoring the rest of combinations.
 
-## Query CellPhoneDB results
+
+## Why values of clusterA-clusterB are different to the values of clusterB-clusterA?
+
+When __reading the outputs__, it is IMPORTANT to note that the interactions are not symmetric. Partner A expression is considered for the first cluster/cell type (clusterA), and partner B expression is considered for the second cluster/cell type (clusterB). Thus, `IL12`-`IL12 receptor` for clusterA-clusterB (i.e. the receptor is in clusterB) is not the same as `IL12`-`IL12 receptor` for clusterB-clusterA (i.e. the receptor is in clusterA), and will have different values.
+
+In other words:
+* clusterA_clusterB = clusterA expressing partner A and clusterB expressing partner B.
+* clusterA_clusterB and clusterB_clusterA  values will be different.
+
+
+![CellphoneDB methods](./interpreting_results.png)
+
+
+## How can I query my CellPhoneDB results?
 
 CellPhoneDB results can be queried by making use of the `search_analysis_results` method. This method requires two of the files generated by CellPhoneDB: `significant_means` and `deconvoluted`.
 
@@ -268,32 +289,17 @@ search_results = search_utils.search_analysis_results(
 ```
 
 
-## Why values of clusterA-clusterB are different to the values of clusterB-clusterA?
-
-When __reading the outputs__, is IMPORTANT to note that the interactions are not symmetric. Partner A expression is considered for the first cluster/cell type (clusterA), and partner B expression is considered on the second cluster/cell type (clusterB). Thus, `IL12`-`IL12 receptor` for clusterA-clusterB (i.e. the receptor is in clusterB) is not the same that `IL12`-`IL12 receptor` for clusterB-clusterA (i.e. the receptor is in clusterA), and will have different values.
-
-In other words:
-* clusterA_clusterB = clusterA expressing partner A and clusterB expressing partner B.
-* clusterA_clusterB and clusterB_clusterA  values will be different.
-
-
-
-![CellphoneDB methods](./interpreting_results.png)
-
-
 
 DATABASE of interactions
 ============================================
 
-CellphoneDB has its own database of interactions called **CellphoneDB-data**, which can be found at https://github.com/ventolab/cellphonedb-data 
+CellphoneDB has its own database of interactions called **CellphoneDB-data**, which can be found at https://github.com/ventolab/cellphonedb-data. The CellphoneDB database is a **manually curated** repository of receptors, ligands and their interactions.  
 
-CellphoneDB database (aka cellphonedb-data) is a **manually curated** repository of receptors, ligands and their interactions.  
 
-
-## Key features of CellphoneDB
-- Subunit architecture is included for both ligands and receptors, representing **heteromeric complexes** accurately. 
-This is crucial, as cell-cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. 
+#### Key features of CellphoneDB
+- Subunit architecture is included for both ligands and receptors, representing **heteromeric complexes** accurately. This is crucial, as cell-cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. 
 - Includes interactions involving **non-peptidic molecules** (i.e., not encoded by a gene) acting as ligands. Examples of these include steroid hormones (e.g., estrogen). To do so, we have reconstructed the biosynthetic pathways and used the last representative enzyme as a proxy of ligand abundance. We retrieve this information by manually reviewing and curating relevant literature and peer-reviewed pathway resources such as REACTOME. We include more than 200 interactions involving non-peptidic ligands!
+- Only includes HUMAN interactions. 
 
 
 ## Database design: input files
@@ -372,10 +378,8 @@ Currently CellPhoneDB relies on external plotting implementations to represent t
 
 We recommend using tools such as the ktplots:
 [@zktuong](https://github.com/zktuong):
-- [ktplots](https://www.github.com/zktuong/ktplots/) (R)
-- [ktplotspy](https://www.github.com/zktuong/ktplotspy/) (python)
-
-Or general tools such seaborn or ggplot.
+- [ktplots](https://www.github.com/zktuong/ktplots/) (R; preferred)
+- [ktplotspy](https://www.github.com/zktuong/ktplotspy/) (python; under development)
 
 
 Release notes
@@ -415,9 +419,8 @@ FAQs
 CellphoneDB accepts counts files in the following formats: as a text file (with columns indicating individual cells and rows indicating genes), as a h5ad (recommended), a h5 or a path to a folder containing a 10x output with mtx/barcode/features files.
 
 ### 2. How to extract the CellphoneDB input files from a Seurat object? 
-We recommend using normalised count data. This can be obtained by taking the raw data from the Seurat object and applying the normalisation manually. 
+We recommend using normalised count data. This can be obtained by taking the normalised slot from the Seurat object or by taking the raw data slot and applying the normalisation manually. The user can also normalise the data using their preferred method.
 
-The user can also normalise using their preferred method.
 
 ```R
 # R
@@ -447,9 +450,10 @@ You can provide an anndata as .h5ad file.
                    
                     
 ### 4. Should the input file with the count data be with HGNC symbols (gene names) or Ensembl IDs? 
+
 CellphoneDB.2 allows the use of both HGNC symbols and Ensembl IDs.
 
-Specify this with  `counts-data = hgnc_symbol`
+Please specify HGNC symbols with `counts-data = hgnc_symbol`.
 
 
 ### 5. What is the purpose of subsampling? 

From f95d2bea22b44a5aaf26532ac3d2afbdcb68df0c Mon Sep 17 00:00:00 2001
From: Prete <martin.prete@sanger.ac.uk>
Date: Wed, 3 May 2023 17:04:20 +0100
Subject: [PATCH 2/9] Update .gitignore

---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.gitignore b/.gitignore
index df265537..f63a5f63 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,3 +4,4 @@ cellphonedb/out
 __pycache__
 out
 docs/api/_build
+.DS_Store

From ec813c3aeb68ff46b8680eb44389b06f42116f30 Mon Sep 17 00:00:00 2001
From: Robert Petryszak <info@datasome.co.uk>
Date: Fri, 19 May 2023 15:59:39 +0100
Subject: [PATCH 3/9] Upgraded scikit-learn and geosketch to the latest version

---
 pyproject.toml   | 4 ++--
 requirements.txt | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/pyproject.toml b/pyproject.toml
index 6371590c..18500fdd 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -37,8 +37,8 @@ pandas = ">=1.5.0"
 numpy = ">=1.21.6"
 numpy-groupies = ">=0.9.15"
 requests = ">=2.25.0"
-scikit-learn = "==0.24"
-geosketch = "==1.2"
+scikit-learn = ">=1.2.2"
+geosketch = ">=1.2"
 anndata = ">=0.8"
 ktplotspy = ">=0.1.4"
 tqdm = ">=4.3,<5.0"
diff --git a/requirements.txt b/requirements.txt
index 54344362..cc5940c7 100755
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,8 +2,8 @@ pandas>=1.5.0
 numpy>=1.21.6
 numpy-groupies>=0.9.15
 requests>=2.25
-scikit-learn==0.24
-geosketch==1.2
+scikit-learn>=1.2.2
+geosketch>=1.2
 anndata>=0.8
 ktplotspy>=0.1.4
 tqdm>=4.3,<5.0

From e266ac0b384879d187edb760b98a9d173f33eff8 Mon Sep 17 00:00:00 2001
From: Robert Petryszak <info@datasome.co.uk>
Date: Tue, 13 Jun 2023 10:10:06 +0100
Subject: [PATCH 4/9] Fixed a bug causing 'InvalidIndexError: Reindexing only
 valid with uniquely valued Index objects' error in all three analysis methods
 when counts_data = ensembl

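For reference, a minimal sketch of what the changed line computes. It
illustrates the resulting column names only, not the code path that raised
the error. `gene_column_names` is a hypothetical helper that is not part of
the package; the two expressions inside it are copied from the diff below.

```python
def gene_column_names(prefix):
    # Hypothetical helper wrapping the two expressions touched by this patch.
    gene_columns = ['{}_{}'.format(prefix, suffix) for suffix in ('1', '2')]
    gene_renames = {column: 'gene_{}'.format(suffix)
                    for column, suffix in zip(gene_columns, ['a', 'b'])}
    return gene_columns, gene_renames


# Before: the prefix followed the user's counts_data setting.
old_columns, _ = gene_column_names('ensembl')
# After: the prefix is fixed, whatever counts_data is.
new_columns, new_renames = gene_column_names('gene_name')

print(old_columns)   # ['ensembl_1', 'ensembl_2']
print(new_columns)   # ['gene_name_1', 'gene_name_2']
print(new_renames)   # {'gene_name_1': 'gene_a', 'gene_name_2': 'gene_b'}
```
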
---
 cellphonedb/src/core/methods/cpdb_analysis_method.py          | 4 ++--
 cellphonedb/src/core/methods/cpdb_degs_analysis_method.py     | 4 ++--
 .../core/methods/cpdb_statistical_analysis_complex_method.py  | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/cellphonedb/src/core/methods/cpdb_analysis_method.py b/cellphonedb/src/core/methods/cpdb_analysis_method.py
index fe8b9835..ba2b656a 100755
--- a/cellphonedb/src/core/methods/cpdb_analysis_method.py
+++ b/cellphonedb/src/core/methods/cpdb_analysis_method.py
@@ -219,10 +219,10 @@ def simple_complex_indicator(interaction: pd.Series, suffix: str) -> str:
         percent_analysis)
     significant_means = significant_means.round(result_precision)
 
-    gene_columns = ['{}_{}'.format(counts_data, suffix) for suffix in ('1', '2')]
+    gene_columns = ['{}_{}'.format('gene_name', suffix) for suffix in ('1', '2')]
     gene_renames = {column: 'gene_{}'.format(suffix) for column, suffix in zip(gene_columns, ['a', 'b'])}
 
-    # Remove useless columns
+    # Remove superfluous columns
     interactions_data_result = pd.DataFrame(
         interactions[['id_cp_interaction', 'partner_a', 'partner_b', 'receptor_1', 'receptor_2', *gene_columns,
                       'annotation_strategy']].copy())
diff --git a/cellphonedb/src/core/methods/cpdb_degs_analysis_method.py b/cellphonedb/src/core/methods/cpdb_degs_analysis_method.py
index dc8ef9dd..745ce748 100755
--- a/cellphonedb/src/core/methods/cpdb_degs_analysis_method.py
+++ b/cellphonedb/src/core/methods/cpdb_degs_analysis_method.py
@@ -255,10 +255,10 @@ def simple_complex_indicator(interaction: pd.Series, suffix: str) -> str:
         real_mean_analysis, relevant_interactions)
     significant_means = significant_means.round(result_precision)
 
-    gene_columns = ['{}_{}'.format(counts_data, suffix) for suffix in ('1', '2')]
+    gene_columns = ['{}_{}'.format('gene_name', suffix) for suffix in ('1', '2')]
     gene_renames = {column: 'gene_{}'.format(suffix) for column, suffix in zip(gene_columns, ['a', 'b'])}
 
-    # Remove useless columns
+    # Remove superfluous columns
     interactions_data_result = pd.DataFrame(
         interactions[['id_cp_interaction', 'partner_a', 'partner_b', 'receptor_1', 'receptor_2', *gene_columns,
                       'annotation_strategy']].copy())
diff --git a/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py b/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py
index 628fa6a3..bfa442d1 100755
--- a/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py
+++ b/cellphonedb/src/core/methods/cpdb_statistical_analysis_complex_method.py
@@ -178,10 +178,10 @@ def simple_complex_indicator(interaction: pd.Series, suffix: str) -> str:
         real_mean_analysis, result_percent, pvalue)
     significant_means = significant_means.round(result_precision)
 
-    gene_columns = ['{}_{}'.format(counts_data, suffix) for suffix in ('1', '2')]
+    gene_columns = ['{}_{}'.format('gene_name', suffix) for suffix in ('1', '2')]
     gene_renames = {column: 'gene_{}'.format(suffix) for column, suffix in zip(gene_columns, ['a', 'b'])}
 
-    # Remove useless columns
+    # Remove superfluous columns
     interactions_data_result = pd.DataFrame(
         interactions[['id_cp_interaction', 'partner_a', 'partner_b', 'receptor_1', 'receptor_2', *gene_columns,
                       'annotation_strategy']].copy())

From 5760b74b74158d896bbec48f0d4767c5b4a447f3 Mon Sep 17 00:00:00 2001
From: ktroule <9655951+ktroule@users.noreply.github.com>
Date: Tue, 20 Jun 2023 09:05:12 +0100
Subject: [PATCH 5/9] Update T01_Method2_with_subsampling.ipynb

---
 notebooks/T01_Method2_with_subsampling.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notebooks/T01_Method2_with_subsampling.ipynb b/notebooks/T01_Method2_with_subsampling.ipynb
index 43f2390a..6337b96a 100644
--- a/notebooks/T01_Method2_with_subsampling.ipynb
+++ b/notebooks/T01_Method2_with_subsampling.ipynb
@@ -479,7 +479,7 @@
     "- **receptor A/B**: True if the first interacting partner (A) or the second (B) is annotated as a receptor in our database.\n",
     "- **annotation_strategy**: Curated if the interaction was annotated by the CellPhoneDB developers. Other value if it was added by the user.\n",
     "- **is_integrin**: True if one of the partners is integrin.\n",
-    "- **cell_a|cell_b**: 1 if interaction is detected as significant, 0 if not."
+    "- **cell_a|cell_b**: p-value obtained by random shuffling."
    ]
   },
   {

From 6ff1ba28b2d1a895202c7be961e4798ab176fe69 Mon Sep 17 00:00:00 2001
From: ktroule <9655951+ktroule@users.noreply.github.com>
Date: Tue, 20 Jun 2023 09:08:08 +0100
Subject: [PATCH 6/9] Update T01_Method2.ipynb

---
 notebooks/T01_Method2.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/notebooks/T01_Method2.ipynb b/notebooks/T01_Method2.ipynb
index 32993a8e..7ad8f7db 100644
--- a/notebooks/T01_Method2.ipynb
+++ b/notebooks/T01_Method2.ipynb
@@ -462,7 +462,7 @@
     "- **receptor A/B**: True if the first interacting partner (A) or the second (B) is annotated as a receptor in our database.\n",
     "- **annotation_strategy**: Curated if the interaction was annotated by the CellPhoneDB developers. Other value if it was added by the user.\n",
     "- **is_integrin**: True if one of the partners is integrin.\n",
-    "- **cell_a|cell_b**: 1 if interaction is detected as significant, 0 if not."
+    "- **cell_a|cell_b**: p-value obtained by random shuffling."
    ]
   },
   {

From d80006eada35f10f2e71eed23c7dae7b990a5e40 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dar=C3=ADo=20Here=C3=B1=C3=BA?= <magallania@gmail.com>
Date: Wed, 12 Jul 2023 14:49:23 -0300
Subject: [PATCH 7/9] Minor typo fixes (lines 8 et alia)

---
 README.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index b28507ec..02d91729 100755
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
 
 CellPhoneDB is a publicly available repository of **HUMAN** curated receptors, ligands and their interactions paired with a tool to interrogate your own single-cell transcriptomics data (or even bulk transcriptomics data if your samples represent pure populations!). 
 
-> A distictive feature of CellPhoneDB is that the subunit architecture of either ligands and receptors is taken into account, representing heteromeric complexes accurately. This is crucial, as cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. CellPhoneDB also incorporates biosynthetic pathways in which we use the last representative enzyme as a proxy of ligand abundance, by doing so, we include interactions involving non-peptidic CellPhoneDB includes only manually curated & reviewd molecular interactions with evidenced role in cellular communication.
+> A distinctive feature of CellPhoneDB is that the subunit architecture of both ligands and receptors is taken into account, representing heteromeric complexes accurately. This is crucial, as cell communication relies on multi-subunit protein complexes that go beyond the binary representation used in most databases and studies. CellPhoneDB also incorporates biosynthetic pathways in which we use the last representative enzyme as a proxy of ligand abundance; by doing so, we include interactions involving non-peptidic molecules acting as ligands. CellPhoneDB includes only manually curated & reviewed molecular interactions with an evidenced role in cellular communication.
 
 For more details on using CellPhoneDB for scRNA-seq data analysis, check the [DOCUMENTATION](https://cellphonedb.readthedocs.io/en/latest/#). 
 
@@ -62,12 +62,12 @@ Counts file can be a text file or a `h5ad` (recommended), `h5` or a path to a fo
 #### Preparing your DEGs file (optional, if `method degs_analysis`)
 This is a two columns file indicanting which gene is specific or upregulated in a cell type (see [example](in/endometrium_atlas_example/endometrium_example_DEGs.tsv) ). The first column should be the cell type/cluster name (matching those in `meta.txt`) and the second column the associated gene id. The remaining columns are ignored. We provide [notebooks](notebooks) for both Seurat and Scanpy users. It is on you to design a DEG analysis appropiated for your research question. 
 
-#### Preparing your microenviroments file (optional, if `microenvs_file_path`)
+#### Preparing your microenvironments file (optional, if `microenvs_file_path`)
 This is a two columns file indicating which cell type is in which spatial microenvironment (see [example](in/endometrium_atlas_example/endometrium_example_microenviroments.tsv) ). CellphoneDB will use this information to define possible pairs of interacting cells (i.e. pairs of clusters co-appearing in a microenvironment). 
 
 ### RUN examples
 
-For more detailed examples refer to out tutorials [here](notebooks).
+For more detailed examples refer to our tutorials [here](notebooks).
 ####  Example with running the DEG-based method
 ```shell
 from cellphonedb.src.core.methods import cpdb_degs_analysis_method
@@ -119,7 +119,7 @@ means, deconvoluted = cpdb_analysis_method.call(
         output_path = out_path)
 ```
 
-####  Example running a microenviroments file
+####  Example running a microenvironments file
 ```shell
 from cellphonedb.src.core.methods import cpdb_analysis_method
 
@@ -132,7 +132,7 @@ means, deconvoluted = cpdb_analysis_method.call(
         output_path = out_path)
 ```
 
-####  Example running the DEG-based method with microenviroments file
+####  Example running the DEG-based method with microenvironments file
 ```shell
 from cellphonedb.src.core.methods import cpdb_degs_analysis_method
 
@@ -164,7 +164,7 @@ To understand the different analysis and results, please check the [results docu
 
 
 ~ **Optional Method Statistical parameters**
-- `microenvs_file_path`: Spatial microenviroments input file. Restricts the cluster/cell_type interacting pairs to the cluster/cell_type sharing a microenviroment (i.e. only test a combination of clusters if these coexist in a microenviroment). This file should contain two columns: 1st column indicates the cluster/cell_type, 2nd column indicates the microenviroment name.  See example [here](https://github.com/ventolab/CellphoneDB/tree/master/in). 
+- `microenvs_file_path`: Spatial microenvironments input file. Restricts the cluster/cell_type interacting pairs to the cluster/cell_type sharing a microenvironment (i.e. only test a combination of clusters if these coexist in a microenvironment). This file should contain two columns: 1st column indicates the cluster/cell_type, 2nd column indicates the microenvironment name.  See example [here](https://github.com/ventolab/CellphoneDB/tree/master/in). 
 - `pvalue`: P-value threshold [0.05]
 - `debug_seed`: Debug random seed -1. To disable it please use a value >=0 [-1]
 - `threads`: Number of threads to use. >=1 [4]

From 0147e3ccac1e18988749c7274d20b26d43b92d26 Mon Sep 17 00:00:00 2001
From: Robert Petryszak <info@datasome.co.uk>
Date: Thu, 17 Aug 2023 14:34:54 +0100
Subject: [PATCH 8/9] A fix for complexes that have a colon in their name - it
 was breaking the search in cellphonedb.org

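A small illustration of the difference between the old and new expressions.
The field value is made up for this note, and the assumption that the prefix
is `complex:` is not confirmed by the diff below.

```python
# Hypothetical field value: an assumed "complex:" prefix followed by a
# complex name that itself contains colons.
field = "complex:NAME:WITH:COLON"

# Old expression: keeps only the text between the first and second colon.
old_name = field.split(":")[1]

# New expression: drops the prefix and re-joins everything after it.
new_name = ":".join(field.split(":")[1:])

print(old_name)  # NAME
print(new_name)  # NAME:WITH:COLON
```

With the old expression the truncated name would not match the full complex
name used as the lookup key in `complex_name2proteins`.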
---
 cellphonedb/utils/search_utils.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cellphonedb/utils/search_utils.py b/cellphonedb/utils/search_utils.py
index 68f5f93c..bc351a53 100644
--- a/cellphonedb/utils/search_utils.py
+++ b/cellphonedb/utils/search_utils.py
@@ -178,7 +178,7 @@ def get_html_table(data, complex_name2proteins, \
                 html += "<th style=\"text-align:left\">{}</th>".format(field)
             else:
                 if field.startswith(COMPLEX_PFX):
-                    name = field.split(":")[1]
+                    name = ":".join(field.split(":")[1:])
                     constituent_proteins = ', '.join(complex_name2proteins[name])
                     complex_mouseover = "Contains proteins: {}".format(constituent_proteins)
                     multi_protein_uniprot_url = get_uniprot_url(complex_name2proteins[name])

From 8f3deb4b6b0fec587058cbc34658a554e1afbf74 Mon Sep 17 00:00:00 2001
From: datasome <info@datasome.co.uk>
Date: Wed, 18 Oct 2023 10:34:02 +0100
Subject: [PATCH 9/9] Create python-app.yml

Prototyping GitHub action for CI testing and linting
---
 .github/workflows/python-app.yml | 40 ++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)
 create mode 100644 .github/workflows/python-app.yml

diff --git a/.github/workflows/python-app.yml b/.github/workflows/python-app.yml
new file mode 100644
index 00000000..ef229d98
--- /dev/null
+++ b/.github/workflows/python-app.yml
@@ -0,0 +1,40 @@
+# This workflow will install Python dependencies, run tests and lint with a single version of Python
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python
+
+name: Python application
+
+on:
+  push:
+    branches: [ "scoring" ]
+  pull_request:
+    branches: [ "scoring" ]
+
+permissions:
+  contents: read
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python 3.8
+      uses: actions/setup-python@v3
+      with:
+        python-version: "3.8"
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install flake8 pytest
+        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+    - name: Lint with flake8
+      run: |
+        # stop the build if there are Python syntax errors or undefined names
+        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Test with pytest
+      run: |
+        cd cellphonedb/src/tests
+        pytest method_tests.py