SpaCCC: Large language model-based cell-cell communication inference for spatially resolved transcriptomic data
===========================================================================
Drawing parallels between linguistic constructs and cellular biology, large language models (LLMs) have achieved success in diverse downstream applications for single-cell data analysis. However, methods that leverage LLMs to infer ligand-receptor (LR)-mediated cell-cell communication from spatially resolved transcriptomic data are still lacking. Here, we propose SpaCCC to facilitate the inference of spatially resolved cell-cell communication. It relies on our fine-tuned single-cell LLM and a functional gene interaction network to embed ligand and receptor genes expressed in interacting individual cells into a unified latent space. LR pairs that lie significantly closer together in this latent space are considered more likely to interact. Molecular diffusion and permutation test strategies are then employed to calculate communication strength and to filter out communications with low specificity, respectively. SpaCCC's performance is benchmarked on real single-cell spatial transcriptomic datasets, where it outperforms other methods. SpaCCC also infers known LR pairs concealed by existing aggregative methods and identifies communication patterns for specific cell types and their signaling pathways. Furthermore, SpaCCC provides various cell-cell communication visualizations at both single-cell and cell-type resolution. In summary, SpaCCC is a sophisticated and practical tool that allows researchers to decipher spatially resolved cell-cell communication, along with the related communication patterns and signaling pathways, from spatial transcriptome data.
Overview of the SpaCCC workflow. (a) The input, the intermediate steps for inferring spatially resolved cell-cell communication, and the output of SpaCCC. (b) The detailed procedure for determining the statistical significance of ligand-receptor pairs and calculating cell-cell communication strength at single-cell and cell-type resolution. (c) The visualization types SpaCCC offers for deciphering spatially resolved cell-cell communication (see Section 3.1 of the manuscript for details).
SpaCCC has been tested under:
* Python 3.8.0
* Torch 1.12.0
* Scanpy 1.9.0
* Anndata 0.8.0
* R 4.2.2
* Numpy 1.23.5
* Other basic Python and R toolkits
Notes: The following tools, which are used to obtain the hidden embeddings of cells, ligands and receptors, are publicly available; installation details can be found in their GitHub READMEs.
- Install scGPT with `pip install scgpt "flash-attn<1.0.5"`; refer to the scGPT GitHub README if you encounter any issues.
- Install Node2vec+ with `pip install pecanpy`; refer to the PecanPy GitHub README if you encounter any issues.
- Install LIANA+ with `pip install liana`; refer to the LIANA+ GitHub README if you encounter any issues.
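To verify that your environment matches the tested versions above, you can run a quick check (a minimal sketch; the version pins come from the dependency list above):

```python
import anndata
import numpy
import scanpy
import torch

# Tested versions: Torch 1.12.0, Scanpy 1.9.0, Anndata 0.8.0, Numpy 1.23.5 (see the list above).
for mod in (torch, scanpy, anndata, numpy):
    print(f"{mod.__name__} {mod.__version__}")
```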
To reproduce our results:
Notes: We provide readers with four datasets, as detailed in the data descriptions below. To reduce computational overhead and make it easier to reproduce our code, the following tutorials use the smaller test data; processing other single-cell RNA-seq data follows the same core pipeline. Because of their large size, the datasets are hosted on Google Drive.
File name | Description |
---|---|
10x_visium_human_renal_cell_carcinoma | The human renal cell carcinoma dataset was obtained from GEO (Series Accession GSE175540). You can also download it from the STOmicsDB database (Dataset ID: STDS0000223), a comprehensive repository of spatial transcriptomics literature and datasets. This dataset contains 4510 cells of 5 cell types (Supplementary Fig. 1a). |
10x_visium_human_breast_cancer | The human breast cancer dataset was obtained from the 10x Genomics website (Visium Demonstration, Human Breast Cancer, Block A Section 1). It contains 3798 cells of 9 cell types (Supplementary Fig. 1b). |
MERFISH_human_ovarian_cancer_patient2_slice2 | The human ovarian cancer patient 2 slice 2 dataset was obtained from the Vizgen website (https://console.cloud.google.com/storage/browser/vz-ffpe-showcase/HumanOvarianCancerPatient2Slice2). |
GeoMx_DSP_human_non_small_cell_lung_cancer | The human non-small cell lung cancer brain metastasis dataset was obtained from GEO (Series Accession GSE200563). |
Notes: If you already have cell-type labels in your dataset (e.g., adata.obs.cell_type in the h5ad files we provide), skip this step.
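For a quick check, you can inspect the h5ad file directly (a minimal sketch using anndata; the `cell_type` column name matches the files we provide):

```python
import anndata as ad

# Load the spatial transcriptomic data (h5ad format).
adata = ad.read_h5ad("/home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad")

# If a cell_type column is present, the annotation step below can be skipped.
if "cell_type" in adata.obs.columns:
    print(adata.obs["cell_type"].value_counts())
else:
    print("No cell-type labels found; run cell_type_anno_finetune.py first.")
```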
# The following program needs to be run in the scGPT environment; see [scGPT](https://github.com/bowang-lab/scGPT) for details on how to use it:
python /home/jby2/SpaCCC/cell_type_anno_finetune.py --filename /home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad --dataset_name BRCA_Visium_10x_tmp --load_model /home/jby2/SpaCCC/scGPT_their_example/scGPT_human --save_dir /home/jby2/SpaCCC/results
Arguments:
Arguments | Detail |
---|---|
filename | The file path of the single-cell RNA-seq data (h5ad format required) |
dataset_name | Dataset name |
load_model | The folder path of the pretrained scGPT model (you can download it from the link) |
save_dir | The folder path for saving the results (the directory will be created automatically) |
/home/jby2/anaconda3/envs/scgptt/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
warn(f"Failed to load image Python extension: {e}")
Global seed set to 0
############ ------------- SpaCCC --------------- ############
>>> arguments <<<
Namespace(filename='/home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad', dataset_name='BRCA_Visium_10x_tmp', load_model='/home/jby2/SpaCCC/scGPT_their_example/scGPT_human', save_dir='/home/jby2/SpaCCC/results')
>>> loading hyperparameter and data <<< Sun May 19 11:17:01 2024
wandb: Currently logged in as: jby236. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.4
wandb: Run data is saved locally in /home/jby2/data/wandb/run-20240519_111703-w8koygt1
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run bright-frog-221
wandb: ⭐️ View project at https://wandb.ai/jby236/scGPT
wandb: 🚀 View run at https://wandb.ai/jby236/scGPT/runs/w8koygt1
{'seed': 0, 'dataset_name': 'BRCA_Visium_10x_tmp', 'do_train': True, 'load_model': '/home/jby2/SpaCCC/scGPT_their_example/scGPT_human', 'mask_ratio': 0.0, 'epochs': 10, 'n_bins': 51, 'MVC': False, 'ecs_thres': 0.0, 'dab_weight': 0.0, 'lr': 0.0001, 'batch_size': 12, 'layer_size': 128, 'nlayers': 4, 'nhead': 4, 'dropout': 0.2, 'schedule_ratio': 0.9, 'save_eval_interval': 5, 'fast_transformer': True, 'pre_norm': False, 'amp': True, 'include_zero_gene': False, 'freeze': False, 'DSBN': False}
>>> settings for input and preprocessing <<< Sun May 19 11:17:14 2024
>>> input/output representation <<< Sun May 19 11:17:14 2024
>>> settings for training <<< Sun May 19 11:17:14 2024
save to /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-11-17
>>> Load and pre-process data <<< Sun May 19 11:17:14 2024
scGPT - INFO - match 18097/22240 genes in vocabulary of size 60697.
scGPT - INFO - Resume model from /home/jby2/SpaCCC/scGPT_their_example/scGPT_human/best_model.pt, the model args will override the config /home/jby2/SpaCCC/scGPT_their_example/scGPT_human/args.json.
>>> set up the preprocessor, use the args to config the workflow <<< Sun May 19 11:17:14 2024
scGPT - INFO - Normalizing total counts ...
scGPT - INFO - Binning data ...
scGPT - INFO - Normalizing total counts ...
scGPT - INFO - Binning data ...
scGPT - INFO - train set number of samples: 2734,
feature length: 3001
scGPT - INFO - valid set number of samples: 304,
feature length: 3001
>>> Load the pre-trained scGPT model <<< Sun May 19 11:17:22 2024
scGPT - INFO - Loading params encoder.embedding.weight with shape torch.Size([60697, 512])
scGPT - INFO - Loading params encoder.enc_norm.weight with shape torch.Size([512])
...
name: cls_decoder._decoder.5.bias
--------------------
name: cls_decoder.out_layer.weight
--------------------
name: cls_decoder.out_layer.bias
scGPT - INFO - Total Pre freeze Params 51336202
scGPT - INFO - Total Post freeze Params 51336202
>>> Finetune scGPT with task-specific objectives <<< Sun May 19 11:17:24 2024
random masking at epoch 1, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 1 | 100/228 batches | lr 0.0001 | ms/batch 368.21 | loss 1.59 | cls 1.59 | err 0.50 |
scGPT - INFO - | epoch 1 | 200/228 batches | lr 0.0001 | ms/batch 361.07 | loss 1.24 | cls 1.24 | err 0.40 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 1 | time: 85.86s | valid loss/mse 1.1115 | err 0.3553
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 1.1115
random masking at epoch 2, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 2 | 100/228 batches | lr 0.0001 | ms/batch 367.99 | loss 1.03 | cls 1.03 | err 0.35 |
scGPT - INFO - | epoch 2 | 200/228 batches | lr 0.0001 | ms/batch 364.89 | loss 0.88 | cls 0.88 | err 0.31 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 2 | time: 86.29s | valid loss/mse 0.7274 | err 0.2664
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.7274
random masking at epoch 3, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 3 | 100/228 batches | lr 0.0001 | ms/batch 370.11 | loss 0.81 | cls 0.81 | err 0.28 |
scGPT - INFO - | epoch 3 | 200/228 batches | lr 0.0001 | ms/batch 365.53 | loss 0.76 | cls 0.76 | err 0.27 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 3 | time: 86.58s | valid loss/mse 0.6702 | err 0.2632
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.6702
random masking at epoch 4, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 4 | 100/228 batches | lr 0.0001 | ms/batch 370.92 | loss 0.63 | cls 0.63 | err 0.22 |
scGPT - INFO - | epoch 4 | 200/228 batches | lr 0.0001 | ms/batch 366.96 | loss 0.66 | cls 0.66 | err 0.23 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 4 | time: 86.87s | valid loss/mse 0.5782 | err 0.2007
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.5782
random masking at epoch 5, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 5 | 100/228 batches | lr 0.0001 | ms/batch 374.61 | loss 0.57 | cls 0.57 | err 0.20 |
scGPT - INFO - | epoch 5 | 200/228 batches | lr 0.0001 | ms/batch 367.27 | loss 0.54 | cls 0.54 | err 0.18 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 5 | time: 87.29s | valid loss/mse 0.5129 | err 0.1875
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.5129
random masking at epoch 6, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 6 | 100/228 batches | lr 0.0001 | ms/batch 371.06 | loss 0.47 | cls 0.47 | err 0.17 |
scGPT - INFO - | epoch 6 | 200/228 batches | lr 0.0001 | ms/batch 366.60 | loss 0.44 | cls 0.44 | err 0.16 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 6 | time: 86.81s | valid loss/mse 0.5146 | err 0.1908
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 7, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 7 | 100/228 batches | lr 0.0001 | ms/batch 371.34 | loss 0.39 | cls 0.39 | err 0.14 |
scGPT - INFO - | epoch 7 | 200/228 batches | lr 0.0001 | ms/batch 366.52 | loss 0.35 | cls 0.35 | err 0.12 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 7 | time: 86.88s | valid loss/mse 0.6900 | err 0.2105
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 8, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 8 | 100/228 batches | lr 0.0000 | ms/batch 371.22 | loss 0.37 | cls 0.37 | err 0.13 |
scGPT - INFO - | epoch 8 | 200/228 batches | lr 0.0000 | ms/batch 366.34 | loss 0.30 | cls 0.30 | err 0.11 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 8 | time: 86.80s | valid loss/mse 0.7080 | err 0.2039
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 9, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 9 | 100/228 batches | lr 0.0000 | ms/batch 370.82 | loss 0.33 | cls 0.33 | err 0.11 |
scGPT - INFO - | epoch 9 | 200/228 batches | lr 0.0000 | ms/batch 367.14 | loss 0.30 | cls 0.30 | err 0.11 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 9 | time: 86.83s | valid loss/mse 0.7576 | err 0.1908
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 10, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 10 | 100/228 batches | lr 0.0000 | ms/batch 370.83 | loss 0.24 | cls 0.24 | err 0.08 |
scGPT - INFO - | epoch 10 | 200/228 batches | lr 0.0000 | ms/batch 365.80 | loss 0.26 | cls 0.26 | err 0.08 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 10 | time: 86.76s | valid loss/mse 0.9935 | err 0.1941
scGPT - INFO - -----------------------------------------------------------------------------------------
>>> Inference with fine-tuned scGPT model <<< Sun May 19 11:31:51 2024
scGPT - INFO - Accuracy: 0.820, Precision: 0.777, Recall: 0.676, Macro F1: 0.707
WARNING: saving figure to file /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-11-17/showtest_cell_type_results.pdf
WARNING: saving figure to file /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-11-17/showprediction_cell_type_results.pdf
WARNING: saving figure to file /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-11-17/showcell_type_results.pdf
>>> Save the model into the save_dir <<< Sun May 19 11:32:04 2024
wandb: \ 0.282 MB of 0.282 MB uploaded
wandb: Run history:
wandb: epoch ▁▂▃▃▄▅▆▆▇██
wandb: info/post_freeze_param_count ▁
wandb: info/pre_freeze_param_count ▁
wandb: test/accuracy ▁
wandb: test/macro_f1 ▁
wandb: test/precision ▁
wandb: test/recall ▁
wandb: train/cls █▆██▅▄▅▃▅▅▄▄▄▅▂▅▄▂▂▃▃▃▂▂▂▅▄▄▄▂▂▁▂▂▂▄▃▁▁▁
wandb: valid/dab ▁▁▁▁▁▁▁▁▁▁▁
wandb: valid/err █▄▄▂▁▁▂▂▁▁▂
wandb: valid/mse █▄▃▂▁▁▃▃▄▇▂
wandb: valid/sum_mse_dab █▄▃▂▁▁▃▃▄▇▂
wandb:
wandb: Run summary:
wandb: epoch 10
wandb: info/post_freeze_param_count 51336202
wandb: info/pre_freeze_param_count 51336202
wandb: test/accuracy 0.82074
wandb: test/macro_f1 0.70775
wandb: test/precision 0.77764
wandb: test/recall 0.67672
wandb: train/cls 0.00378
wandb: valid/err 0.18526
wandb:
wandb: 🚀 View run bright-frog-221 at: https://wandb.ai/jby236/scGPT/runs/w8koygt1
wandb: Synced 6 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: ./wandb/run-20240519_111703-w8koygt1/logs
# The following program needs to be run in the scGPT environment; see [scGPT](https://github.com/bowang-lab/scGPT) for details on how to use it:
python /home/jby2/SpaCCC/fine_tuning_gene_embeddings.py --filename /home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad --dataset_name BRCA_Visium_10x_tmp --load_model /home/jby2/SpaCCC/scGPT_their_example/scGPT_human --save_dir /home/jby2/SpaCCC/results
python /home/jby2/SpaCCC/grn_get_embedding.py --filename /home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad --model_dir /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17
Arguments:
Arguments | Detail |
---|---|
filename | The file path of the single-cell RNA-seq data (h5ad format required) |
dataset_name | Dataset name |
load_model | The folder path of the pretrained scGPT model (you can download it from the link) |
save_dir | The folder path for saving the results (the directory will be created automatically) |
model_dir | The folder path of the fine-tuned scGPT model |
/home/jby2/anaconda3/envs/scgptt/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
warn(f"Failed to load image Python extension: {e}")
Global seed set to 0
############ ------------- SpaCCC --------------- ############
>>> arguments <<<
Namespace(filename='/home/jby2/SpaCCC/data/BRCA_Visium_10x_tmp.h5ad', dataset_name='BRCA_Visium_10x_tmp', load_model='/home/jby2/SpaCCC/scGPT_their_example/scGPT_human', save_dir='/home/jby2/SpaCCC/results')
>>> loading hyperparameter and data <<< Sun May 19 15:17:11 2024
wandb: Currently logged in as: jby236. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.16.4
wandb: Run data is saved locally in /home/jby2/data/wandb/run-20240519_151714-2plynghc
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run apricot-hill-224
wandb: ⭐️ View project at https://wandb.ai/jby236/scGPT
wandb: 🚀 View run at https://wandb.ai/jby236/scGPT/runs/2plynghc
{'seed': 0, 'dataset_name': 'BRCA_Visium_10x_tmp', 'do_train': True, 'load_model': '/home/jby2/scGPT_their_example/scGPT_human', 'mask_ratio': 0.0, 'epochs': 10, 'n_bins': 51, 'MVC': False, 'ecs_thres': 0.0, 'dab_weight': 0.0, 'lr': 0.0001, 'batch_size': 12, 'layer_size': 128, 'nlayers': 4, 'nhead': 4, 'dropout': 0.2, 'schedule_ratio': 0.9, 'save_eval_interval': 5, 'fast_transformer': True, 'pre_norm': False, 'amp': True, 'include_zero_gene': False, 'freeze': False, 'DSBN': False}
>>> settings for input and preprocessing <<< Sun May 19 15:17:26 2024
>>> input/output representation <<< Sun May 19 15:17:26 2024
>>> settings for training <<< Sun May 19 15:17:26 2024
save to /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17
>>> Load and pre-process data <<< Sun May 19 15:17:26 2024
scGPT - INFO - match 18097/22240 genes in vocabulary of size 60697.
scGPT - INFO - Resume model from /home/jby2/SpaCCC/scGPT_their_example/scGPT_human/best_model.pt, the model args will override the config /home/jby2/SpaCCC/scGPT_their_example/scGPT_human/args.json.
>>> set up the preprocessor, use the args to config the workflow <<< Sun May 19 15:17:26 2024
scGPT - INFO - Normalizing total counts ...
scGPT - INFO - Binning data ...
scGPT - INFO - Normalizing total counts ...
scGPT - INFO - Binning data ...
scGPT - INFO - train set number of samples: 2734,
feature length: 3001
scGPT - INFO - valid set number of samples: 304,
feature length: 3001
>>> Load the pre-trained scGPT model <<< Sun May 19 15:17:34 2024
scGPT - INFO - Loading params encoder.embedding.weight with shape torch.Size([60697, 512])
scGPT - INFO - Loading params encoder.enc_norm.weight with shape torch.Size([512])
...
name: cls_decoder._decoder.5.bias
--------------------
name: cls_decoder.out_layer.weight
--------------------
name: cls_decoder.out_layer.bias
scGPT - INFO - Total Pre freeze Params 51336202
scGPT - INFO - Total Post freeze Params 51336202
>>> Finetune scGPT with task-specific objectives <<< Sun May 19 15:17:36 2024
random masking at epoch 1, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 1 | 100/228 batches | lr 0.0001 | ms/batch 371.36 | loss 1.59 | cls 1.59 | err 0.50 |
scGPT - INFO - | epoch 1 | 200/228 batches | lr 0.0001 | ms/batch 363.57 | loss 1.24 | cls 1.24 | err 0.40 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 1 | time: 86.51s | valid loss/mse 1.1115 | err 0.3553
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 1.1115
random masking at epoch 2, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 2 | 100/228 batches | lr 0.0001 | ms/batch 370.45 | loss 1.03 | cls 1.03 | err 0.35 |
scGPT - INFO - | epoch 2 | 200/228 batches | lr 0.0001 | ms/batch 365.53 | loss 0.88 | cls 0.88 | err 0.31 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 2 | time: 86.66s | valid loss/mse 0.7274 | err 0.2664
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.7274
random masking at epoch 3, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 3 | 100/228 batches | lr 0.0001 | ms/batch 371.72 | loss 0.81 | cls 0.81 | err 0.28 |
scGPT - INFO - | epoch 3 | 200/228 batches | lr 0.0001 | ms/batch 366.72 | loss 0.76 | cls 0.76 | err 0.27 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 3 | time: 86.95s | valid loss/mse 0.6702 | err 0.2632
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.6702
random masking at epoch 4, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 4 | 100/228 batches | lr 0.0001 | ms/batch 372.03 | loss 0.63 | cls 0.63 | err 0.22 |
scGPT - INFO - | epoch 4 | 200/228 batches | lr 0.0001 | ms/batch 367.04 | loss 0.66 | cls 0.66 | err 0.23 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 4 | time: 87.02s | valid loss/mse 0.5782 | err 0.2007
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.5782
random masking at epoch 5, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 5 | 100/228 batches | lr 0.0001 | ms/batch 374.14 | loss 0.57 | cls 0.57 | err 0.20 |
scGPT - INFO - | epoch 5 | 200/228 batches | lr 0.0001 | ms/batch 367.74 | loss 0.54 | cls 0.54 | err 0.18 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 5 | time: 87.26s | valid loss/mse 0.5129 | err 0.1875
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - Best model with score 0.5129
random masking at epoch 6, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 6 | 100/228 batches | lr 0.0001 | ms/batch 372.35 | loss 0.47 | cls 0.47 | err 0.17 |
scGPT - INFO - | epoch 6 | 200/228 batches | lr 0.0001 | ms/batch 367.12 | loss 0.44 | cls 0.44 | err 0.16 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 6 | time: 87.06s | valid loss/mse 0.5146 | err 0.1908
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 7, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 7 | 100/228 batches | lr 0.0001 | ms/batch 372.07 | loss 0.39 | cls 0.39 | err 0.14 |
scGPT - INFO - | epoch 7 | 200/228 batches | lr 0.0001 | ms/batch 366.75 | loss 0.35 | cls 0.35 | err 0.12 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 7 | time: 86.95s | valid loss/mse 0.6900 | err 0.2105
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 8, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 8 | 100/228 batches | lr 0.0000 | ms/batch 371.68 | loss 0.37 | cls 0.37 | err 0.13 |
scGPT - INFO - | epoch 8 | 200/228 batches | lr 0.0000 | ms/batch 366.29 | loss 0.30 | cls 0.30 | err 0.11 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 8 | time: 86.85s | valid loss/mse 0.7080 | err 0.2039
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 9, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 9 | 100/228 batches | lr 0.0000 | ms/batch 371.61 | loss 0.33 | cls 0.33 | err 0.11 |
scGPT - INFO - | epoch 9 | 200/228 batches | lr 0.0000 | ms/batch 367.19 | loss 0.30 | cls 0.30 | err 0.11 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 9 | time: 86.92s | valid loss/mse 0.7576 | err 0.1908
scGPT - INFO - -----------------------------------------------------------------------------------------
random masking at epoch 10, ratio of masked values in train: 0.0000
scGPT - INFO - | epoch 10 | 100/228 batches | lr 0.0000 | ms/batch 371.57 | loss 0.24 | cls 0.24 | err 0.08 |
scGPT - INFO - | epoch 10 | 200/228 batches | lr 0.0000 | ms/batch 365.88 | loss 0.26 | cls 0.26 | err 0.08 |
scGPT - INFO - -----------------------------------------------------------------------------------------
scGPT - INFO - | end of epoch 10 | time: 86.82s | valid loss/mse 0.9935 | err 0.1941
scGPT - INFO - -----------------------------------------------------------------------------------------
>>> Inference with fine-tuned scGPT model <<< Sun May 19 15:32:05 2024
>>> Save the model into the save_dir <<< Sun May 19 15:32:18 2024
############ ------------- SpaCCC --------------- ############
>>> Load pre-trained model <<< Sun May 19 16:27:10 2024
>>> Retrieve model parameters from config files <<< Sun May 19 16:27:10 2024
Resume model from /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17/best_model.pt, the model args will override the config /home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17/args.json.
Loading params encoder.embedding.weight with shape torch.Size([60697, 512])
Loading params encoder.enc_norm.weight with shape torch.Size([512])
...
Loading params cls_decoder.out_layer.weight with shape torch.Size([1, 512])
Loading params cls_decoder.out_layer.bias with shape torch.Size([1])
scGPT - INFO - Binning data ...
>>> Retrieve scGPT's gene embeddings <<< Sun May 19 16:27:23 2024
Retrieved gene embeddings for 17674 genes.
>>> Save gene embeddings to <<< Sun May 19 16:27:40 2024
/home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17/all_gene_embedding.csv
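To sanity-check the saved embeddings, you can load the CSV and measure how close a candidate ligand-receptor pair lies in the latent space (a minimal sketch; it assumes gene symbols as the row index with one embedding per row, and the gene pair shown is only illustrative):

```python
import numpy as np
import pandas as pd

# Fine-tuned scGPT gene embeddings (rows: genes, columns: latent dimensions).
emb = pd.read_csv(
    "/home/jby2/SpaCCC/results/dev_BRCA_Visium_10x_tmp-May19-15-17/all_gene_embedding.csv",
    index_col=0,
)

def cosine_distance(gene_a: str, gene_b: str) -> float:
    """Cosine distance between two gene embeddings; smaller means closer in latent space."""
    a, b = emb.loc[gene_a].to_numpy(), emb.loc[gene_b].to_numpy()
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Illustrative ligand-receptor pair; replace with pairs from your LR database.
print(cosine_distance("TGFB1", "TGFBR1"))
```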
# Protein links and information are first preprocessed
python /home/jby2/SpaCCC/preprocess_links_info.py --protein_links /home/jby2/SpaCCC/9606.protein.links.v12.0.txt --protein_info /home/jby2/SpaCCC/9606.protein.info.v12.0.txt --save_dir /home/jby2/SpaCCC/results
pecanpy --input "/home/jby2/SpaCCC/results/PPI.edg" --output /home/jby2/SpaCCC/results/PPI_n2vplus_epoch100.emb --mode SparseOTF --dimensions 512 --epochs 100 --workers 16 --p 0.05 --q 0.1 --weighted --verbose --gamma 1 --extend
python /home/jby2/SpaCCC/get_embedding_file.py --save_dir /home/jby2/SpaCCC/results
Arguments:
Arguments | Detail |
---|---|
protein_links | The file path of the protein links file. You can download it from the link |
protein_info | The file path of the protein information file. You can download it from the link |
save_dir | The folder path for saving the results (the directory will be created automatically) |
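Conceptually, this preprocessing converts the STRING protein links into a gene-symbol edge list (`PPI.edg`) that pecanpy can consume. Below is a minimal sketch of such a conversion, assuming the standard STRING v12.0 column layout; the actual `preprocess_links_info.py` may differ in details such as score filtering:

```python
import pandas as pd

# STRING protein info: maps stable protein IDs (e.g., 9606.ENSP...) to gene symbols.
info = pd.read_csv("9606.protein.info.v12.0.txt", sep="\t")
id_to_name = dict(zip(info["#string_protein_id"], info["preferred_name"]))

# STRING protein links: space-separated columns protein1, protein2, combined_score.
links = pd.read_csv("9606.protein.links.v12.0.txt", sep=" ")
links["protein1"] = links["protein1"].map(id_to_name)
links["protein2"] = links["protein2"].map(id_to_name)
links = links.dropna()

# Write a tab-separated weighted edge list (.edg) for pecanpy (run with --weighted).
links.to_csv("PPI.edg", sep="\t", header=False, index=False)
```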
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Took 00:02:45.09 to load Graph
Took 00:00:00.00 to pre-compute transition probabilities
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 196220/196220 [04:21<00:00, 751.23it/s]
Took 00:03:12.15 to generate walks
Took 00:03:24.95 to train embeddings
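The resulting `.emb` file follows the word2vec text format (a header line with node count and dimension, then one node per line), so it can be loaded, for example, as follows (a minimal sketch under that assumption):

```python
import pandas as pd

# pecanpy writes word2vec-style text output: first line "<n_nodes> <dim>",
# then one "<node> <v1> ... <v512>" line per node.
emb = pd.read_csv(
    "/home/jby2/SpaCCC/results/PPI_n2vplus_epoch100.emb",
    sep=" ",
    skiprows=1,
    header=None,
    index_col=0,
)
print(emb.shape)  # expected: (n_nodes, 512)
```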
python /home/jby2/SpaCCC/embedding_null_test.py --embedding_file /home/jby2/results/PPI_embedding.csv --LR_file /home/jby2/data/LR.csv --LR_result /home/jby2/results/PPI_embedding_df_enriched_LR.csv
python /home/jby2/SpaCCC/embedding_null_test.py --embedding_file /home/jby2/results/scGPT_embedding.csv --LR_file /home/jby2/data/LR.csv --LR_result /home/jby2/results/scGPT_embedding_df_enriched_LR.csv
Arguments:
Arguments | Detail |
---|---|
embedding_file | The file path of the ligand and receptor embeddings |
LR_file | The file path of the ligand-receptor pairs |
LR_result | The file path for saving the significant ligand-receptor pairs |
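Conceptually, the null test compares the embedding distance of each candidate LR pair against the distances of randomly drawn gene pairs and keeps pairs that are significantly closer. The sketch below illustrates such an empirical test; it is a simplified illustration, not the exact SpaCCC implementation, and the `ligand`/`receptor` column names are assumptions:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

emb = pd.read_csv("/home/jby2/results/PPI_embedding.csv", index_col=0)
lr = pd.read_csv("/home/jby2/data/LR.csv")  # assumed columns: ligand, receptor

def dist(a: str, b: str) -> float:
    return float(np.linalg.norm(emb.loc[a].to_numpy() - emb.loc[b].to_numpy()))

# Null distribution: distances between randomly paired genes.
genes = emb.index.to_numpy()
null = np.array([dist(*rng.choice(genes, 2, replace=False)) for _ in range(1000)])

records = []
for ligand, receptor in zip(lr["ligand"], lr["receptor"]):
    if ligand in emb.index and receptor in emb.index:
        d = dist(ligand, receptor)
        p = (null <= d).mean()  # empirical p-value: fraction of random pairs at least as close
        records.append((ligand, receptor, d, p))

enriched = pd.DataFrame(records, columns=["ligand", "receptor", "distance", "pval"])
print(enriched[enriched["pval"] < 0.05].head())
```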
5. Calculation of the communication strength at single-cell resolution and filtering out ligand-receptor interactions with low specificities
python /home/jby2/SpaCCC/ce_tensor_filter.py --filename /home/jby2/data/BRCA_Visium_10x_tmp.h5ad --LR_result /home/jby2/results/df_enriched_LR.csv --save_dir /home/jby2/results
Arguments:
Arguments | Detail |
---|---|
filename | The file path of the single-cell RNA-seq data (h5ad format required) |
LR_result | The file path of the significant ligand-receptor pairs |
save_dir | The folder path for saving the results (the directory will be created automatically) |
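Before this step, it can be useful to confirm that the significant LR genes actually appear in the expression matrix (a minimal sketch; the `ligand`/`receptor` column names are assumptions):

```python
import anndata as ad
import pandas as pd

adata = ad.read_h5ad("/home/jby2/data/BRCA_Visium_10x_tmp.h5ad")
lr = pd.read_csv("/home/jby2/results/df_enriched_LR.csv")  # assumed columns: ligand, receptor

lr_genes = set(lr["ligand"]) | set(lr["receptor"])
present = lr_genes & set(adata.var_names)
print(f"{len(present)}/{len(lr_genes)} significant LR genes found in the expression matrix")
```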
===========================================================================
All authors were involved in the conceptualization of the SpaCCC method. LWX and SLP conceived and supervised the project. BYJ and LWX designed the study and developed the approach. BYJ and DBQ collected the data. BYJ and LWX analyzed the results. BYJ, DBQ, LWX and SLP contributed to the review of the manuscript before submission for publication. All authors read and approved the final manuscript.
If you have any questions or comments, please feel free to email: Shaoliang Peng ([email protected]); Boya Ji ([email protected]).