Hi, I am running cpdb_statistical_analysis_method from CellPhoneDB. My AnnData object has shape (6890, 2000), and I am using the following parameters:
import os
import pandas as pd
from cellphonedb.src.core.methods import cpdb_statistical_analysis_method

subsample = 'Control4003'  # sample name, matching the paths printed in the log output
cpdb_file_path = 'Resources/cellphonedb.zip'
meta_file_path = f'Data/{subsample}/CPDB_data/metadata.txt'
counts_file_path = f'Data/{subsample}/CPDB_data/counts.h5ad'
out_path = f'Data/{subsample}/CPDB_results/'
os.makedirs(out_path, exist_ok=True)
metadata = pd.read_csv(meta_file_path, sep = '\t')
cpdb_results = cpdb_statistical_analysis_method.call(
    cpdb_file_path = cpdb_file_path, # mandatory: CellPhoneDB database zip file.
    meta_file_path = meta_file_path, # mandatory: tsv file defining barcodes to cell label.
    counts_file_path = counts_file_path, # mandatory: normalized count matrix.
    counts_data = 'hgnc_symbol', # defines the gene annotation in counts matrix.
    iterations = 1000, # denotes the number of shufflings performed in the analysis.
    threshold = 0.1, # defines the min % of cells expressing a gene for it to be used in the analysis.
    threads = 40, # number of threads to use in the analysis.
    debug_seed = 42, # debug random seed; values >= 0 enable it.
    result_precision = 3, # Sets the rounding for the mean values in significant_means.
    pvalue = 0.05, # P-value threshold to employ for significance.
    subsampling = False, # To enable subsampling the data (geometric sketching).
    subsampling_log = False, # (mandatory) enable subsampling log1p for non log-transformed data inputs.
    subsampling_num_pc = 100, # Number of components to subsample via geometric sketching (default: 100).
    subsampling_num_cells = 1000, # Number of cells to subsample (integer) (default: 1/3 of the dataset).
    separator = '|', # Sets the string to employ to separate cells in the results dataframes "cellA|CellB".
    debug = False, # Saves all intermediate tables employed during the analysis in pkl format.
    output_path = out_path, # Path to save results.
    output_suffix = subsample # Replaces the timestamp in the output file names with a user-defined string (default: None).
)
I am getting the following error:
Reading user files...
The following user files were loaded successfully:
Data/Control4003/CPDB_data/counts.h5ad
Data/Control4003/CPDB_data/metadata.txt
[ ][CORE][23/03/24-20:24:37][INFO] [Cluster Statistical Analysis] Threshold:0.1 Iterations:1000 Debug-seed:42 Threads:40 Precision:3
[ ][CORE][23/03/24-20:24:37][WARNING] Debug random seed enabled. Set to 42
[ ][CORE][23/03/24-20:24:37][INFO] No CellphoneDB interactions found in this input.
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[6], line 11
8 os.makedirs(out_path, exist_ok=True)
9 metadata = pd.read_csv(meta_file_path, sep = '\t')
---> 11 cpdb_results = cpdb_statistical_analysis_method.call(
12 cpdb_file_path = cpdb_file_path, # mandatory: CellPhoneDB database zip file.
13 meta_file_path = meta_file_path, # mandatory: tsv file defining barcodes to cell label.
14 counts_file_path = counts_file_path, # mandatory: normalized count matrix.
15 counts_data = 'hgnc_symbol', # defines the gene annotation in counts matrix.
16 iterations = 1000, # denotes the number of shufflings performed in the analysis.
17 threshold = 0.1, # defines the min % of cells expressing a gene for this to be employed in the analysis.
18 threads = 40, # number of threads to use in the analysis.
19 debug_seed = 42, # debug randome seed. To disable >=0.
20 result_precision = 3, # Sets the rounding for the mean values in significan_means.
21 pvalue = 0.05, # P-value threshold to employ for significance.
22 subsampling = False, # To enable subsampling the data (geometri sketching).
23 subsampling_log = False, # (mandatory) enable subsampling log1p for non log-transformed data inputs.
24 subsampling_num_pc = 100, # Number of componets to subsample via geometric skectching (dafault: 100).
25 subsampling_num_cells = 1000, # Number of cells to subsample (integer) (default: 1/3 of the dataset).
26 separator = '|', # Sets the string to employ to separate cells in the results dataframes "cellA|CellB".
27 debug = False, # Saves all intermediate tables employed during the analysis in pkl format.
28 output_path = out_path, # Path to save results.
29 output_suffix = subsample # Replaces the timestamp in the output files by a user defined string in the (default: None).
30 )
File /gpfs/share/apps/anaconda3/gpu/5.2.0/envs/conda_tsirigoslab_transloc_env/lib/python3.8/site-packages/cellphonedb/src/core/methods/cpdb_statistical_analysis_method.py:148, in call(cpdb_file_path, meta_file_path, counts_file_path, counts_data, output_path, microenvs_file_path, active_tfs_file_path, iterations, threshold, threads, debug_seed, result_precision, pvalue, subsampling, subsampling_log, subsampling_num_pc, subsampling_num_cells, separator, debug, output_suffix, score_interactions)
124 counts = ss.subsample(counts)
126 analysis_result = cpdb_statistical_analysis_complex_method.call(meta.copy(),
127 counts,
128 counts_relations,
(...)
145 output_path
146 )
--> 148 significant_means = analysis_result['significant_means']
149 max_rank = significant_means['rank'].max()
150 significant_means['rank'] = significant_means['rank'].apply(lambda rank: rank if rank != 0 else (1 + max_rank))
KeyError: 'significant_means'
Can you please help me understand what this error means?
To be able to debug the issue, could you send the input files you are using to [email protected]? If the files are too big to share via email, you can also send us the link to access them.
Sorry we couldn't help you, since we haven't received your inputs. However, as mentioned in #186, where the same error was reported, it's possible that your analysis finds no CellPhoneDB interactions. This can happen when the genes in your counts matrix come from a different organism rather than human. If this is the case, you should convert the genes to their corresponding human orthologues. You can check the details in our documentation: https://cellphonedb.readthedocs.io/en/latest/RESULTS-DOCUMENTATION.html#counts-file
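A quick way to check the organism hypothesis before rerunning is to look at the gene symbols themselves. The snippet below is only a rough sketch, not part of CellPhoneDB: the helper names fraction_uppercase and to_human_orthologues are made up for illustration, and toy_map is a hypothetical stand-in for a real orthologue table. It relies on the convention that human HGNC symbols are mostly upper case (e.g. CD4) while mouse symbols are usually capitalized (e.g. Cd4); some human symbols do contain lowercase letters, so treat the result as a hint rather than proof.

```python
from typing import Dict, Iterable, List, Tuple


def fraction_uppercase(genes: Iterable[str]) -> float:
    """Return the fraction of symbols that contain no lowercase letters."""
    genes = list(genes)
    if not genes:
        return 0.0
    return sum(1 for g in genes if g == g.upper()) / len(genes)


def to_human_orthologues(
    genes: Iterable[str], orthologue_map: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Map symbols through an orthologue table.

    Returns (converted, unmapped) so that genes without a known
    orthologue can be inspected or dropped explicitly.
    """
    converted, unmapped = [], []
    for g in genes:
        if g in orthologue_map:
            converted.append(orthologue_map[g])
        else:
            unmapped.append(g)
    return converted, unmapped


# Toy input: mouse-style symbols (illustrative only).
mouse_like = ["Cd4", "Tgfb1", "Il2ra", "Ccl5"]
# Hypothetical mapping; a real run would need a full orthologue table.
toy_map = {"Cd4": "CD4", "Tgfb1": "TGFB1", "Il2ra": "IL2RA"}

before = fraction_uppercase(mouse_like)
converted, unmapped = to_human_orthologues(mouse_like, toy_map)
after = fraction_uppercase(converted)

print(f"upper-case fraction before: {before:.2f}, after: {after:.2f}")
print("genes without a mapping:", unmapped)
```

For an AnnData input, the symbols would typically be passed in as adata.var_names, assuming that is where the gene names are stored. A low upper-case fraction suggests the counts matrix is not annotated with human gene symbols, which would be consistent with the "No CellphoneDB interactions found in this input" message in the log.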