1st Place Solution Summary

You can find the code on GitHub here.

Multiome

Model Overview

Input Processing

Target Preprocessing

tSVD-based imputation method:

  1. Perform dimensionality reduction on the data with tSVD.
  2. Transform the reduced data back to the original space.
  3. Replace the zero entries of the original data with the corresponding back-transformed values.
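The three steps above can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name `tsvd_impute`, the toy data, and `n_components=3` are assumptions, and the truncated SVD is computed with `numpy.linalg.svd` for brevity (for large sparse matrices, `sklearn.decomposition.TruncatedSVD` with `fit_transform` and `inverse_transform` would be the more usual choice).

```python
import numpy as np


def tsvd_impute(x: np.ndarray, n_components: int) -> np.ndarray:
    """Fill the zero entries of x from a rank-limited reconstruction of x."""
    # Step 1: truncated SVD, keeping only the leading n_components components.
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    reduced = u[:, :n_components] * s[:n_components]
    # Step 2: map the reduced representation back to the original space.
    reconstructed = reduced @ vt[:n_components]
    # Step 3: keep non-zero values as they are, fill zeros from the reconstruction.
    return np.where(x == 0, reconstructed, x)


# Toy sparse target matrix (cells x targets); the values are illustrative only.
rng = np.random.default_rng(0)
targets = rng.poisson(0.5, size=(20, 10)).astype(float)
imputed = tsvd_impute(targets, n_components=3)
```

Only the zero entries change; every non-zero value of the original matrix is preserved.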

Model

Output Postprocessing and Loss

In the inference phase, the model outputs the average of the five predicted targets.

CITEseq

Model Overview

Input Preprocessing

To select important genes in CITEseq, the correlation coefficient is calculated for each batch, and only genes that show high correlation in many batches are selected.
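A minimal sketch of this per-batch selection is shown below. It assumes the correlation is the Pearson correlation between each gene and a single target; the function name `select_genes`, the threshold `corr_threshold`, and the batch count `min_batches` are hypothetical and do not come from the repository.

```python
import numpy as np


def select_genes(x, y, batches, corr_threshold=0.3, min_batches=2):
    """Return indices of genes whose |Pearson r| with y is high in enough batches.

    x: (cells, genes) expression, y: (cells,) one target, batches: (cells,) labels.
    corr_threshold and min_batches are hypothetical values, not from the source.
    """
    hits = np.zeros(x.shape[1], dtype=int)
    for b in np.unique(batches):
        mask = batches == b
        xb = x[mask] - x[mask].mean(axis=0)
        yb = y[mask] - y[mask].mean()
        denom = np.sqrt((xb ** 2).sum(axis=0) * (yb ** 2).sum())
        # Genes that are constant within a batch get a correlation of 0.
        corr = np.divide(xb.T @ yb, denom, out=np.zeros(x.shape[1]), where=denom > 0)
        # Count the batches in which each gene passes the threshold.
        hits += np.abs(corr) >= corr_threshold
    return np.flatnonzero(hits >= min_batches)


# Toy data: three batches of 50 cells, five genes, target driven by gene 0.
rng = np.random.default_rng(0)
batches = np.repeat(np.array(["d1_day2", "d1_day3", "d2_day2"]), 50)
x = rng.normal(size=(150, 5))
y = x[:, 0] + 0.1 * rng.normal(size=150)
selected = select_genes(x, y, batches, corr_threshold=0.8, min_batches=3)
```

Requiring a high correlation in several batches, rather than on the pooled data, favours genes whose relationship to the target is stable across donors and days.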

Genes were selected from those related to the target proteins and their pathways.

I used Reactome as the pathway database.

Target Preprocessing

Model

Output Postprocessing and Loss

In the inference phase, the model outputs the average of the five predicted targets.

Local evaluation

I used two evaluation schemes.

  1. Evaluation with cross validation:
    • 5-fold cross validation grouped by donor and day
  2. Evaluation for hyperparameter optimization with Optuna:
    • The training data set is split into a training set (80%) and a validation set (20%).
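Both schemes can be sketched with scikit-learn as follows. The toy donor and day values are illustrative, and the source does not say whether the 80/20 split is random or grouped; a plain random split is shown here as an assumption. The Optuna objective itself is omitted.

```python
import numpy as np
from sklearn.model_selection import GroupKFold, train_test_split

# Toy metadata: each cell belongs to one (donor, day) batch. Values are illustrative.
donors = np.repeat([1, 2, 3], 40)
days = np.tile(np.repeat([2, 3, 4, 7], 10), 3)
groups = np.array([f"{d}_{t}" for d, t in zip(donors, days)])
x = np.arange(len(groups)).reshape(-1, 1)

# Scheme 1: 5-fold CV in which a (donor, day) batch never spans train and validation.
folds = list(GroupKFold(n_splits=5).split(x, groups=groups))

# Scheme 2: a single 80/20 hold-out split for hyperparameter search
# (random split assumed; random_state fixed only for reproducibility).
train_idx, valid_idx = train_test_split(
    np.arange(len(x)), test_size=0.2, random_state=0
)
```

Grouping by donor and day keeps all cells of a batch on the same side of each split, so the validation score reflects generalization to unseen batches.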

Ensemble

I used the weighted average of predictions of the following models.

  1. Models trained with different random seeds
  2. Models fine-tuned on only some batches
    • Example batch subsets: males only, females only, Days 4 and 7 only, etc.
    • A model trained on the full training data set serves as the pre-trained model
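A weighted average of model predictions can be sketched as below. The function name `weighted_average`, the toy predictions, and the weights are hypothetical; the weights actually used in the solution are not given here.

```python
import numpy as np


def weighted_average(predictions, weights):
    """Blend model predictions of identical shape using normalised weights."""
    preds = np.stack(predictions)            # (n_models, cells, targets)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalise so the weights sum to 1
    return np.tensordot(w, preds, axes=1)    # (cells, targets)


# Toy predictions from two models (for example, two seeds); values are illustrative.
pred_seed0 = np.array([[1.0, 2.0], [3.0, 4.0]])
pred_seed1 = np.array([[3.0, 2.0], [1.0, 0.0]])
blended = weighted_average([pred_seed0, pred_seed1], weights=[3, 1])
```

Normalising the weights keeps the blended prediction on the same scale as the individual predictions.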

Development setup

Download resources

res_dir=src/shuji_suzuki/resources
mkdir -p "$res_dir"
wget https://ftp.ebi.ac.uk/pub/databases/genenames/hgnc/tsv/hgnc_complete_set.txt -O "$res_dir/hgnc_complete_set.txt"
wget https://reactome.org/download/current/ReactomePathways.gmt.zip -O "$res_dir/ReactomePathways.gmt.zip" &&
  unzip "$res_dir/ReactomePathways.gmt.zip" -d "$res_dir" && 
  rm "$res_dir/ReactomePathways.gmt.zip"

Clone repo

echo shu65_openproblems > src/shuji_suzuki/.gitignore
git clone https://github.com/shu65/open-problems-multimodal.git src/shuji_suzuki/shu65_openproblems

Executing the method

Run method

viash run src/shuji_suzuki/config.vsh.yaml -- \
  --input sample_data \
  --output output \
  ---memory 100GB \
  ---cpus 30