The scripts use RNA-seq data from the ARCHS4 database. The generated dataset is available here.

Download the following files, which are needed to generate the inputs and labels for the model, into the ./data folder:

wget "https://s3.amazonaws.com/mssm-seq-matrix/human_matrix.h5"

wget --timestamping 'ftp://hgdownload.cse.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz'

gunzip hg38.fa.gz
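
Since both downloads are large, it may be worth confirming that they completed before going further. The following is a minimal sketch, not part of the repository: `check_inputs` is a hypothetical helper, and it assumes the files keep the names produced by the commands above (`human_matrix.h5` and the unzipped `hg38.fa`) and live directly in the data folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report which of the
# expected input files are missing or empty in a data directory.
# Assumption: file names match those produced by the download commands above.
check_inputs() {
    local data_dir="${1:-./data}"
    local missing=0
    local f
    for f in human_matrix.h5 hg38.fa; do
        # -s is true only if the file exists and has a size greater than zero
        if [ ! -s "${data_dir}/${f}" ]; then
            echo "missing or empty: ${data_dir}/${f}"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise
    return "${missing}"
}
```

Calling `check_inputs ./data` prints one line per absent file and returns a non-zero status if any is missing, so it can gate later steps in a shell session. It only checks presence and non-zero size, not file integrity.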

Run the main bash script to generate the inputs and labels:

generate_inputsLabels_trainTest.sh