[Source] CNAPE: A Software for Copy Number Alteration Prediction from Gene Expression in Human Cancers

WangLabHKUST/CNAPE

CNAPE

Copy Number Alteration Prediction from gene Expression in human cancers

Ownership

Wang Lab at HKUST

Status

Active development

Introduction

Copy number alterations (CNAs) are important features of human cancer. The standard methods for CNA detection (CGH arrays, SNP arrays, DNA sequencing) rely on DNA, but DNA data are sometimes unavailable in cancer studies (e.g. biopsies, legacy data). CNAPE addresses this gap by predicting CNAs from RNA-seq gene expression data.

How to run

1. Installation

Before installing CNAPE, make sure that R is installed and that Rscript is available on your system path ($PATH).
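One way to verify this prerequisite is sketched below. The helper name check_tool is hypothetical and not part of CNAPE; it only reports whether a command can be found through $PATH.

```shell
# Hypothetical helper (not part of CNAPE): report whether a command is on $PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Print a hint when Rscript cannot be found.
check_tool Rscript || echo "Install R and add Rscript to your PATH before running CNAPE."
```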

A simple clone of the repository is enough for installation, since the necessary packages will be installed automatically when you run CNAPE.

git clone https://github.com/WangLabHKUST/CNAPE

2. Preparing the input files

CNAPE takes the gene expression matrix of the human cancer samples as input. RNA-seq data can be processed with TCGA's RNA-seq processing pipeline (i.e., reads aligned to the human genome using MapSplice, and expression quantified and normalized using RSEM against UCSC genes).

An example input file demonstrating the format of the input gene expression matrix can be found in the example/ folder.

3. Running CNAPE

The main function of CNAPE is packaged in cnape.R. Prepare your gene expression matrix, then run:

Rscript cnape.R expressionMatrix outputPrefix

CNAPE writes two output files, prefix.chromosome_level.cna.txt and prefix.arm_level.cna.txt, where prefix is the outputPrefix argument. In both files, 1 means amplified, -1 means deleted, and 0 means no copy number change.
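To get a quick overview of a result file, the calls can be tallied with awk. This is only a sketch: the function name summarize_cna is hypothetical, and the layout it expects (tab-delimited text, one header row, identifiers in the first column) is an assumption that should be checked against your actual output.

```shell
# Hypothetical helper (not part of CNAPE): count 1 / -1 / 0 calls in an output file.
# ASSUMPTION: tab-delimited text, one header row, identifiers in column 1.
summarize_cna() {
    awk -F'\t' '
        NR > 1 {
            for (i = 2; i <= NF; i++) {
                if ($i == 1) gain++
                else if ($i == -1) loss++
                else if ($i == 0) none++
            }
        }
        END { printf "amplified=%d deleted=%d neutral=%d\n", gain, loss, none }
    ' "$1"
}

# Example call; the file name assumes "prefix" was passed as the output prefix.
if [ -f prefix.arm_level.cna.txt ]; then
    summarize_cna prefix.arm_level.cna.txt
fi
```

Cells that are empty or non-numeric are ignored, so the three counts may not add up to the total number of cells.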

4. Examples

Large-scale CNAs

For chromosome- and arm-level CNAs, models trained on TCGA pan-cancer data are available. After cloning CNAPE, go to the CNAPE folder and run:

./run_example.sh

Your result files, named example.chromosome_level.cna.txt and example.arm_level.cna.txt, should appear in the example folder. You can compare the results with the provided example.chromosome_level.cna.origional.txt and example.arm_level.cna.origional.txt.
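The comparison can be done by hand or scripted. The sketch below uses cmp for a byte-for-byte check; the helper name compare_results is hypothetical, and the loop assumes it is run from the CNAPE folder with the file names given above.

```shell
# Hypothetical helper (not part of CNAPE): byte-for-byte comparison of two files.
compare_results() {
    if cmp -s "$1" "$2"; then
        echo "identical: $1"
    else
        echo "differs: $1"
        return 1
    fi
}

# Compare each example result with the reference file shipped in example/.
for level in chromosome_level arm_level; do
    new="example/example.$level.cna.txt"
    ref="example/example.$level.cna.origional.txt"
    if [ -f "$new" ] && [ -f "$ref" ]; then
        compare_results "$new" "$ref" || true
    fi
done
```

A "differs" result is not necessarily an error, since a byte-level check also flags harmless differences such as line endings; running diff on the two files shows where they diverge.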

Gene-level CNAs

A more detailed example of gene-level CNA prediction is provided, using the open-access TCGA pan-glioma data. It shows how the models are formulated and trained, how they perform in testing, and how to extract the feature genes from the models.

Dependencies

The models are trained on the TCGA Pan-Cancer Atlas data, using the glmnet package in R. Dependencies are resolved automatically when the program runs.

Contact

For technical issues please send an email to [email protected] or [email protected].
