lingcheng99/breast-cancer-gene-signature

breast-cancer-gene-signature


The goal of this study is to predict the clinical outcome of breast cancer patients from microarray gene expression profiles. Breast cancer patients at the same stage of disease can have markedly different outcomes. Traditional predictors such as lymph node status, tumor size, and tumor grade suffer from both high false-positive and high false-negative rates. Microarrays can capture a snapshot of the expression of thousands of genes in each individual. Combined with machine learning algorithms, microarray analysis can identify molecular subtypes associated with different clinical outcomes, and has been shown to improve the diagnosis and risk stratification of many types of cancer. This study, using gene expression profiles of ~25,000 genes from 337 breast cancer patients, applied supervised classification and identified a 70-gene signature associated with poor prognosis (i.e. metastases within five years). This gene signature outperformed the traditional clinical parameters in predicting disease outcome.

The dataset comes from the breast cancer study published by van't Veer et al. in 2002 and is available in Bioconductor. During data exploration, I examined the association between disease outcome and estrogen receptor status, BRCA mutation, lymph node status, and tumor grade, using statistical tests and ggplot. For modeling, I used random forest and a support vector machine (SVM). Based on the ROC curves, the SVM clearly outperformed the random forest.
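
As a rough illustration of what an ROC-based comparison between two classifiers involves, here is a minimal sketch in plain Python. It is not the code used in this repository. The labels and scores are made-up toy values rather than results from the dataset, and the names `roc_points`, `auc`, `scores_a`, and `scores_b` are hypothetical.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR) obtained by sweeping a threshold over the scores."""
    pos = sum(labels)            # number of positive samples (label 1)
    neg = len(labels) - pos      # number of negative samples (label 0)
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (assumption): 1 = poor prognosis, 0 = good prognosis.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted scores from two models on the same samples.
scores_a = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
scores_b = [0.9, 0.4, 0.7, 0.3, 0.8, 0.2, 0.6, 0.1]

auc_a = auc(roc_points(labels, scores_a))
auc_b = auc(roc_points(labels, scores_b))
print(f"model A AUC = {auc_a:.4f}, model B AUC = {auc_b:.4f}")
```

On real data the scores would come from each fitted model's predictions on held-out samples. The model whose curve lies closer to the top-left corner, and so has the larger area under the curve, separates the two outcome groups better.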
