This is a Python implementation of the CRISPRcleanR package for unsupervised identification and correction of gene-independent cell responses to CRISPR-Cas9 targeting.
It uses the DNAcopy R package to perform CBS (Circular Binary Segmentation) of count data.
pyCRISPRcleanR has multiple commands, listed with pyCRISPRcleanR --help.
It takes the input count data, library file and other associated files/parameters. The output is tab-separated files for normalised fold changes and inverse-transformed corrected treatment counts.
Various exceptions can occur for malformed input files.
gRNA counts file: tab-separated file containing the following fields: sgRNA, gene, <control_count 1..N>, <sample_count 1..N>
sgRNA library file format: sgRNA, gene, chr, start, end
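The two input layouts above can be checked before a run with a small parser. The following is a minimal sketch using only the standard library, not the package's own reader. It assumes that both files have a header row, that the library file is tab-separated like the counts file, that counts are integers, and that the caller knows how many control columns there are. The names `read_counts`, `read_library` and `n_controls` are hypothetical, and the example rows are made up.

```python
import csv
import io


def read_counts(handle, n_controls):
    """Parse a tab-separated counts table into a list of per-guide dicts.

    Assumed layout: header row, then sgRNA, gene, control counts, sample counts.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Need sgRNA, gene, the control columns and at least one sample column.
    if len(header) < 2 + n_controls + 1:
        raise ValueError("expected sgRNA, gene, control and sample columns")
    rows = []
    for line_no, fields in enumerate(reader, start=2):
        if len(fields) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(fields)}"
            )
        counts = [int(value) for value in fields[2:]]
        rows.append({
            "sgRNA": fields[0],
            "gene": fields[1],
            "controls": counts[:n_controls],
            "samples": counts[n_controls:],
        })
    return rows


def read_library(handle):
    """Parse a library table (assumed tab-separated) keyed by sgRNA identifier."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)  # skip the assumed header row
    library = {}
    for fields in reader:
        sgrna, gene, chrom, start, end = fields[:5]
        library[sgrna] = {
            "gene": gene,
            "chr": chrom,
            "start": int(start),
            "end": int(end),
        }
    return library


# Tiny made-up example data, not taken from a real screen.
COUNTS_TEXT = "sgRNA\tgene\tcontrol_1\tsample_1\tsample_2\nguideA\tGENE1\t100\t40\t60\n"
LIBRARY_TEXT = "sgRNA\tgene\tchr\tstart\tend\nguideA\tGENE1\t1\t1000\t1019\n"

counts = read_counts(io.StringIO(COUNTS_TEXT), n_controls=1)
library = read_library(io.StringIO(LIBRARY_TEXT))
# Guides present in the counts file but absent from the library.
missing = [row["sgRNA"] for row in counts if row["sgRNA"] not in library]
print(counts[0]["samples"], missing)
```

A check like this surfaces rows with the wrong number of fields or non-numeric counts early, with a line number, rather than as an exception part-way through an analysis.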
The results.html file is generated in the user-supplied output folder.
This file contains a short description of, and links to, all the result files/folders generated during an analysis workflow.
[Please note: the number prefixes on file names follow the order in which the script generates the files and help with grouping similar files]:
- 01_normalised_counts.tsv
  - sgRNA: guide RNA
  - gene: gene name as defined in the library file
  - <control sample count: normalised 1..N>: normalised count
  - <treatment sample count: normalised 1..N>: normalised count
- 02_normalised_fold_changes.tsv
  - sgRNA: guide RNA
  - gene: gene name as defined in the library file
  - <treatment sample fold changes: fold changes 1..N>
  - avgFC: average fold change values
- 03_crispr_cleanr_corrected_counts.tsv [generated only when the --crispr_cleanr flag is set]
  - sgRNA: guide RNA
  - gene: gene name as defined in the library file
  - <control sample count: corrected 1..N>: corrected count
  - <treatment sample count: corrected 1..N>: corrected count
- 04_crispr_cleanr_fold_changes.tsv [generated only when the --crispr_cleanr flag is set]
  - sgRNA: guide RNA
  - gene: gene name as defined in the library file
  - <treatment sample fold changes: fold changes 1..N>
  - avgFC: average fold change values
- 05_alldata.tsv [generated only when the --crispr_cleanr flag is set]
  - sgRNA: guide RNA
  - <control sample count: raw 1..N>: raw count
  - <treatment sample count: raw 1..N>: raw count
  - gene: gene name as defined in the library file
  - chr: chromosome name
  - start: gRNA start position
  - end: gRNA end position
  - <control sample count: normalised 1..N>: normalised count (postfixed _nc)
  - <treatment sample count: normalised 1..N>: normalised count (postfixed _nc)
  - avgFC: average fold change values
  - BP: base pair location (used for DNAcopy analysis)
  - correction: correction factor
  - correctedFC: corrected fold change values
  - <control sample count: corrected 1..N>: corrected count (postfixed _cc)
  - <treatment sample count: corrected 1..N>: corrected count (postfixed _cc)
  - <treatment sample fold changes: fold changes 1..N> (postfixed _cf)
  - avgFC_cf: average fold change values based on corrected counts
- mageckOut [generated only when the --run_mageck flag is set; folder containing MAGeCK output for normalised and/or CRISPRcleanR corrected counts]
- bagelOut [generated only when the --run_bagel flag is set; folder containing BAGEL output for normalised and/or CRISPRcleanR corrected counts]
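Because several result files only appear when a flag is set, it can help to spell out which names to expect for a given run. The sketch below maps the three flags described above to the listed file and folder names and reports any that are missing. It assumes the files and folders sit directly inside the output folder, which the list above does not state. `expected_outputs` and `missing_outputs` are hypothetical helper names, not part of pyCRISPRcleanR.

```python
from pathlib import Path
import tempfile


def expected_outputs(crispr_cleanr=False, run_mageck=False, run_bagel=False):
    """Return the result names listed above for a given combination of flags."""
    names = ["01_normalised_counts.tsv", "02_normalised_fold_changes.tsv"]
    if crispr_cleanr:
        names += [
            "03_crispr_cleanr_corrected_counts.tsv",
            "04_crispr_cleanr_fold_changes.tsv",
            "05_alldata.tsv",
        ]
    if run_mageck:
        names.append("mageckOut")
    if run_bagel:
        names.append("bagelOut")
    return names


def missing_outputs(out_dir, **flags):
    """List expected names that do not exist directly under out_dir (assumed layout)."""
    out_dir = Path(out_dir)
    return [name for name in expected_outputs(**flags) if not (out_dir / name).exists()]


# Demonstration on a temporary folder holding only the two always-present files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "01_normalised_counts.tsv").touch()
    (Path(tmp) / "02_normalised_fold_changes.tsv").touch()
    demo_missing = missing_outputs(tmp, crispr_cleanr=True)
print(demo_missing)
```

In the demonstration the three --crispr_cleanr files are reported as missing, since only the normalised outputs were created in the temporary folder.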
Plotly and PDF plots
- plots based on raw sgRNA counts
  - 01_raw_counts_boxplot.html
  - 01_raw_counts_histogram.html
  - 01_raw_counts_correlation_matrix.html
- plots based on normalised sgRNA counts
  - 02_normalised_counts_boxplot.html
  - 02_normalised_counts_histogram.html
  - 02_normalised_counts_correlation_matrix.html
- plots based on fold changes
  - 03_fold_changes_boxplot.html
  - 03_fold_changes_histogram.html
  - 03_fold_changes_correlation_matrix.html
- stats plots: precision-recall and ROC curves based on a known true positive sgRNA/gene set [generated only when the --gene_signatures flag is set]
  - 04_pr_rc_curve_sgRNA.html
  - 04_roc_curve_sgRNA.html
  - 05_pr_rc_curve_gene.html
  - 05_roc_curve_gene.html
  - 06_depletion_profile_genes.html
- plots based on CRISPRcleanR corrected counts
  - 07_CRISPRcleanR_corrected_count_boxplot.html
  - 07_CRISPRcleanR_corrected_count_histogram.html
  - 07_CRISPRcleanR_corrected_count_correlation_matrix.html
- plots based on CRISPRcleanR corrected fold changes
  - 08_CRISPRcleanR_corrected_fold_changes_boxplot.html
  - 08_CRISPRcleanR_corrected_fold_changes_histogram.html
  - 08_CRISPRcleanR_corrected_fold_changes_correlation_matrix.html
- 09_Raw_vs_postCRISPRcleanR_segmentation_fold_changes.pdf [generated only when the --crispr_cleanr flag is set]
- Other informative plots
  - 10_density_plots_pre_and_post_CRISPRcleanR.html [generated only when the --crispr_cleanr flag is set]
  - 11_impact_on_phenotype_barchart.html [generated only when the --run_mageck flag is set]
  - 11_impact_on_phenotype_piechart.html [generated only when the --run_mageck flag is set]
Install via pip install by passing the path to the compiled .whl file found on the release page:
pip install pyCRISPRcleanR.X.X.X-py3-none-any.whl
Release .whl files are generated as part of the release process and can be found on the release page.
pip will install the relevant dependencies; please refer to requirements.txt for the packages and versions.
- The DNAcopy R package is required to run pyCRISPRcleanR. To facilitate the install process there is a script, Rsupport/libInstall.R, that can be run to build this for you.
Alternatively you can run:
cd Rsupport
./setupR.sh path_to_install_to
Append 1 to the command to request a complete local build of R (3.3.0).
This project uses git pre-commit hooks. As these will execute on your system it is entirely up to you if you activate them.
If you want tests, coverage reports and linting to execute automatically before a commit, you can activate them by running:
git config core.hooksPath git-hooks
Only a test failure will block a commit; linting is not enforced (but please consider following the guidance).
You can run the same checks manually without a commit by executing the following in the base of the clone:
./run_tests.sh
cd $PROJECTROOT
hash virtualenv || pip3 install virtualenv
virtualenv -p python3 env
source env/bin/activate
python setup.py develop # so bin scripts can find module
For testing/coverage (./run_tests.sh):
source env/bin/activate # if not already in env
pip install pytest
pip install pytest-cov
Also see Package Dependencies.
Make sure the version is incremented in ./setup.py
Generate .whl
source env/bin/activate # if not already
python setup.py bdist_wheel -d dist
Install .whl
# this creates a wheel archive which can be copied to a deployment location, e.g.
scp dist/pyCRISPRcleanR.X.X.X-py3-none-any.whl user@host:~/wheels
# on host
pip install --find-links=~/wheels pyCRISPRcleanR
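After installing the wheel, a quick way to confirm that the active environment picked it up is to ask Python for the installed distribution version. This is a minimal sketch assuming Python 3.8 or later (for `importlib.metadata`) and that the distribution is registered under the name `pyCRISPRcleanR`, as in the pip command above. `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None when the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pyCRISPRcleanR")
if version is None:
    print("pyCRISPRcleanR is not installed in this environment")
else:
    print(f"pyCRISPRcleanR {version} is installed")
```

Comparing the reported version with the one set in ./setup.py is a simple way to check that the intended wheel was deployed.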
Iorio F, Behan FM, Gonçalves E, Bhosle SG, Chen E, Shepherd R, Beaver C, Ansari R, Pooley R, Wilkinson P, Harper S, Butler AP, Stronach EA, Saez-Rodriguez J, Yusa K, Garnett MJ. Unsupervised correction of gene-independent cell responses to CRISPR-Cas9 targeting. BMC Genomics. 2018 Aug 13;19(1):604. doi: 10.1186/s12864-018-4989-y.