pipelines_for_ChIP-seq_analysis

Pipelines for ChIP-seq analysis, such as peak calling, differential enrichment detection, and pausing index calculation for PolII.

Overview

These are the scripts I use to analyze ChIP-seq data after preprocessing. The PROJECT and DATA folders are therefore the same as in the ChIP-seq preprocess pipeline.

The pipelines currently include:

- peak calling:
  - MACS2
  - HOMER
- differential enrichment detection:
  - diffReps
- pausing index of PolII ChIP-seq.

Requirement

bedtools.
MACS2.
HOMER. Remember to install the genome annotation needed for your analysis.
diffReps.
region_analysis.
samtools.
ggplot2. An R graphics package.
ChIPpeakAnno. A Bioconductor package used for annotation and GO analysis.

Install these tools and packages, and make sure the command-line tools are in $PATH.
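
A quick way to confirm that the command-line tools are reachable is to probe $PATH before starting a run. The helper below is a sketch and not part of this repository; the executable names in the commented example (macs2, findPeaks for HOMER, diffReps.pl, Rscript) are assumptions about how each tool is usually installed and may differ on your system.

```shell
# check_tools: print every command from the argument list that is not on $PATH.
# Returns 0 if all commands were found, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (executable names are assumptions, adjust to your installation):
# check_tools bedtools samtools macs2 findPeaks diffReps.pl Rscript
```

The R packages (ggplot2, ChIPpeakAnno) are not covered by this check, since they are loaded from within R rather than found through $PATH.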

Installation

Put all scripts from the bin folders somewhere in $PATH, or add these folders to $PATH.
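
As an illustration, the lines below append the top-level bin folder to $PATH. REPO is a hypothetical clone location, so change it to wherever the repository actually lives; the same pattern can be repeated for any other bin folder.

```shell
# REPO is a hypothetical clone location; change it to where the repository lives.
REPO="$HOME/pipelines_for_ChIP-seq_analysis"

# Append the bin folder to $PATH only if it is not already listed.
case ":$PATH:" in
    *":$REPO/bin:"*) ;;
    *) PATH="$PATH:$REPO/bin"; export PATH ;;
esac
```

Adding these lines to a shell startup file such as ~/.bashrc makes the change persistent.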

Usage

Generally, all these pipelines can be run this way:

nohup ./A_do.sh &

All parameters and options used in a project can be edited in A_do.sh to fit your needs before running. The location of the files from the project_script folder does not matter, but I prefer to put them under the project/script/pipeline_name folder.
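
When its output would otherwise go to a terminal, nohup appends it to nohup.out, so it can help to send each run to its own log file. The helper below is a sketch rather than part of the repository; run_in_background and the log file name are illustrative.

```shell
# run_in_background: start a script under nohup, write stdout and stderr to a
# log file, and print the PID of the background job.
run_in_background() {
    local script="$1" log="$2"
    nohup "$script" >"$log" 2>&1 &
    echo "$!"
}

# Example (assumes A_do.sh is in the current folder and is executable):
# run_in_background ./A_do.sh A_do.log
```

The printed PID can be used later to check on the job or to stop it.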

For project organization, I generally follow the paper "A Quick Guide to Organizing Computational Biology Projects". Accordingly, $DATA is the folder that contains the *.bam alignment files, and $RESULT is the folder where results are written.

Important:

To make comparisons between two conditions work, please name the BAM files as follows:

Say there are two conditions, A and B, each with 2 replicates and one DNA input per condition. Name the files A_rep1.bam, A_rep2.bam, A_input.bam, B_rep1.bam, B_rep2.bam, and B_input.bam. The key point is that samples from the same condition share common letters in their names (here the A or B prefix), and that input samples contain the string "input" or "Input". If you ran the preprocess pipeline with this naming, you only need to edit the configuration in A_do.sh.
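
The naming rule can be illustrated with a small helper that splits a file name into its condition and role. This is only a sketch of the convention, not the matching logic used by the pipeline scripts; classify_bam is a hypothetical name, and treating everything before the first underscore as the condition is an assumption.

```shell
# classify_bam: print "<condition><TAB><role>" for a BAM file name.
# Assumption: the condition is the text before the first underscore.
# role is "input" if the name contains "input" or "Input", otherwise "chip".
classify_bam() {
    local name condition role
    name=$(basename "$1" .bam)
    condition=${name%%_*}
    case "$name" in
        *input*|*Input*) role=input ;;
        *)               role=chip ;;
    esac
    printf '%s\t%s\n' "$condition" "$role"
}

for f in A_rep1.bam A_rep2.bam A_input.bam B_rep1.bam B_rep2.bam B_input.bam; do
    classify_bam "$f"
done
```

With the six example files, the loop labels the four replicate files as chip and the two input files as input, grouped under conditions A and B.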

Peak calling

There are two peak calling methods used in this pipeline: MACS2 and HOMER.

HOMER

Pay attention to $STYLE: remember to set it to the value that matches your data.
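
To show where the style value ends up, the sketch below prints a HOMER findPeaks command for one ChIP tag directory and its input. It assumes $STYLE is passed to the -style option of findPeaks; homer_cmd and the tag directory names are hypothetical, and only the two most common styles are accepted here, although HOMER supports others.

```shell
# homer_cmd: print the findPeaks command for a ChIP tag directory and its input.
# Arguments: style chip_tag_dir input_tag_dir
homer_cmd() {
    local style="$1" chip_tags="$2" input_tags="$3"
    case "$style" in
        factor|histone) ;;
        *) echo "unsupported STYLE in this sketch: $style" >&2; return 1 ;;
    esac
    echo "findPeaks $chip_tags -style $style -i $input_tags -o auto"
}

STYLE=factor   # transcription factor ChIP; "histone" suits broad histone marks
homer_cmd "$STYLE" A_rep1_tags A_input_tags
```

The tag directories would be created beforehand with makeTagDirectory from the BAM files.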

MACS2

If estimation of the shiftsize fails, you can use the estimate from the PhantomPeak step of preprocessing instead.
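
The fallback can be sketched as a MACS2 call that skips model building and uses a fixed fragment length. This is an illustration, not the command used in A_do.sh: macs2_cmd is a hypothetical helper, FRAGLEN is a placeholder value, and -g mm assumes a mouse genome. Recent MACS2 releases take --extsize (the full fragment length), while older releases used --shiftsize (half the fragment length), so check which one your version expects.

```shell
# macs2_cmd: print a macs2 callpeak command that bypasses model building
# and extends reads to a fixed fragment length.
# Arguments: chip_bam input_bam output_name fragment_length
macs2_cmd() {
    local chip="$1" input="$2" name="$3" fraglen="$4"
    echo "macs2 callpeak -t $chip -c $input -f BAM -g mm -n $name --nomodel --extsize $fraglen"
}

FRAGLEN=200   # placeholder; use the fragment length reported by PhantomPeak
macs2_cmd A_rep1.bam A_input.bam A_rep1 "$FRAGLEN"
```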

Differential enrichment detection

diffReps

Parameters can be adjusted to set the cutoffs used in the calculation. Intermediate files are kept; you may remove them once processing has finished.

Pausing index

This is an experimental pipeline, and the mouse genome annotation is "borrowed" from our ngs.plot project. Users may substitute the genome they need; getBEDAnoo.R can be used to generate the BED files.
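
For orientation, a pausing index is commonly computed as the PolII read density near the promoter divided by the read density over the gene body. The sketch below shows only that arithmetic; the window definitions used by this pipeline are not described here, so the function name, the window lengths, and the read counts are all hypothetical.

```shell
# pausing_index: promoter read density divided by gene-body read density.
# Arguments: promoter_reads promoter_length_bp body_reads body_length_bp
# Prints "NA" when the ratio is undefined.
pausing_index() {
    awk -v pr="$1" -v pl="$2" -v br="$3" -v bl="$4" 'BEGIN {
        if (br == 0 || pl == 0 || bl == 0) { print "NA"; exit }
        printf "%.2f\n", (pr / pl) / (br / bl)
    }'
}

# Hypothetical gene: 120 reads in a 350 bp promoter window,
# 400 reads in a 10000 bp gene body.
pausing_index 120 350 400 10000   # prints 8.57
```

In practice the read counts per window could come from bedtools, with the windows taken from the BED files mentioned above.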
