How to exclude chromosomes/small contigs #8

Open
mirax87 opened this issue Apr 3, 2020 · 2 comments

mirax87 commented Apr 3, 2020

Hi Mahmoud,

Currently, I am testing JAMM on GRO-seq data (SE protocol, Pol2 ChIP-like pattern) in dm6. This genome version contains a lot of small contigs, which I'd like to exclude from the analysis. Thus, my question: how can I tell JAMM to ignore chromosomes, or which data do I need to filter (BED, genome_file)?

Cheers,
-Michael

PS: Sorry if I missed this in the documentation.

@mahmoudibrahim
Owner

Hi Michael

JAMM only considers the chromosomes listed in the chromosome size file (-g); reads on any other chromosome are ignored. So you can simply keep only the chromosomes you want to analyze in this file, and JAMM will ignore the rest.
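For illustration, here is a minimal Python sketch of trimming a chromosome size file down to the main chromosomes before passing it to -g. It assumes the file is whitespace-delimited with the chromosome name in the first column and the size in the second. The function name `filter_chrom_sizes`, the `KEEP` set of dm6 chromosome names, and the file names in the usage comment are assumptions made for this example, not part of JAMM.

```python
# Assumed set of main dm6 chromosomes to keep; adjust to your own needs.
KEEP = {"chr2L", "chr2R", "chr3L", "chr3R", "chr4", "chrX", "chrY"}


def filter_chrom_sizes(src, dst, keep=KEEP):
    """Copy only the lines of a chromosome size file whose name is in `keep`.

    Assumes one chromosome per line: name, whitespace, size.
    Returns the list of chromosome names that were kept, in file order.
    """
    kept = []
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.split()
            # Skip blank lines and any chromosome/contig not in the keep set.
            if fields and fields[0] in keep:
                fout.write(fields[0] + "\t" + fields[1] + "\n")
                kept.append(fields[0])
    return kept


# Hypothetical usage (file names are placeholders):
# filter_chrom_sizes("dm6.chrom.sizes", "dm6.main.chrom.sizes")
```

The filtered file can then be given to -g in place of the full one. An explicit keep list is used here rather than a name pattern, so that nothing is dropped or retained by accident.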

It's interesting that you are using this to call transcripts for GRO-seq. I would be curious to know what parameters you end up using, if you're willing to share this info. Note that you would need to analyze each strand separately, because JAMM doesn't call peaks in a stranded manner.
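One way to prepare the per-strand input is to split the reads file by strand and run the peak caller once on each half. The sketch below assumes a tab-delimited BED file with the strand (`+` or `-`) in the sixth column. The function name `split_bed_by_strand` and the file names are hypothetical and not part of JAMM.

```python
def split_bed_by_strand(bed_path, plus_path, minus_path):
    """Write plus-strand and minus-strand records of a BED6 file to separate files.

    Assumes tab-delimited records with the strand in column 6.
    Returns a dict with the number of records written per strand.
    """
    counts = {"+": 0, "-": 0}
    with open(bed_path) as fin, \
            open(plus_path, "w") as plus, \
            open(minus_path, "w") as minus:
        out = {"+": plus, "-": minus}
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            # Skip header lines, short records, and records without a strand.
            if len(fields) < 6 or fields[5] not in out:
                continue
            out[fields[5]].write("\t".join(fields) + "\n")
            counts[fields[5]] += 1
    return counts


# Hypothetical usage (file names are placeholders):
# split_bed_by_strand("reads.bed", "reads.plus.bed", "reads.minus.bed")
```

Each output file can then be analyzed in its own run, and the resulting peaks labeled with the strand they came from.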

best wishes
Mahmoud

mirax87 commented Mar 16, 2021

Hi Mahmoud,

it was about time I came back to you. Time flies...

So, first, thanks for the hints: subsetting the -g file really sped up the process to a reasonable time (my test runs were too long ago, so I don't recall whether it took minutes or a few hours, but it was < 3 h). I agree about strand-specific peak calling; it's critical, especially in gene-dense organisms like D. melanogaster.

In my tests, I compared MACS2 and JAMM peak calling, using 'region' for JAMM and narrowPeak for MACS2 callpeak --nomodel. In the end, MACS2 worked slightly better for my purpose (peaks near TSS) because it was more sensitive.

Overall, it was a close call, but I was under the impression that both methods struggle to identify highly transcribed genes, i.e. genes that show high coverage along the entire gene body. That said, I did not explore all the parameters; I guess MACS2 --broad could address that. I'm not sure about JAMM, as I already used JAMM -r region.

I'll keep looking out for further applications of JAMM in the future.

Thanks, sorry for taking so much time, and all the best,
-Michael
