save macs2 fold enrichment signal value in summits #1493

Open: badoi wants to merge 19 commits into base: release_1.0.2

@badoi badoi commented Jun 25, 2022

Currently, the narrowPeak signal value (MACS2 fold enrichment) at the summit isn't saved. This quick fix adds that metric to the replicate summits .rds file for downstream use.

rcorces and others added 18 commits March 4, 2021 07:28
Remove issue reports for feature requests and documentation. These have been moved to discussions
change which file to read, grab from narrowPeak file and get equivalent columns of what otherwise would be the summits file, Summits score is 1/10th of narrowPeak score, based on macs2 documentation. To get summits from narrowPeak, add the summit offset in column 10 to the start and end coordinates.
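The conversion described in the commit message above can be sketched roughly as follows. This is an illustration in Python, not the ArchR code from this PR: the `Summit` type and `summit_from_narrowpeak` function are hypothetical names, the column layout assumes the standard 10-column narrowPeak format (score in column 5, signalValue in column 7, summit offset in column 10), and the division by 10 simply encodes the commit message's claim about the score relationship.

```python
from typing import NamedTuple


class Summit(NamedTuple):
    """Hypothetical container for one summit derived from a narrowPeak row."""
    chrom: str
    start: int           # 0-based summit position
    end: int             # start + 1, i.e. a 1 bp BED-style interval
    name: str
    score: float         # narrowPeak score / 10 (assumption from the commit message)
    signal_value: float  # narrowPeak column 7 (fold enrichment)


def summit_from_narrowpeak(line: str) -> Summit:
    """Derive a 1 bp summit record from a tab-separated narrowPeak line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected a 10-column narrowPeak line")
    chrom = fields[0]
    chrom_start = int(fields[1])
    name = fields[3]
    narrowpeak_score = float(fields[4])
    signal_value = float(fields[6])
    offset = int(fields[9])  # summit offset relative to chromStart
    if offset < 0:
        raise ValueError("no summit offset recorded for this peak")
    summit_start = chrom_start + offset
    return Summit(
        chrom=chrom,
        start=summit_start,
        end=summit_start + 1,
        name=name,
        score=narrowpeak_score / 10,
        signal_value=signal_value,
    )


example = "chr1\t100\t500\tpeak_1\t250\t.\t8.5\t27.1\t25.0\t120"
summit = summit_from_narrowpeak(example)
```

Here the made-up example row yields a summit at position 220 on chr1 that carries the signalValue 8.5 alongside the rescaled score.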
@rcorces rcorces changed the base branch from master to release_1.0.2 June 25, 2022 23:10
@badoi
Author

badoi commented Jun 26, 2022

I finished testing this version of peak calling plus reproducible peak clustering, and it completed without any errors. Grabbing the replicate summits from the MACS2 narrowPeak file and storing the extra signalValue column shouldn't change the underlying peak calling.

rcorces added a commit that referenced this pull request Jun 30, 2022
as implemented in #1493 but putting these commits on `release_1.0.2` instead of master
@rcorces
Collaborator

rcorces commented Jun 30, 2022

Thanks for this suggestion.
I put your commits on a different branch, dev_narrowPeak, which branches from release_1.0.2 instead of master, just to maintain consistency. The one thing I wasn't able to confirm in the MACS2 docs is that the summits score is 1/10th of the narrowPeak score. But I will test this on the tutorial data to confirm.

@badoi
Author

badoi commented Jun 30, 2022

That makes sense. The MACS documentation (https://github.com/macs3-project/MACS/blob/master/docs/callpeak.md#output-files) says they should be the same, but in my files the scores in summits.bed and peaks.narrowPeak differ by a factor of 10. Yes, please check, and thanks for folding this into the next ArchR update when it comes out.
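One way to check the factor-of-10 observation is to pair records from the two files by peak name and look at the score ratios. The sketch below is in Python and only illustrative: `score_ratios` is a hypothetical helper, it assumes both files are tab-separated with the peak name in column 4 and the score in column 5, and it assumes peak names match between the two files. The example rows are made up, not taken from real output.

```python
from typing import Dict, Iterable


def score_ratios(narrowpeak_lines: Iterable[str],
                 summit_lines: Iterable[str]) -> Dict[str, float]:
    """Return narrowPeak score / summits score for each peak name found in both inputs."""
    narrowpeak_scores = {}
    for line in narrowpeak_lines:
        fields = line.rstrip("\n").split("\t")
        narrowpeak_scores[fields[3]] = float(fields[4])

    ratios = {}
    for line in summit_lines:
        fields = line.rstrip("\n").split("\t")
        name, summit_score = fields[3], float(fields[4])
        # Skip names missing from the narrowPeak input and zero scores.
        if name in narrowpeak_scores and summit_score != 0:
            ratios[name] = narrowpeak_scores[name] / summit_score
    return ratios


# Made-up example rows (assumed layouts, see the note above).
narrowpeak_rows = [
    "chr1\t100\t500\tpeak_1\t250\t.\t8.5\t27.1\t25.0\t120",
    "chr1\t900\t1300\tpeak_2\t255\t.\t6.2\t27.6\t25.5\t80",
]
summit_rows = [
    "chr1\t220\t221\tpeak_1\t25.0",
    "chr1\t980\t981\tpeak_2\t25.5",
]
ratios = score_ratios(narrowpeak_rows, summit_rows)
```

If every ratio is close to 10, the two score columns differ only by that scale factor. Ratios that deviate slightly would be consistent with a difference in decimal precision rather than a different metric.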

@rcorces
Collaborator

rcorces commented Jul 6, 2022

@badoi - The scores aren't precisely the same in my hands (same approximate values, different decimal precision). But that doesn't appear to affect the downstream reproducible peak set in any noticeable way. Can you describe the downstream uses that this change enables, just so that I can contextualize it?

@badoi
Author

badoi commented Jul 7, 2022

Thanks Ryan! I'm working on identifying which metrics of open chromatin are comparable across bulk ATAC, pseudo-bulk ATAC, and single-cell aggregated peak accessibility, especially for the ML sequence prediction models. The peakSignalValue is locally normalized, which might make it a better metric than the aggregated peak accessibility matrix, so I'm adding the patch to see if that's true now that I have those metrics from pseudo-bulk peak calls. The typical user might not even notice this metric is there.
