Site2Target #3666

peymanzarrineh · 2024-11-28T14:21:32Z

Update the following URL to point to the GitHub repository of
the package you wish to submit to Bioconductor

Repository: https://github.com/fls-bioinformatics-core/Site2Target

Confirm the following by editing each check box to '[x]'

I understand that by submitting my package to Bioconductor,
the package source and all review commentary are visible to the
general public.
I have read the Bioconductor Package Submission
instructions. My package is consistent with the Bioconductor
Package Guidelines.
I understand Bioconductor Package Naming Policy and acknowledge
Bioconductor may retain use of package name.
I understand that a minimum requirement for package acceptance
is to pass R CMD check and R CMD BiocCheck with no ERROR or WARNINGS.
Passing these checks does not result in automatic acceptance. The
package will then undergo a formal review and recommendations for
acceptance regarding other Bioconductor standards will be addressed.
My package addresses statistical or bioinformatic issues related
to the analysis and comprehension of high throughput genomic data.
I am committed to the long-term maintenance of my package. This
includes monitoring the support site for issues that users may
have, subscribing to the bioc-devel mailing list to stay aware
of developments in the Bioconductor community, responding promptly
to requests for updates from the Core team in response to changes in
R or underlying software.
I am familiar with the Bioconductor code of conduct and
agree to abide by it.

I am familiar with the essential aspects of Bioconductor software
management, including:

The 'devel' branch for new packages and features.
The stable 'release' branch, made available every six
months, for bug fixes.
Bioconductor version control using Git
(optionally via GitHub).

For questions/help about the submission process, including questions about
the output of the automatic reports generated by the SPB (Single Package
Builder), please use the #package-submission channel of our Community Slack.
Follow the link on the home page of the Bioconductor website to sign up.

bioc-issue-bot · 2024-11-28T14:21:36Z

Hi @peymanzarrineh

Thanks for submitting your package. We are taking a quick
look at it and you will hear back from us soon.

The DESCRIPTION file for this package is:

Package: Site2Target
Type: Package
Title: An R package to associate peaks and target genes
Version: 0.99.0
Authors@R: person("Peyman Zarrineh", email="[email protected]", role=c("cre", "aut"), comment = c(ORCID = "0000-0003-4820-4101"))
Description: Statistics implemented for both peak-wise and gene-wise associations. In peak-wise associations, the p-value of the target genes of a given set of peaks are calculated. Negative binomial or Poisson distributions can be used for modeling the unweighted peaks targets and log-nromal can be used to model the weighted peaks. In gene-wise associations a table consisting of a set of genes, mapped to specific peaks, is generated using the given rules.
BugReports: https://github.com/fls-bioinformatics-core/Site2Target/issues
License: GPL-2
Encoding: UTF-8
LazyData: false
Imports: S4Vectors, stats, utils, BiocGenerics, GenomeInfoDb, MASS, IRanges, GenomicRanges
biocViews: Annotation, ChIPSeq, Software, Epigenetics, GeneExpression, GeneTarget
RoxygenNote: 7.3.2
Suggests: BiocStyle, knitr, rmarkdown
VignetteBuilder: knitr

lshep · 2024-12-10T13:50:31Z

When I try to run R CMD build on your package, I get the following ERROR:

R CMD build Site2Target 
* checking for file 'Site2Target/DESCRIPTION' ... OK
* preparing 'Site2Target':
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
--- re-building ‘Site2Target.Rmd’ using rmarkdown
Could not find bibliography file: /Site2Target.bib
Error running filter /usr/bin/pandoc-citeproc:
Filter returned error status 1
Error: processing vignette 'Site2Target.Rmd' failed with diagnostics:
pandoc document conversion failed with error 83
--- failed re-building ‘Site2Target.Rmd’

SUMMARY: processing the following file failed:
  ‘Site2Target.Rmd’

Error: Vignette re-building failed.
Execution halted

peymanzarrineh · 2024-12-10T14:00:36Z

Hello,

I got this error, but I assumed it was caused by the pandoc version. I can render the .html file from the .Rmd file without any problem. Do you know what the problem is?

Thank you,

lshep · 2024-12-10T14:24:55Z

I think you can just do bibliography: Site2Target.bib instead of bibliography: "`r file.path(system.file('vignettes', package = 'Site2Target'), 'Site2Target.bib')`". If that does not work I would ask on the [email protected]

peymanzarrineh · 2024-12-10T22:49:04Z

Thank you very much! That solved the problem. I no longer get any errors or warnings from R CMD check or BiocCheck. My problem now is that I cannot push a new version to Bioconductor. I run:

git remote add origin https://github.com/fls-bioinformatics-core/Site2Target.git
git remote add upstream [email protected]:packages/Site2Target.git
git remote -v
git fetch --all

"error: Could not fetch upstream"

It seems "[email protected]:packages/Site2Target" does not exist. Should I open a new issue and start from the beginning?

lshep · 2024-12-11T12:19:42Z

It is not yet on Bioconductor; this is still in the preview stage. I will check your GitHub repo for the updates soon.

lshep · 2024-12-18T13:48:33Z

I'll move this forward to building, but please correct the following before an in-depth review.

Please don't use exportPattern("^[[:alpha:]]+"); you should be selectively
importing and exporting. Please provide a complete NAMESPACE.

Please also provide an inst/scripts directory that describes how the data
in inst/extdata was generated. It can be code, pseudo-code, or text but
should minimally list any source or licensing information.

peymanzarrineh · 2024-12-18T16:14:48Z

Thank you for your reply. Strangely, for my previous package the exports were taken care of by Roxygen, but not for this one. I will fix it.
My only problem is with the second part, which was not requested for my previous package either. I made small datasets from our publicly available data on GEO (the paper will appear soon; it is almost accepted) and from publicly available HiC data. I just reduced the files to Chr21, which is a very small chromosome, so I do not have much code to share. Can you give me an example of inst/scripts, or maybe a README file that explains the data?

lshep · 2024-12-18T16:18:08Z

If you put the proper import / importFrom / export lines in the documentation, maybe you just need to run devtools::document() to generate the new NAMESPACE?
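As a rough illustration of the selective import/export approach, roxygen2 tags placed above each function can take the place of the exportPattern line. The function countHitsPerGene and its arguments are hypothetical and not taken from Site2Target; only the @importFrom and @export tags are standard roxygen2.

```r
#' Count how many hits fall on each gene (hypothetical example function)
#'
#' @param geneIndex integer vector of gene indices, one per hit
#' @param geneNumber total number of genes
#' @return named integer vector of length geneNumber
#' @importFrom stats setNames
#' @export
countHitsPerGene <- function(geneIndex, geneNumber) {
    # tabulate() counts occurrences of each index in 1..nbins
    counts <- tabulate(geneIndex, nbins = geneNumber)
    setNames(counts, paste0("gene", seq_len(geneNumber)))
}
```

With tags like these, running devtools::document() should write export(countHitsPerGene) and importFrom(stats,setNames) into NAMESPACE, so only documented functions are exported.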

Even just a text file describing what you say here would do, with a reference stating that the data is from GEO, so we know it can be distributed. Ideally, if someone wanted to generate their own, they would know where to get started.
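A minimal sketch of what a script under inst/scripts might look like for the Chr21 reduction described in this thread. The file name, column names, and toy table are assumptions made for illustration; a real script would name the actual GEO accession and the HiC source.

```r
# inst/scripts/make-extdata.R (hypothetical file name)
# Source: <GEO accession and licence go here>; toy data stands in below.

# Toy stand-in for a downloaded peak table (column names are assumed)
peaks <- data.frame(
    chr   = c("chr1", "chr21", "chr21", "chrX"),
    start = c(100L, 200L, 900L, 50L),
    end   = c(150L, 260L, 980L, 90L),
    stringsAsFactors = FALSE
)

# Keep only chromosome 21 so the packaged file stays small
peaksChr21 <- peaks[peaks$chr == "chr21", , drop = FALSE]

# Write the reduced table; in the package this would go to inst/extdata
outFile <- file.path(tempdir(), "peaks_chr21.tsv")
utils::write.table(peaksChr21, file = outFile, sep = "\t",
                   quote = FALSE, row.names = FALSE)
```

Even when the real reduction was done by hand, a short script or text file of this shape records where the data came from and how it was filtered.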

bioc-issue-bot · 2024-12-18T17:29:16Z

Your package has been added to git.bioconductor.org to continue the
pre-review process. A build report will be posted shortly. Please
fix any ERROR and WARNING in the build report before a reviewer is
assigned or provide a justification on why you feel the ERROR or
WARNING should be granted an exception.

IMPORTANT: Please read this documentation for setting
up remotes to push to git.bioconductor.org. All changes should be
pushed to git.bioconductor.org moving forward. It is required to push a
version bump to git.bioconductor.org to trigger a new build report.

Bioconductor utilized your github ssh-keys for git.bioconductor.org
access. To manage keys and future access you may want to activate your
Bioconductor Git Credentials Account

bioc-issue-bot · 2024-12-18T17:34:58Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

On one or more platforms, the build results were: "ERROR".
This may mean there is a problem with the package that you need to fix.
Or it may mean that there is a problem with the build system itself.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 24.04.1 LTS): Site2Target_0.99.1.tar.gz

Links above active for 21 days.

Remember: if you submitted your package after July 7th, 2020,
when making changes to your repository push to
[email protected]:packages/Site2Target to trigger a new build.
A quick tutorial for setting up remotes and pushing to upstream can be found here.

peymanzarrineh · 2024-12-20T17:11:29Z

Thank you! I fixed the issues from your notes and the error, which was the .gitignore issue.
Now I have a problem with pushing the new version.

I ran:
git remote add origin https://github.com/fls-bioinformatics-core/Site2Target.git

git remote add upstream [email protected]:packages/Site2Target.git

git remote -v

git fetch --all

I see:
origin https://github.com/fls-bioinformatics-core/Site2Target.git (fetch)
origin https://github.com/fls-bioinformatics-core/Site2Target.git (push)
upstream [email protected]:packages/Site2Target.git (fetch)
upstream [email protected]:packages/Site2Target.git (push)

This is correct, but the remaining commands do not work:

git merge upstream/mster
git merge origin/main
git push upstream master

Can you help with this? What is the problem, and what should I write?
Thank you

lshep · 2024-12-20T17:15:11Z

Bioconductor does not have a main or master branch; it uses devel as the default branch. Step 5 of the new package workflow explains how to push to a branch that has a different name: http://contributions.bioconductor.org/git-version-control.html#new-package-workflow

peymanzarrineh · 2024-12-21T00:42:43Z

Thank you! But I am still stuck. I think this is different from my previous package. Can you help me?

This is right:

git remote -v
origin https://github.com/fls-bioinformatics-core/Site2Target.git (fetch)
origin https://github.com/fls-bioinformatics-core/Site2Target.git (push)
upstream [email protected]:packages/Site2Target.git (fetch)
upstream [email protected]:packages/Site2Target.git (push)

This seems right:

git fetch --all
Fetching origin
Fetching upstream

From here on, nothing works:

git merge upstream/devel
fatal: refusing to merge unrelated histories

git merge upstream/master
merge: upstream/master - not something we can merge

git push upstream main:devel
error: src refspec main does not match any
error: failed to push some refs to '[email protected]:packages/Site2Target.git'

git push origin main
error: src refspec main does not match any
error: failed to push some refs to 'https://github.com/fls-bioinformatics-core/Site2Target.git'

lshep · 2024-12-23T12:26:42Z

It looks like your default branch is named origin? May I assume the local branch you are on matches that? Check with git branch. If it is called origin, then the command would be git push upstream origin:devel

peymanzarrineh · 2024-12-23T22:23:00Z

Thank you very much! When I go to the Site2Target GitHub repository, it has only one branch, and it is called Origin. Whatever I do, I get an error:

git branch

master

git push upstream origin:devel
error: src refspec origin does not match any
error: failed to push some refs to '[email protected]:packages/Site2Target.git'

git push upstream master:devel
To git.bioconductor.org:packages/Site2Target.git
! [rejected] master -> devel (non-fast-forward)
error: failed to push some refs to '[email protected]:packages/Site2Target.git'
hint: Updates were rejected because a pushed branch tip is behind its remote
hint: counterpart. Check out this branch and integrate the remote changes
hint: (e.g. 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

lshep · 2024-12-24T11:49:42Z

Can you do as it suggests and run git pull and git pull upstream?

peymanzarrineh · 2024-12-24T14:45:02Z

Thank you! I am still stuck.
git remote -v
origin https://github.com/fls-bioinformatics-core/Site2Target.git (fetch)
origin https://github.com/fls-bioinformatics-core/Site2Target.git (push)
upstream [email protected]:packages/Site2Target (fetch)
upstream [email protected]:packages/Site2Target (push)

git push upstream devel
Everything up-to-date

git branch

devel
master

git pull
Already up-to-date.

git pull upstream
Already up-to-date.

git push upstream origin:devel
error: src refspec origin does not match any
error: failed to push some refs to 'git.bioconductor.org:packages/Site2Target'

lshep · 2024-12-24T14:46:30Z

Per your output there, you are on a branch called devel. Whatever branch it says you're on is the one you should use in your call.

peymanzarrineh · 2024-12-24T14:53:36Z

I do not have devel in my local (origin) repository. It has just one branch. The devel one is on the upstream. Something is wrong, and I was trying to correct it by running: git push upstream origin:devel

bioc-issue-bot · 2024-12-24T14:55:37Z

Received a valid push on git.bioconductor.org; starting a build for commit id: 972876c8e5c20524025dde329a406ee29195a181

bioc-issue-bot · 2024-12-24T14:58:16Z

Dear Package contributor,

This is the automated single package builder at bioconductor.org.

Your package has been built on the Bioconductor Single Package Builder.

Congratulations! The package built without errors or warnings
on all platforms.

Please see the build report for more details.

The following are build products from R CMD build on the Single Package Builder:
Linux (Ubuntu 24.04.1 LTS): Site2Target_0.99.2.tar.gz

Links above active for 21 days.


bioc-issue-bot · 2025-01-02T12:41:02Z

A reviewer has been assigned to your package for an in-depth review.
Please respond accordingly to any further comments from the reviewer.

jianhong · 2025-01-02T17:59:45Z

Package 'Site2Target' Review

Thank you for submitting your package to Bioconductor. The package passed check and build. However, there are several things that need to be fixed. Please answer the comments line by line when you are ready for a second review.
Code: Note: please consider; Important: must be addressed.

The DESCRIPTION file

Important: Depends field is not found in DESCRIPTION.
Important: R version is not clear in DESCRIPTION.

General package development

Important: Consider adding unit tests. We strongly encourage them. See
http://bioconductor.org/developers/how-to/unitTesting-guidelines/.
Important: Consider adding input checking. We strongly encourage them. See https://contributions.bioconductor.org/r-code.html#function-arguments
Important: Consider adding instructions for download or creation for the extra data.
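One possible shape for the input checking requested above is a small validator called at the top of each exported function. The helper name checkColumnName and the exact rules it enforces are assumptions for illustration, not part of the package.

```r
# Hypothetical validator; the name and the rules are assumptions
checkColumnName <- function(columnName) {
    if (missing(columnName) || !is.character(columnName) ||
        length(columnName) != 1L || is.na(columnName) ||
        !nzchar(columnName)) {
        stop("'columnName' must be a single non-empty character string")
    }
    invisible(columnName)
}

checkColumnName("expression")  # passes silently

# An invalid value raises an error, caught here for demonstration
ok <- tryCatch(checkColumnName(NA), error = function(e) FALSE)
```

Checking type and length before is.na() keeps the condition safe for vectors of any length, and it also catches a missing argument, which a bare is.na(columnName) test would not handle cleanly.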

R code

NOTE: :: is not suggested in source code unless you can make sure all the packages are imported. Some people think it is better to keep ::. However, please be aware that you will need to manually double-check the imported items if you make any changes to the DESCRIPTION file during development. My suggestion is to remove one or two repetitions to trigger the dependency check.
Important: 1:n is not suggested in source code. Use seq_along, seq_len or seq.int instead.
- In file R/peakwiseAssociations.R:
  - at line 222 found ' acceptedInds <- c(1:geneNumber)'
  - at line 433 found ' acceptedInds <- c(1:geneNumber)'
- In file R/utils.R:
  - at line 168 found ' chrs <- tmp[(c(1:len)*2-1)]'
  - at line 169 found ' Ranges <- tmp[(c(1:len)*2)]'
  - at line 171 found ' start <- tmp[(c(1:len)*2-1)]'
  - at line 172 found ' end <- tmp[(c(1:len)*2)]'
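The reason 1:n is discouraged shows up when n is zero: the sequence counts down instead of being empty. The sketch below uses toy values; the layout of tmp (chromosome and range strings alternating) is an assumption inferred from the flagged lines in R/utils.R.

```r
geneNumber <- 0L                 # e.g. an empty input

badInds  <- c(1:geneNumber)      # counts down and gives 1 0
goodInds <- seq_len(geneNumber)  # integer(0), so nothing is indexed

# Same idea for the odd/even indexing pattern flagged in R/utils.R
len <- 3L
tmp <- c("chr21", "100-200", "chr21", "300-400", "chr21", "500-600")
chrs   <- tmp[seq_len(len) * 2 - 1]  # odd positions
Ranges <- tmp[seq_len(len) * 2]      # even positions
```

With seq_len, a zero-length input yields zero-length output rather than indexing positions 1 and 0.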
NOTE: Vectorize: for loops present, try to replace them by *apply functions.
- In file R/genewiseAssociations.R:
  - at line 21 found ' for(i in seq_len(queryLen))'
  - at line 304 found ' for(i in seq_len(commonStrandsNumber))'
  - at line 322 found ' for(j in seq_len(currentGeneNumber))'
  - at line 341 found ' for(j in seq_len(currentGeneNumber))'
  - at line 574 found ' for(i in seq_len(mapNumber))'
  - at line 626 found ' for(i in seq_len(mapNumber))'
  - at line 837 found ' for(i in seq_len(commonStrandsNumber))'
  - at line 854 found ' for(j in seq_len(currentPeaklen))'
  - at line 866 found ' for(j in seq_len(currentGeneslen))'
- In file R/peakwiseAssociations.R:
  - at line 55 found ' for(i in seq_len(geneNumber))'
  - at line 396 found ' for(i in seq_len(geneNumber))'
  - at line 563 found ' for(i in seq_len(overlapsNumber))'
  - at line 744 found ' for(i in seq_len(overlapsNumber))'
  - at line 778 found ' for(i in seq_len(geneNumber))'
  - at line 794 found ' for(siteCounter in seq_len(tmpSiteNumber))'
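As a sketch of what such a replacement can look like, the toy example below accumulates a per-gene total first with a loop and then with a single grouped sum from base R's rowsum(). The variable names and data are assumptions for illustration and do not reproduce any specific loop listed above.

```r
# Toy stand-ins (assumed shapes): one entry per gene/site overlap
queryHits   <- c(1L, 3L, 3L, 2L, 3L)   # gene index of each overlap
subjectHits <- c(2L, 1L, 2L, 3L, 3L)   # site index of each overlap
intensities <- c(0.5, 2.0, 1.5)        # one value per site
geneNumber  <- 4L

# Loop version: add each overlapping site's intensity to its gene
loopTotal <- numeric(geneNumber)
for (i in seq_along(queryHits)) {
    loopTotal[queryHits[i]] <- loopTotal[queryHits[i]] +
        intensities[subjectHits[i]]
}

# Vectorised version: sum per gene in one call, then place the sums
sums <- rowsum(intensities[subjectHits], group = queryHits)
vecTotal <- numeric(geneNumber)
vecTotal[as.integer(rownames(sums))] <- sums[, 1]
```

Both versions give the same totals; genes without any overlap simply keep the initial value of zero.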
Important: finish TODO list.
- In file R/genewiseAssociations.R:
  - at line 261 found ' # Remove interactions lower than distance ############# <----- This can become a function'
Important: Please consider adding drop=FALSE/TRUE to avoid/secure the reduction of dimension for matrices and arrays. Ignore this if using datatable.
- In file R/genewiseAssociations.R:
  - at line 384 found ' df <- df[tmpInds, ]'
- In file R/utils.R:
  - at line 53 found ' ranges=IRanges::IRanges(Table[,startInd], Table[,endInd]))'
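The dimension-dropping behaviour behind this comment can be seen with a one-column data.frame and a small matrix. The objects below are toy data for illustration only.

```r
# One-column data.frame: row subsetting silently drops to a vector
df <- data.frame(geneNames = c("A", "B", "C"), stringsAsFactors = FALSE)
tmpInds <- c(1L, 3L)

dropped <- df[tmpInds, ]                # character vector
kept    <- df[tmpInds, , drop = FALSE]  # still a data.frame

# Matrix: selecting a single row behaves the same way
m <- matrix(1:6, nrow = 2)
rowVec <- m[1, ]                # integer vector of length 3
rowMat <- m[1, , drop = FALSE]  # 1 x 3 matrix
```

Stating drop explicitly, whichever value is intended, makes the result type independent of how many rows or columns happen to be selected.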
NOTE: Functional programming: code repetition.
- repetition in addColumn2geneWiseAssociation, and addRelation2geneWiseAssociation
  - in addColumn2geneWiseAssociation
    - line 1: function (type = "", name = NULL, coordinates = NULL, columnName = NA,
    - line 2: column, inFile = "geneWiseAssociation", outFile = "geneWiseAssociation")
    - line 3: {
    - line 4: if (is.na(columnName)) {
    - line 5: stop("Column name should be provided")
    - line 6: }
    - line 7: else {
    - line 8: columnName <- removeReserveCharacter(columnName)
    - line 9: }
    - line 10: if (!dir.exists(inFile)) {
    - line 11: stop("The user provided directory does not exist")
    - line 12: }
    - line 13: if (!file.exists(file.path(inFile, "link.tsv"))) {
    - line 14: stop("The gene-peaak link file does not exist in the directory")
    - line 15: }
    - line 16: interactionTable <- utils::read.table(file.path(inFile,
    - line 17: "link.tsv"), header = TRUE, sep = "\t")
    - line 18: if (!file.exists(file.path(inFile, "gene.tsv"))) {
    - line 19: stop("The gene information file does not exist in the directory")
    - line 20: }
    - line 21: geneTable <- utils::read.table(file.path(inFile, "gene.tsv"),
    - line 22: header = TRUE, sep = "\t")
    - line 23: if (!file.exists(file.path(inFile, "peak.tsv"))) {
    - line 24: stop("The peaak file does not exist in the directory")
    - line 25: }
    - line 26: peakTable <- utils::read.table(file.path(inFile, "peak.tsv"),
    - line 27: header = TRUE, sep = "\t")
    - line 114: if (!(dir.exists(outFile))) {
    - line 115: dir.create(outFile)
    - line 116: }
    - line 117: utils::write.table(interactionTable, file = file.path(outFile,
    - line 118: "link.tsv"), row.names = FALSE, col.names = TRUE,
    - line 119: quote = FALSE, sep = "\t")
  - in addRelation2geneWiseAssociation
    - line 1: function (strand1 = NULL, strand2 = NULL, columnName, column,
    - line 2: inFile = "geneWiseAssociation", outFile = "geneWiseAssociation")
    - line 3: {
    - line 4: if (is.na(columnName)) {
    - line 5: stop("Column nam should be provided")
    - line 6: }
    - line 7: else {
    - line 8: columnName <- removeReserveCharacter(columnName)
    - line 9: }
    - line 10: if (!dir.exists(inFile)) {
    - line 11: stop("The user provided directory does not exist")
    - line 12: }
    - line 13: if (!file.exists(file.path(inFile, "link.tsv"))) {
    - line 14: stop("The gene-peaak link file does not exist in the directory")
    - line 15: }
    - line 16: interactionTable <- utils::read.table(file.path(inFile,
    - line 17: "link.tsv"), header = TRUE, sep = "\t")
    - line 18: if (!file.exists(file.path(inFile, "gene.tsv"))) {
    - line 19: stop("The gene information file does not exist in the directory")
    - line 20: }
    - line 21: geneTable <- utils::read.table(file.path(inFile, "gene.tsv"),
    - line 22: header = TRUE, sep = "\t")
    - line 25: if (!file.exists(file.path(inFile, "peak.tsv"))) {
    - line 26: stop("The peaak file does not exist in the directory")
    - line 27: }
    - line 28: peakTable <- utils::read.table(file.path(inFile, "peak.tsv"),
    - line 29: header = TRUE, sep = "\t")
    - line 92: if (!(dir.exists(outFile))) {
    - line 93: dir.create(outFile)
    - line 94: }
    - line 95: utils::write.table(interactionTable, file = file.path(outFile,
    - line 96: "link.tsv"), row.names = FALSE, col.names = TRUE,
    - line 97: quote = FALSE, sep = "\t")
- repetition in addRelation2geneWiseAssociation, extendSitesInGivenRegions, genewiseAssociation, getTargetGenesNumber, getTargetGenesPvals, getTargetGenesPvalsWithDNAInteractions, getTargetGenesPvalsWithIntensities, genewiseAssociation, and getTargetGenesPvalsWithIntensitiesAndDNAInteractions
  - in addRelation2geneWiseAssociation
    - line 46: mapPeak <- GenomicRanges::findOverlaps(peakCoord,
    - line 47: strand1)
    - line 48: mapPeakInds <- S4Vectors::queryHits(mapPeak)
    - line 49: mapPeakStrandInds <- S4Vectors::subjectHits(mapPeak)
    - line 50: mapGene <- GenomicRanges::findOverlaps(geneCoord,
    - line 51: strand2)
    - line 52: mapGeneInds <- S4Vectors::queryHits(mapGene)
    - line 53: mapGeneStrandInds <- S4Vectors::subjectHits(mapGene)
    - line 54: commonStrands <- intersect(mapPeakStrandInds, mapGeneStrandInds)
    - line 55: commonStrandsNumber <- length(commonStrands)
    - line 56: for (i in seq_len(commonStrandsNumber)) {
    - line 57: currentStrand <- commonStrands[i]
    - line 58: strandIndsPeak <- which((mapPeakStrandInds ==
    - line 59: currentStrand) == TRUE)
    - line 60: currentPeakInds <- mapPeakInds[strandIndsPeak]
    - line 61: currentPeak <- peakTable$peakName[currentPeakInds]
    - line 62: strandIndsGene <- which((mapGeneStrandInds ==
    - line 63: currentStrand) == TRUE)
    - line 64: currentGeneInds <- mapGeneInds[strandIndsGene]
    - line 65: currentGene <- geneTable$geneNames[currentGeneInds]
  - in extendSitesInGivenRegions
    - line 1: function (givenRegions, sites, distance = 1e+05)
    - line 2: {
    - line 3: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(sites)),
    - line 4: ranges = IRanges::IRanges(BiocGenerics::start(sites),
    - line 5: end = BiocGenerics::end(sites)) + distance)
    - line 6: siteRegionOverlap <- GenomicRanges::findOverlaps(sites,
    - line 7: givenRegions)
  - in genewiseAssociation
    - line 32: if (associationBy == "distance") {
    - line 33: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(peakCoordinates)),
    - line 34: ranges = IRanges::IRanges(BiocGenerics::start(peakCoordinates),
    - line 35: end = BiocGenerics::end(peakCoordinates)) + distance)
    - line 36: }
    - line 37: else if (associationBy == "regions") {
    - line 38: givenRegionNumber <- length(givenRegions)
    - line 39: if (givenRegionNumber == 0) {
    - line 41: }
    - line 42: extendRegions <- extendSitesInGivenRegions(sites = peakCoordinates,
    - line 43: distance = distance, givenRegions = givenRegions)
    - line 44: }
    - line 46: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(peakCoordinates)),
    - line 47: ranges = IRanges::IRanges(BiocGenerics::start(peakCoordinates),
    - line 48: end = BiocGenerics::end(peakCoordinates)) + distance)
    - line 49: }
    - line 50: else {
    - line 51: stop("Peak to gene is associated either by distance or regions")
    - line 52: }
    - line 53: map <- GenomicRanges::findOverlaps(geneCoordinates, extendRegions)
    - line 78: }
    - line 79: strand1Center <- getCenterOfPeaks(strand1)
    - line 80: center1 <- BiocGenerics::start(strand1Center)
    - line 81: rm(strand1Center)
    - line 82: gc()
    - line 83: strand2Center <- getCenterOfPeaks(strand2)
    - line 84: center2 <- BiocGenerics::start(strand2Center)
    - line 85: rm(strand2Center)
    - line 86: gc()
    - line 87: D <- abs(center1 - center2)
    - line 88: distantInteractomInds <- which((D > (distance - 1)) ==
    - line 89: TRUE)
    - line 90: strand1 <- strand1[distantInteractomInds]
  - in getTargetGenesNumber
    - line 1: sites = NA, distance = 50000)
    - line 2:{
    - line 3: geneNumber <- length(geneCoordinates)
    - line 4: if (geneNumber < 2) {
    - line 5: stop("At least two genes corrdinats must be provided")
    - line 6: }
    - line 7: siteNumber <- length(sites)
    - line 8: if (siteNumber < 2) {
    - line 9: stop("At least two sites must be provided")
    - line 10: }
    - line 11: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(sites)),
    - line 12: ranges = IRanges::IRanges(BiocGenerics::start(sites),
    - line 13: end = BiocGenerics::end(sites)) + distance)
    - line 14: targets <- S4Vectors::queryHits(GenomicRanges::findOverlaps(geneCoordinates,
    - line 15: extendRegions))
  - in getTargetGenesPvals
    - line 2: distance = 50000, givenRegions = NA)
    - line 3:{
    - line 4: geneNumber <- length(geneCoordinates)
    - line 5: if (geneNumber < 2) {
    - line 6: stop("At least two genes corrdinats must be provided")
    - line 7: }
    - line 8: siteNumber <- length(sites)
    - line 9: if (siteNumber < 2) {
    - line 10: stop("At least two sites must be provided")
    - line 11: }
    - line 12: sites <- getCenterOfPeaks(sites)
    - line 13: if (associationBy == "distance") {
    - line 16: }
    - line 17: else if (associationBy == "regions") {
    - line 18: givenRegionNumber <- length(givenRegions)
    - line 19: if (givenRegionNumber < 2) {
    - line 20: if (is.na(givenRegions)) {
    - line 21: stop("For extending sites in regions, the regions must be provided")
    - line 22: }
    - line 23: }
    - line 24: extendRegions <- extendSitesInGivenRegions(sites = sites,
    - line 25: distance = distance, givenRegions = givenRegions)
    - line 28: }
    - line 29: else {
    - line 30: stop("Peak to gene is associated either by distance or regions")
    - line 31: }
    - line 32: eps <- 1
    - line 33: log2ScaleCount <- log2(targetNumber + eps)
    - line 34: upperbound <- 2^(ceiling(stats::quantile(log2ScaleCount,
    - line 35: 0.75) + 3 * stats::IQR(log2ScaleCount)))
    - line 36: if (upperbound < 4) {
    - line 37: warning("Insufficeint interactions to model")
    - line 38: acceptedInds <- c(1:geneNumber)
    - line 39: }
    - line 40: else {
    - line 41: acceptedInds <- which((targetNumber < upperbound) ==
    - line 42: TRUE)
    - line 43: }
    - line 44: if (dist == "negative binomial") {
    - line 45: flag <- TRUE
    - line 46: try({
    - line 47: distNB <- MASS::fitdistr(targetNumber[acceptedInds],
    - line 48: densfun = "negative binomial")
    - line 49: pvals <- stats::pnbinom(targetNumber, size = (distNB$estimate)[1],
    - line 50: mu = (distNB$estimate)[2], lower.tail = FALSE)
    - line 51: flag <- FALSE
    - line 52: })
    - line 53: if (flag) {
    - line 54: stop("negative binomial distribution could not be fitted try poisson")
    - line 55: }
    - line 56: }
    - line 57: else if (dist == "poisson") {
    - line 58: distP <- MASS::fitdistr(targetNumber[acceptedInds], densfun = "poisson")
    - line 59: pvals <- stats::ppois(targetNumber, lambda = as.numeric(distP[1]),
    - line 60: lower.tail = FALSE)
    - line 61: }
    - line 62: else {
    - line 63: stop("The distribution should be either negative binomial or poisson")
    - line 64: }
    - line 65: return(pvals)
  - in getTargetGenesPvalsWithDNAInteractions
    - line 3: {
    - line 4: geneNumber <- length(geneCoordinates)
    - line 5: if (geneNumber < 2) {
    - line 6: stop("At least two genes corrdinats must be provided")
    - line 7: }
    - line 8: siteNumber <- length(sites)
    - line 9: if (siteNumber < 2) {
    - line 10: stop("At least two sites must be provided")
    - line 11: }
    - line 12: LenStrand1 <- length(strand1)
    - line 13: LenStrand2 <- length(strand2)
    - line 14: if (LenStrand1 < 2) {
    - line 15: stop("At least two DNA-DNA interactions must be provided")
    - line 16: }
    - line 17: if (LenStrand1 != LenStrand2) {
    - line 18: stop("The length of Gstrand and Sstrand must be equal")
    - line 19: }
    - line 20: sites <- getCenterOfPeaks(sites)
    - line 21: targetNumber <- rep(0, geneNumber)
    - line 22: if (distance > -1) {
    - line 23: strand1Center <- getCenterOfPeaks(strand1)
    - line 24: center1 <- BiocGenerics::start(strand1Center)
    - line 25: rm(strand1Center)
    - line 26: gc()
    - line 27: strand2Center <- getCenterOfPeaks(strand2)
    - line 28: center2 <- BiocGenerics::start(strand2Center)
    - line 29: rm(strand2Center)
    - line 30: gc()
    - line 31: D <- abs(center1 - center2)
    - line 32: distantInteractomInds <- which((D > (distance - 1)) ==
    - line 33: TRUE)
    - line 34: InteractionNumber <- length(distantInteractomInds)
    - line 35: if (InteractionNumber > 0) {
    - line 36: strand1 <- strand1[distantInteractomInds]
    - line 37: strand2 <- strand2[distantInteractomInds]
    - line 38: }
    - line 41: }
    - line 42: geneCoordinates <- getCenterOfPeaks(geneCoordinates)
    - line 43: InteractionNumber <- LenStrand1
    - line 44: mapSite <- GenomicRanges::findOverlaps(sites, strand1)
    - line 45: mapSiteInds <- S4Vectors::queryHits(mapSite)
    - line 46: mapSiteStrandInds <- S4Vectors::subjectHits(mapSite)
    - line 47: mapGene <- GenomicRanges::findOverlaps(geneCoordinates,
    - line 48: strand2)
    - line 49: mapGeneInds <- S4Vectors::queryHits(mapGene)
    - line 50: mapGeneStrandInds <- S4Vectors::subjectHits(mapGene)
    - line 51: commonStrands <- intersect(mapSiteStrandInds, mapGeneStrandInds)
    - line 52: targetNumberDNAIntacts <- rep(0, geneNumber)
    - line 53: for (i in seq_len(geneNumber)) {
    - line 54: tmpGeneInds <- which((mapGeneInds == i) == TRUE)
    - line 55: tmpGeneIndsNum <- length(tmpGeneInds)
    - line 56: if (tmpGeneIndsNum > 0) {
    - line 57: tmpStrandInds <- mapGeneStrandInds[tmpGeneInds]
    - line 58: intersect(tmpStrandInds, commonStrands)
    - line 59: tmpSiteInds <- match(tmpStrandInds, mapSiteStrandInds)
    - line 60: tmpSiteInds <- tmpSiteInds[!is.na(tmpSiteInds)]
    - line 63: }
    - line 64: }
    - line 65: targetNumber <- targetNumber + targetNumberDNAIntacts
    - line 66: eps <- 1
    - line 67: log2ScaleCount <- log2(targetNumber + eps)
    - line 84: mu = (distNB$estimate)[2], lower.tail = FALSE)
    - line 85: flag <- FALSE
    - line 86: })
    - line 87: if (flag) {
  - in getTargetGenesPvalsWithIntensities
    - line 2: sites = NA, distance = 50000, givenRegions = NA)
    - line 3: {
    - line 4: geneNumber <- length(geneCoordinates)
    - line 5: if (geneNumber < 10) {
    - line 6: stop("At least ten genes corrdinats must be provided")
    - line 7: }
    - line 8: siteNumber <- length(sites)
    - line 9: if (siteNumber < 10) {
    - line 10: stop("At least ten sites must be provided")
    - line 11: }
    - line 12: sites <- getCenterOfPeaks(sites)
    - line 13: if (associationBy == "distance") {
    - line 14: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(sites)),
    - line 15: ranges = IRanges::IRanges(BiocGenerics::start(sites),
    - line 16: end = BiocGenerics::end(sites)) + distance)
    - line 17: }
    - line 18: else if (associationBy == "regions") {
    - line 19: givenRegionNumber <- length(givenRegions)
    - line 20: if (givenRegionNumber < 2) {
    - line 21: if (is.na(givenRegions)) {
    - line 22: stop("For extending sites in regions, the regions must be provided")
    - line 23: }
    - line 24: }
    - line 25: extendRegions <- extendSitesInGivenRegions(sites = sites,
    - line 26: distance = distance, givenRegions = givenRegions)
    - line 27: }
    - line 28: else {
    - line 29: stop("Peak to gene is associated either by distance or regions")
    - line 30: }
    - line 31: overlaps <- GenomicRanges::findOverlaps(geneCoordinates,
    - line 32: extendRegions)
    - line 34: overlapsNumber <- length(overlaps)
    - line 35: if (overlapsNumber > 10) {
    - line 36: for (i in seq_len(overlapsNumber)) {
    - line 37: targetNumber[S4Vectors::queryHits(overlaps[i])] <- targetNumber[S4Vectors::queryHits(overlaps[i])] +
    - line 38: intensities[S4Vectors::subjectHits(overlaps[i])]
    - line 39: }
    - line 40: }
    - line 41: else {
    - line 42: stop("Genes and sites are far from each other")
    - line 43: }
    - line 44: eps <- 1
    - line 45: log2ScaleCount <- log2(targetNumber + eps)
    - line 46: nonZeroInds <- which((log2ScaleCount > 0) == TRUE)
    - line 47: lowerbound <- ceiling(stats::quantile(log2ScaleCount[nonZeroInds],
    - line 48: 0.25) - 1.5 * stats::IQR(log2ScaleCount[nonZeroInds]))
    - line 49: upperbound <- ceiling(stats::quantile(log2ScaleCount[nonZeroInds],
    - line 50: 0.75) + 1.5 * stats::IQR(log2ScaleCount[nonZeroInds]))
    - line 51: acceptedInds <- intersect(which((log2ScaleCount < upperbound) ==
    - line 52: TRUE), which((log2ScaleCount > lowerbound) == TRUE))
    - line 53: flag <- TRUE
    - line 54: try({
    - line 55: distN <- MASS::fitdistr(log2ScaleCount[acceptedInds],
    - line 56: densfun = "normal")
    - line 57: pvals <- stats::pnorm(log2ScaleCount, mean = (distN$estimate)[1],
    - line 58: sd = (distN$estimate)[2], lower.tail = FALSE)
    - line 59: flag <- FALSE
    - line 60: })
    - line 61: if (flag) {
    - line 62: warning("Low number of sites and genes")
    - line 63: try({
    - line 64: distN <- MASS::fitdistr(log2ScaleCount, densfun = "normal")
    - line 65: pvals <- stats::pnorm(log2ScaleCount, mean = (distN$estimate)[1],
    - line 66: sd = (distN$estimate)[2], lower.tail = FALSE)
    - line 67: flag <- FALSE
    - line 68: })
    - line 69: }
    - line 70: if (flag) {
    - line 71: stop("Cannot fit the log-normal distirbution. Use more sites and genes")
    - line 72: }
    - line 73: return(pvals)
  - in genewiseAssociation
    - line 98: mapGeneInds <- S4Vectors::queryHits(mapGene)
    - line 99: mapGeneStrandInds <- S4Vectors::subjectHits(mapGene)
    - line 100: commonStrands <- intersect(mapPeakStrandInds, mapGeneStrandInds)
    - line 101: commonStrandsNumber <- length(commonStrands)
    - line 102: geneNamesDistal <- NULL
    - line 104: distanceDistal <- NULL
    - line 105: for (i in seq_len(commonStrandsNumber)) {
    - line 106: currentStrand <- commonStrands[i]
    - line 107: strandIndsPeak <- which((mapPeakStrandInds == currentStrand) ==
    - line 108: TRUE)
    - line 109: currentPeakInds <- mapPeakInds[strandIndsPeak]
    - line 110: currentPeak <- peakNames[currentPeakInds]
    - line 112: currentPeakNumber <- length(currentPeak)
    - line 113: strandIndsGene <- which((mapGeneStrandInds == currentStrand) ==
    - line 114: TRUE)
    - line 115: currentGeneInds <- mapGeneInds[strandIndsGene]
    - line 116: currentGene <- geneNames[currentGeneInds]
  - in getTargetGenesPvalsWithIntensitiesAndDNAInteractions
    - line 3: {
    - line 4: geneNumber <- length(geneCoordinates)
    - line 5: if (geneNumber < 10) {
    - line 6: stop("At least ten genes corrdinats must be provided")
    - line 7: }
    - line 8: siteNumber <- length(sites)
    - line 9: if (siteNumber < 10) {
    - line 10: stop("At least ten sites must be provided")
    - line 11: }
    - line 12: LenStrand1 <- length(strand1)
    - line 13: LenStrand2 <- length(strand2)
    - line 14: if (LenStrand1 < 2) {
    - line 15: stop("At least two DNA-DNA interactions must be provided")
    - line 16: }
    - line 17: if (LenStrand1 != LenStrand2) {
    - line 18: stop("The length of Gstrand and Sstrand must be equal")
    - line 19: }
    - line 20: sites <- getCenterOfPeaks(sites)
    - line 21: targetNumber <- rep(0, geneNumber)
    - line 22: if (distance > -1) {
    - line 23: strand1Center <- getCenterOfPeaks(strand1)
    - line 24: center1 <- BiocGenerics::start(strand1Center)
    - line 25: rm(strand1Center)
    - line 26: gc()
    - line 27: strand2Center <- getCenterOfPeaks(strand2)
    - line 28: center2 <- BiocGenerics::start(strand2Center)
    - line 29: rm(strand2Center)
    - line 30: gc()
    - line 31: D <- abs(center1 - center2)
    - line 32: distantInteractomInds <- which((D > (distance - 1)) ==
    - line 33: TRUE)
    - line 34: InteractionNumber <- length(distantInteractomInds)
    - line 35: if (InteractionNumber > 0) {
    - line 36: strand1 <- strand1[distantInteractomInds]
    - line 37: strand2 <- strand2[distantInteractomInds]
    - line 38: }
    - line 39: extendRegions <- GenomicRanges::GRanges(seqnames = S4Vectors::Rle(GenomeInfoDb::seqnames(sites)),
    - line 40: ranges = IRanges::IRanges(BiocGenerics::start(sites),
    - line 41: end = BiocGenerics::end(sites)) + distance)
    - line 44: overlapsNumber <- length(overlaps)
    - line 45: if (overlapsNumber > 10) {
    - line 46: for (i in seq_len(overlapsNumber)) {
    - line 47: targetNumber[S4Vectors::queryHits(overlaps[i])] <- targetNumber[S4Vectors::queryHits(overlaps[i])] +
    - line 48: intensities[S4Vectors::subjectHits(overlaps[i])]
    - line 49: }
    - line 50: }
    - line 51: else {
    - line 52: stop("Genes and sites are far from each other")
    - line 53: }
    - line 54: }
    - line 55: geneCoordinates <- getCenterOfPeaks(geneCoordinates)
    - line 56: InteractionNumber <- length(strand1)
    - line 57: mapSite <- GenomicRanges::findOverlaps(sites, strand1)
    - line 58: mapSiteInds <- S4Vectors::queryHits(mapSite)
    - line 59: mapSiteStrandInds <- S4Vectors::subjectHits(mapSite)
    - line 60: mapGene <- GenomicRanges::findOverlaps(geneCoordinates,
    - line 61: strand2)
    - line 62: mapGeneInds <- S4Vectors::queryHits(mapGene)
    - line 63: mapGeneStrandInds <- S4Vectors::subjectHits(mapGene)
    - line 64: commonStrands <- intersect(mapSiteStrandInds, mapGeneStrandInds)
    - line 65: targetNumberDNAIntacts <- rep(0, geneNumber)
    - line 66: for (i in seq_len(geneNumber)) {
    - line 67: tmpGeneInds <- which((mapGeneInds == i) == TRUE)
    - line 68: tmpGeneIndsNumber <- length(tmpGeneInds)
    - line 69: if (tmpGeneIndsNumber > 0) {
    - line 70: tmpStrandInds <- mapGeneStrandInds[tmpGeneInds]
    - line 71: intersect(tmpStrandInds, commonStrands)
    - line 72: tmpSiteInds <- match(tmpStrandInds, mapSiteStrandInds)
    - line 73: tmpSiteInds <- tmpSiteInds[!is.na(tmpSiteInds)]
    - line 78: }
    - line 79: }
    - line 80: }
    - line 81: targetNumber <- targetNumber + targetNumberDNAIntacts
    - line 82: eps <- 1
    - line 83: log2ScaleCount <- log2(targetNumber + eps)
    - line 84: nonZeroInds <- which((log2ScaleCount > 0) == TRUE)
    - line 85: lowerbound <- ceiling(stats::quantile(log2ScaleCount[nonZeroInds],
    - line 86: 0.25) - 1.5 * stats::IQR(log2ScaleCount[nonZeroInds]))
    - line 87: upperbound <- ceiling(stats::quantile(log2ScaleCount[nonZeroInds],
    - line 88: 0.75) + 1.5 * stats::IQR(log2ScaleCount[nonZeroInds]))
    - line 89: acceptedInds <- intersect(which((log2ScaleCount < upperbound) ==
    - line 90: TRUE), which((log2ScaleCount > lowerbound) == TRUE))
    - line 91: flag <- TRUE
    - line 92: try({
    - line 93: distN <- MASS::fitdistr(log2ScaleCount[acceptedInds],
    - line 94: densfun = "normal")
    - line 95: pvals <- stats::pnorm(log2ScaleCount, mean = (distN$estimate)[1],
    - line 96: sd = (distN$estimate)[2], lower.tail = FALSE)
    - line 97: flag <- FALSE
    - line 98: })
    - line 99: if (flag) {
    - line 100: warning("Low number of sites and genes")
    - line 101: try({
    - line 102: distN <- MASS::fitdistr(log2ScaleCount, densfun = "normal")
    - line 103: pvals <- stats::pnorm(log2ScaleCount, mean = (distN$estimate)[1],
    - line 104: sd = (distN$estimate)[2], lower.tail = FALSE)
    - line 105: flag <- FALSE
    - line 106: })
    - line 107: }
    - line 108: if (flag) {
    - line 109: stop("Cannot fit the log-normal distirbution. Use more sites and genes")
    - line 110: }
    - line 111: return(pvals)
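
As an illustration for the accumulation loops listed above (`for (i in seq_len(overlapsNumber))`), here is a minimal sketch of how the per-gene sum could be computed without indexing the hits object one element at a time. It uses base R only. The names `qHits`, `sHits`, `targetLoop` and `targetVec` are hypothetical, and the small integer vectors stand in for `S4Vectors::queryHits(overlaps)` and `S4Vectors::subjectHits(overlaps)`; the values are made up. It is one possible rewrite, not necessarily what the package should adopt.

```r
# Hypothetical stand-ins for S4Vectors::queryHits(overlaps) and
# S4Vectors::subjectHits(overlaps); the values are illustrative only.
geneNumber  <- 5
qHits       <- c(1L, 1L, 3L, 5L, 5L, 5L)   # gene index of each overlap
sHits       <- c(2L, 4L, 1L, 3L, 2L, 4L)   # site index of each overlap
intensities <- c(10, 20, 30, 40)           # one intensity per site

# Loop form, mirroring the listed lines
targetLoop <- rep(0, geneNumber)
for (i in seq_along(qHits)) {
    targetLoop[qHits[i]] <- targetLoop[qHits[i]] + intensities[sHits[i]]
}

# Vectorized form: sum the site intensities per gene index in one call
sums <- rowsum(intensities[sHits], group = qHits)
targetVec <- rep(0, geneNumber)
targetVec[as.integer(rownames(sums))] <- sums[, 1]

# Both forms give the same per-gene totals
stopifnot(isTRUE(all.equal(targetLoop, targetVec)))

# Related simplification: which((x > 0) == TRUE) is equivalent to which(x > 0)
x <- log2(targetVec + 1)
stopifnot(identical(which((x > 0) == TRUE), which(x > 0)))
```

Extracting the query and subject hits once, outside any loop, also avoids subsetting the hits object repeatedly, which is usually the slow part when there are many overlaps.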

Documentation

Important: Vignette should have an Installation section.
- rmd file vignettes/Site2Target.Rmd
Important: Please include Bioconductor installation instructions using BiocManager.
- rmd file vignettes/Site2Target.Rmd
Note: Vignette includes motivation for submitting to Bioconductor as part of the abstract/intro of the main vignette.
- rmd file vignettes/Site2Target.Rmd
Note: typos:

WORD         FOUND IN
cooridnates  granges2String.Rd:5, string2Granges.Rd:5
granes       granges2String.Rd:5
grangess     string2Granges.Rd:16
nromal       description:1
trings       string2Granges.Rd:16
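
For the Installation section requested above, a sketch of the usual Bioconductor install chunk that could go into vignettes/Site2Target.Rmd. It assumes the package will be available from Bioconductor under the name `Site2Target` once accepted; in a vignette the chunk would normally be marked as not evaluated.

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

# Install the package from Bioconductor
# (assumes the accepted package name is Site2Target)
BiocManager::install("Site2Target")
```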

bioc-issue-bot added the "1. awaiting moderation" label (submitted and waiting clearance to access resources) on Nov 28, 2024

lshep added the "pre-check passed" label (pre-review performed and ready to be added to git) on Dec 18, 2024

bioc-issue-bot added the ERROR label on Dec 18, 2024

bioc-issue-bot added the OK label and removed the ERROR label on Dec 24, 2024

lshep added the "2. review in progress" label (assign a reviewer and a more thorough review of package code and documentation taking place) and removed the "pre-review" label (on bioconductor git and access to on demand build but not assigned reviewer until build report clean) on Jan 2, 2025

bioc-issue-bot assigned jianhong on Jan 2, 2025
