You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Name Version Build Channel
biotradis 1.4.5 hdfd78af_5 bioconda
I noticed that I had an issue with reads with clipping (which I think you've seen before) where reads with soft clipping at the 5' end take the first mapped nucleotide as the insertion site, resulting in an artificially inflated "unique insertions" count. I can see the fix in the Cigar.pm script,
but found that this fix wasn't in the conda-installed hdfd78af_5 build.
I tried a few conda envs with alternate versions with no success, but possibly some user error on my part somewhere, so in any case I downloaded Bio-Tradis-master directly from github to test again.
This time with the correct Cigar.pm, I have a different error. When I view the plot file in artemis, it looks like all insertion sites have been artificially shifted by a consistent factor, while the bam file viewed in IGV is unchanged
I think (but I'm not familiar enough with perl to be sure I've understood the code correctly) the issue is I have reads mapping across the annotation origin, such that their cigar is something like "77H32M". Because these are the first reads encountered, they seem to offset the rest of the data by "77". I wondered if the issue is here
$current_coordinate += $number;
In any case, I wondered if there might be an option to exclude all reads with clipping (hard and soft)? I know it's very conservative, but even when I don't have the issue with annotation-origin-spanning reads, I found that the following
reported false insertion sites for "chimeric reads". As in, if a read had the cigar 40S71M, the insertion coordinate would be adjusted by 40 to give a tn-insertion event 40bp upstream, yet when I look at the individual read it's not a real transposon-junction, and I'm still getting artificially inflated insertion reporting. (This particular library is intentionally not very diverse, so doesn't need much sequencing coverage, but as a consequence is more susceptible to "noise")
I hope this makes sense, I'd be happy to try and explain some more. Maybe this is just specific to my data(?) but posting here just in case. Thanks
The text was updated successfully, but these errors were encountered:
So, there are some issues with handling soft-clipping, and this has been exacerbated by the switch to BWA. I would try using the original default mapper, smalt (--smalt), and setting --smalt_y 1; this should filter out anything that doesn't have an exact match in the genome and will hopefully resolve the problem you're having. It may be worth doing some adapter/quality trimming if you're getting low mapping numbers.
I recently installed biotradis using conda
Name Version Build Channel
biotradis 1.4.5 hdfd78af_5 bioconda
I noticed that I had an issue with reads with clipping (which I think you've seen before) where reads with soft clipping at the 5' end take the first mapped nucleotide as the insertion site, resulting in an artificially inflated "unique insertions" count. I can see the fix in the Cigar.pm script,
but found that this fix wasn't in the conda-installed hdfd78af_5 build.
I tried a few conda envs with alternate versions with no success, but possibly some user error on my part somewhere, so in any case I downloaded Bio-Tradis-master directly from github to test again.
This time with the correct Cigar.pm, I have a different error. When I view the plot file in artemis, it looks like all insertion sites have been artificially shifted by a consistent factor, while the bam file viewed in IGV is unchanged
I think (but I'm not familiar enough with perl to be sure I've understood the code correctly) the issue is I have reads mapping across the annotation origin, such that their cigar is something like "77H32M". Because these are the first reads encountered, they seem to offset the rest of the data by "77". I wondered if the issue is here
In any case, I wondered if there might be an option to exclude all reads with clipping (hard and soft)? I know it's very conservative, but even when I don't have the issue with annotation-origin-spanning reads, I found that the following
reported false insertion sites for "chimeric reads". As in, if a read had the cigar 40S71M, the insertion coordinate would be adjusted by 40 to give a tn-insertion event 40bp upstream, yet when I look at the individual read it's not a real transposon-junction, and I'm still getting artificially inflated insertion reporting. (This particular library is intentionally not very diverse, so doesn't need much sequencing coverage, but as a consequence is more susceptible to "noise")
I hope this makes sense, I'd be happy to try and explain some more. Maybe this is just specific to my data(?) but posting here just in case. Thanks
The text was updated successfully, but these errors were encountered: