You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, I got this error when I run methratio.py to parse bam file:
[methratio] @mon Oct 10 11:52:28 2022 Using 90% of available memory (866 MB) as limit
[methratio] @mon Oct 10 11:52:28 2022 Presorting inputs
[methratio] @mon Oct 10 11:52:29 2022 Calling samtools sort on SRR800059.uniq.bam and using 36 MB of memory
[bam_sort_core] merging from 792 files and 24 in-memory blocks...
[methratio] @mon Oct 10 12:04:41 2022 Processing 1 chromosomes at a time
[methratio] @mon Oct 10 12:04:43 2022 Reading CM002885.2 from SRR800059.uniq.tmpSrt.bam with samtools
Traceback (most recent call last):
File "methratio.py", line 588, in
main()
File "methratio.py", line 143, in main
ret = map(chromWorker, argList)
File "methratio.py", line 293, in chromWorker
options.rm_dup, options.trim_fillin, coverage, chromSize)
File "methratio.py", line 467, in get_sam_alignment
seq = parseCigar(seq, cigar)
File "methratio.py", line 457, in parseCigar
assert originalLen == index, "String length does not match CIGAR"
AssertionError: String length does not match CIGAR
So I checked the script and I find this function (line 432 - 458):
def parseCigar(seq, cigar):
'''
Delete insertions and dash "-" deletions
>>> parseCigar('ACTAGAATGGCT','3M1I3M1D5M')
'ACTGAA-TGGCT'
>>> parseCigar('ACTG','2M')
Traceback (most recent call last):
...
AssertionError: String length does not match CIGAR
'''
index = 0 # index in seq
originalLen = len(seq)
cigarMatch = cigarRE.findall(cigar)
for align in cigarMatch:
length = int(align[:-1])
op = align[-1]
if op == 'M':
index += length
elif op == 'I':
seq = seq[:index]+seq[index+length:]
elif op == 'D':
seq = seq[:index]+'-'*length+seq[index:]
index += length
else:
raise ValueError("%c not a valid CIGAR character"%(op))
assert originalLen == index, "String length does not match CIGAR"
return seq
Where I felt a little strange about "assert originalLen == index". In my understanding, "index" == "M" number + "D" number, rather than "M" number + "I" number (i.e. "originalLen"). So, "originalLen" won't equal to "index" unless the "I" number equal to the "D" number.
Is this a BUG, or have I misunderstood?
The text was updated successfully, but these errors were encountered:
Hi, I got this error when I run methratio.py to parse bam file:
[methratio] @mon Oct 10 11:52:28 2022 Using 90% of available memory (866 MB) as limit
[methratio] @mon Oct 10 11:52:28 2022 Presorting inputs
[methratio] @mon Oct 10 11:52:29 2022 Calling samtools sort on SRR800059.uniq.bam and using 36 MB of memory
[bam_sort_core] merging from 792 files and 24 in-memory blocks...
[methratio] @mon Oct 10 12:04:41 2022 Processing 1 chromosomes at a time
[methratio] @mon Oct 10 12:04:43 2022 Reading CM002885.2 from SRR800059.uniq.tmpSrt.bam with samtools
Traceback (most recent call last):
File "methratio.py", line 588, in
main()
File "methratio.py", line 143, in main
ret = map(chromWorker, argList)
File "methratio.py", line 293, in chromWorker
options.rm_dup, options.trim_fillin, coverage, chromSize)
File "methratio.py", line 467, in get_sam_alignment
seq = parseCigar(seq, cigar)
File "methratio.py", line 457, in parseCigar
assert originalLen == index, "String length does not match CIGAR"
AssertionError: String length does not match CIGAR
So I checked the script and I find this function (line 432 - 458):
def parseCigar(seq, cigar):
'''
Delete insertions and dash "-" deletions
>>> parseCigar('ACTAGAATGGCT','3M1I3M1D5M')
'ACTGAA-TGGCT'
>>> parseCigar('ACTG','2M')
Traceback (most recent call last):
...
AssertionError: String length does not match CIGAR
'''
index = 0 # index in seq
originalLen = len(seq)
cigarMatch = cigarRE.findall(cigar)
for align in cigarMatch:
length = int(align[:-1])
op = align[-1]
if op == 'M':
index += length
elif op == 'I':
seq = seq[:index]+seq[index+length:]
elif op == 'D':
seq = seq[:index]+'-'*length+seq[index:]
index += length
else:
raise ValueError("%c not a valid CIGAR character"%(op))
assert originalLen == index, "String length does not match CIGAR"
return seq
Where I felt a little strange about "assert originalLen == index". In my understanding, "index" == "M" number + "D" number, rather than "M" number + "I" number (i.e. "originalLen"). So, "originalLen" won't equal to "index" unless the "I" number equal to the "D" number.
Is this a BUG, or have I misunderstood?
The text was updated successfully, but these errors were encountered: