FASTA file error message #56
Looks like a minor format incompatibility.
FASTA header lines are required to start with a ">" sign. It might also have to do
with the extra text after the chromosome name on the header line.
In the BAM file, what are your chromosomes named? Is it 1, 2, 3, ... or is it
chr1, chr2, etc.? Different sources use different chromosome name formats.
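One way to check this is to list the names the FASTA headers actually declare and compare them by eye with the names in the BAM header (for example the `SN:` fields printed by `samtools view -H`). The following is only a minimal sketch: the helper name `fasta_names` and the sample text are made up for illustration, and it assumes the chromosome name is the first whitespace-delimited token after the ">".

```python
import io

def fasta_names(handle):
    """Yield (full_header, first_token) for every header line in a FASTA stream.

    Assumption: the sequence name is the first whitespace-delimited token
    after the leading ">"; everything after it is treated as metadata.
    """
    for line in handle:
        if line.startswith(">"):
            header = line[1:].rstrip("\n")
            parts = header.split()
            yield header, (parts[0] if parts else "")

# Illustrative input only: one Ensembl-style header and one bare header.
sample = io.StringIO(
    ">1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF\n"
    "NNNN\n"
    ">chr2\n"
    "ACGT\n"
)
for header, name in fasta_names(sample):
    print(repr(name), "<-", repr(header))
```

If the first token matches the BAM names but the full header does not, the extra text on the header line is the likely culprit.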
On Feb 8, 2018 8:00 AM, "flissp" wrote:
Hi
I've just tried running the Java version of QoRTs using the current Ensembl
primary assembly (GRCh38, v91, downloaded from the Ensembl FTP) FASTA file,
which I also used to align my test BAM file (using STAR, no problems).
However, QoRTs falls over, saying it can't find chromosome 1 in the
FASTA file, even though I can see that it is present. Could this be an
Ensembl format compatibility issue?
Head of the fasta reference file:
1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
QoRTs error message:
<====== FATAL ERROR! ======>
Error message: "FATAL ERROR: Cannot find chromosome "1" in genome FASTA file!"
Stack Trace:
java.lang.Thread.getStackTrace(Thread.java:1552)
internalUtils.Reporter$.error(Reporter.scala:294)
internalUtils.genomicAnnoUtils$EfficientGenomeSeqContainer_MFA.switchToChrom(genomicAnnoUtils.scala:170)
internalUtils.genomicAnnoUtils$EfficientGenomeSeqContainer.shiftBufferTo(genomicAnnoUtils.scala:111)
qcUtils.qcOverlapMatch.runOnReadPair(qcOverlapMatch.scala:218)
qcUtils.qcOverlapMatch.runOnReadPair(qcOverlapMatch.scala:83)
qcUtils.runAllQC$.$anonfun$runOnSeqFile$7(runAllQC.scala:1312)
qcUtils.runAllQC$.$anonfun$runOnSeqFile$7$adapted(runAllQC.scala:1285)
qcUtils.runAllQC$$$Lambda$188/576936864.apply(Unknown Source)
scala.collection.Iterator.foreach(Iterator.scala:929)
scala.collection.Iterator.foreach$(Iterator.scala:929)
internalUtils.stdUtils$IteratorProgressReporter$$anon$5.foreach(stdUtils.scala:487)
qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1285)
qcUtils.runAllQC$.run(runAllQC.scala:960)
qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:672)
runner.runner$.main(runner.scala:97)
runner.runner.main(runner.scala)
<==========================>
Exception in thread "main" java.lang.Exception: FATAL ERROR: Cannot find chromosome "1" in genome FASTA file!
at internalUtils.Reporter$.error(Reporter.scala:299)
at internalUtils.genomicAnnoUtils$EfficientGenomeSeqContainer_MFA.switchToChrom(genomicAnnoUtils.scala:170)
at internalUtils.genomicAnnoUtils$EfficientGenomeSeqContainer.shiftBufferTo(genomicAnnoUtils.scala:111)
at qcUtils.qcOverlapMatch.runOnReadPair(qcOverlapMatch.scala:218)
at qcUtils.qcOverlapMatch.runOnReadPair(qcOverlapMatch.scala:83)
at qcUtils.runAllQC$.$anonfun$runOnSeqFile$7(runAllQC.scala:1312)
at qcUtils.runAllQC$.$anonfun$runOnSeqFile$7$adapted(runAllQC.scala:1285)
at qcUtils.runAllQC$$$Lambda$188/576936864.apply(Unknown Source)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at internalUtils.stdUtils$IteratorProgressReporter$$anon$5.foreach(stdUtils.scala:487)
at qcUtils.runAllQC$.runOnSeqFile(runAllQC.scala:1285)
at qcUtils.runAllQC$.run(runAllQC.scala:960)
at qcUtils.runAllQC$allQC_runner.run(runAllQC.scala:672)
at runner.runner$.main(runner.scala:97)
at runner.runner.main(runner.scala)
Thanks for your help
Thanks for the swift response!
The fasta file does actually start with ">", I must have accidentally dropped it when I pasted it in (sorry!)...
The chromosomes are named 1,2,3 without the "chr" prefix - but as I aligned the BAM using the same fasta file, this matches up with the BAM - or do you mean that QoRTs is specifically looking for the "chr" prefix?
If this is the case, is there any option to edit this in QoRTs, or would I need to return to my fasta & BAM files to edit them (which seems a bit of a faff)?
Thanks!
QoRTs doesn't assume any specific chromosome name format; it just has to be
consistent.
It looks like it doesn't allow the additional chromosome metadata in the
FASTA file. You can either strip the metadata, or wait until I compile,
test, and release the bugfix, probably some time next week.
You can strip the metadata from the FASTA using sed.
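For anyone who would rather script this than use sed, the same idea can be sketched in Python: keep only the first whitespace-delimited token of each header line and copy sequence lines through unchanged. This is an illustrative sketch, not part of QoRTs; the function name `strip_fasta_metadata` and the file names in the usage note are hypothetical.

```python
import io

def strip_fasta_metadata(src, dst):
    """Copy a FASTA stream, truncating each header to its first token.

    Assumption: the chromosome name is the first whitespace-delimited
    token after ">", so ">1 dna:chromosome ... REF" becomes ">1".
    Sequence lines are written through untouched.
    """
    for line in src:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            dst.write(">" + name + "\n")
        else:
            dst.write(line)

# Small in-memory demonstration with an Ensembl-style header.
demo_in = io.StringIO(
    ">1 dna:chromosome chromosome:GRCh38:1:1:248956422:1 REF\n"
    "NNNNNNNN\n"
)
demo_out = io.StringIO()
strip_fasta_metadata(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

On real files this would be called with two open file handles, e.g. `open("genome.fa")` and `open("genome.stripped.fa", "w")` (hypothetical names). The BAM should not need editing if, as described above, it already uses the bare names 1, 2, 3.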
Ah OK, thanks - will give that a go while I wait for the next release!
Any update to this? I am experiencing the same issue with the Ensembl genome FASTA files. Thanks.