Confusing NGA50 value reported by metaquast #174
-
Hi, I am using metaquast to evaluate a metagenome assembly, but the reported NGA50/NG50 value confuses me. Genome fraction (%) is 39.9, which is less than 50%, yet NGA50 = 5456. To my understanding, NGA50 (or NG50) should be 'none' in that case, according to the definition. The reference consists of 5 strains (genomes), each about 10 kbp long. Did I misunderstand something? Any help would be greatly appreciated. The assembly was generated by Canu and has 3 contigs, with lengths 9665, 9668, and 5457 bp. Here is the command and version I am using: Here is the report:
Replies: 1 comment
-
Hi, this is a great question! Side note: I will transfer this issue to the recently opened QUAST Discussion forum!
The explanation is the following. When we compute NGA50, we look only at the lengths of the aligned fragments and select the length such that all aligned fragments of the same or larger length together sum to at least 50% of the reference length, which is `48492*0.5 = 24246` bp in your case. Note that the `Total aligned length` is `24788`, which is just above `24246`, so if we take all alignments (i.e., set NGA50 to the shortest alignment length = `5456` bp), we exceed 50% of the reference length, and thus NGA50 is not none.

At the same time, when we compute `Genome fraction`, we check the actual alignment coordinates. So, if two alignments overlap, their genom…
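
To make the difference between the two metrics concrete, here is a minimal Python sketch of both calculations as described above. It is not QUAST's actual code. The function names `nga50` and `covered_bases`, the reference names `strain_A` and `strain_B`, and all alignment coordinates are hypothetical. The coordinates were chosen only so that the block lengths sum to the reported total aligned length of 24788 bp with a shortest block of 5456 bp; they do not reproduce the 39.9% from the report.

```python
def nga50(aligned_lengths, reference_length):
    """Return the block length at which blocks of that length or longer
    sum to at least 50% of the reference length, or None if they never do."""
    threshold = 0.5 * reference_length
    running = 0
    for length in sorted(aligned_lengths, reverse=True):
        running += length  # overlaps are NOT considered here, only lengths
        if running >= threshold:
            return length
    return None


def covered_bases(alignments):
    """Count reference positions covered at least once.
    Alignments are (reference_name, start, end), 1-based inclusive."""
    by_ref = {}
    for ref, start, end in alignments:
        by_ref.setdefault(ref, []).append((start, end))

    total = 0
    for intervals in by_ref.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + 1:  # overlapping or adjacent: merge
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
    return total


# Reference length taken from the reply above (48492 bp).
REFERENCE_LENGTH = 48492

# Hypothetical alignments: two of them overlap heavily on the same strain.
alignments = [
    ("strain_A", 1, 9666),
    ("strain_A", 101, 9766),
    ("strain_B", 1, 5456),
]
lengths = [end - start + 1 for _, start, end in alignments]

print("Total aligned length:", sum(lengths))
print("NGA50:", nga50(lengths, REFERENCE_LENGTH))
covered = covered_bases(alignments)
print("Genome fraction (%):", round(100 * covered / REFERENCE_LENGTH, 1))
```

With these made-up coordinates the summed block lengths pass the 24246 bp threshold, so the sketch reports an NGA50, while the overlapping positions on `strain_A` are counted only once for the genome fraction, which therefore stays below 50%.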