Dear Andrew:
Can I use snp-sites to process the full FASTA data that I downloaded from GISAID, which has about 1,000,000,000 lines in total for ~2 million SARS-CoV-2 genomes?
I ran it on my local laptop and the job got killed. I could try running it on a server, but I would like to confirm with you first that this is doable. I guess that I only need to run "snp-sites -v -o output" to output a VCF file. I should NOT specify "-p", because generating a PHYLIP file for ~2 million genomes might take forever.
BTW, I did my PhD at the Sanger Institute from 2012 to 2015.
Best regards,
Jie
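As a rough illustration of why a job like this can be feasible in principle, here is a minimal Python sketch of finding variable alignment columns in a single streaming pass. It is not how snp-sites is implemented; the function names `iter_fasta` and `variable_columns` are made up for this example, and it assumes the input is already a multiple sequence alignment with equal-length rows. Memory use grows with the alignment length, not with the number of genomes.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs one record at a time."""
    name: Optional[str] = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def variable_columns(lines: Iterable[str]) -> List[int]:
    """Return 0-based columns where at least two different A/C/G/T bases occur.

    Only one base and one flag per column are kept in memory, so the
    footprint does not depend on how many sequences are streamed through.
    """
    first_base: List[Optional[str]] = []
    is_variable: List[bool] = []
    seen_any = False
    for name, seq in iter_fasta(lines):
        if not seen_any:
            first_base = [None] * len(seq)
            is_variable = [False] * len(seq)
            seen_any = True
        if len(seq) != len(first_base):
            raise ValueError(f"{name}: row length differs, input is not aligned")
        for i, base in enumerate(seq):
            if base not in "ACGT":
                continue  # skip gaps, N and other ambiguity codes
            if first_base[i] is None:
                first_base[i] = base
            elif base != first_base[i]:
                is_variable[i] = True
    return [i for i, flag in enumerate(is_variable) if flag]
```

Because `variable_columns` accepts any iterable of lines, an open file handle can be passed directly, so the file is never loaded whole. A pure-Python loop like this would be very slow on ~2 million genomes; it only shows the memory pattern.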
I’d like to use this tool for the same reason, but I’m concerned that if the tool calls SNPs against an internal pseudo-reference genome (I think it is the consensus sequence), the results will make little sense. For example, in the GISAID MSA file the variant D614G is present in almost all sequences, so it would be treated as wild type (WT) rather than as a variant.
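To make the concern concrete, here is a small Python sketch on toy data. It does not reproduce snp-sites' internals; the helper names `consensus` and `differences` and the three-base sequences are invented for illustration. It only shows that the same sample yields different calls depending on whether it is compared with a fixed reference or with the majority (consensus) sequence.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def consensus(seqs: Sequence[str]) -> str:
    """Most common base in each column of equal-length aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def differences(seq: str, ref: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, reference base, sample base) where they differ."""
    return [(i + 1, r, b) for i, (r, b) in enumerate(zip(ref, seq)) if r != b]


# Hypothetical toy data: a fixed reference and four samples, three of which
# carry the same substitution at position 2.
reference = "GAT"
samples = ["GGT", "GGT", "GGT", "GAT"]

majority = consensus(samples)  # "GGT"

# Against the fixed reference, the common substitution is reported.
vs_reference = differences(samples[0], reference)  # [(2, "A", "G")]

# Against the consensus, the same sample shows no difference at all,
# and the reference-like sample is the one flagged instead.
vs_consensus = differences(samples[0], majority)        # []
rare_vs_consensus = differences(samples[3], majority)   # [(2, "G", "A")]
```

If calls relative to a specific reference genome are needed, one option under this assumption is to compare each sample with that reference explicitly, as `differences(sample, reference)` does here.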