I was attempting to map from NCBI to GTDB taxonomy. When building the translation, multitax was unable to download the GTDB metadata:
from multitax import GtdbTx, NcbiTx
ncbi = NcbiTx()
gtdb = GtdbTx()
ncbi.build_translation(gtdb)
Exception: One or more files could not be downloaded: https://data.gtdb.ecogenomic.org/releases/latest/ar53_metadata.tar.gz, https://data.gtdb.ecogenomic.org/releases/latest/bac120_metadata.tar.gz
For r214.1, the metadata is no longer a tarball; it appears to be a gzipped TSV: bac120_metadata.tsv.gz, ar53_metadata.tsv.gz. It looks like this would need different handling in build_translation, as well as in the step that extracts tar members.
I'd be happy to put together a pull request to fix, if you're interested.
Thanks for reporting. Indeed, they changed this a while ago. A PR would be great! You will have to update the URLs and the parsing procedure, and the download_files function should be generalized to handle gzip-only files. A few days ago I fixed this exact bug in another tool; you can use it as an example.
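For anyone picking this up, here is a rough sketch of what "generalizing for gzip-only files" could look like. It is not multitax's actual code: `read_metadata_rows` and `ncbi_to_gtdb` are hypothetical helper names, and the column names `ncbi_taxid` and `gtdb_taxonomy` are assumed from the GTDB metadata header and should be checked against the release in use. The idea is to try the old tar.gz layout first and fall back to a plain gzipped TSV.

```python
import csv
import gzip
import io
import tarfile


def read_metadata_rows(raw: bytes):
    """Return an iterator of row dicts from a downloaded metadata payload.

    Hypothetical helper: accepts either a .tar.gz holding one TSV member
    (older layout) or a plain gzipped TSV (newer layout).
    """
    try:
        # Older layout: gzip-compressed tar archive; use its first regular file.
        with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
            member = next((m for m in tar if m.isfile()), None)
            if member is None:
                raise ValueError("archive contains no regular file")
            text = tar.extractfile(member).read().decode("utf-8")
    except tarfile.ReadError:
        # Newer layout: the payload is a gzipped TSV, not a tar archive.
        text = gzip.decompress(raw).decode("utf-8")
    return csv.DictReader(io.StringIO(text), delimiter="\t")


def ncbi_to_gtdb(rows):
    """Map each NCBI taxid to the set of GTDB leaf names it is assigned to.

    Assumes columns named 'ncbi_taxid' and 'gtdb_taxonomy' (an assumption,
    not confirmed in this thread).
    """
    mapping = {}
    for row in rows:
        leaf = row["gtdb_taxonomy"].split(";")[-1]
        mapping.setdefault(row["ncbi_taxid"], set()).add(leaf)
    return mapping
```

This reads the whole file into memory, which keeps the sketch short; for the full bacterial metadata a streaming reader (for example `gzip.open(path, "rt")` passed straight to `csv.DictReader`) would probably be the better choice.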