SRA tools docker
skripche edited this page Dec 15, 2020 · 6 revisions
The NCBI SRA Toolkit is now available as a maintained Docker image, ncbi/sra-tools. For example:
% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools fasterq-dump -e 2 -p SRR10985476
Unable to find image 'ncbi/sra-tools:latest' locally
latest: Pulling from ncbi/sra-tools
c9b1b535fdd9: Already exists
0a6856f8fd06: Pull complete
2d9bc7db21a2: Pull complete
3de524257044: Pull complete
Digest: sha256:631578b15625cc5390928772f1bf945847ce2981a81a95042729a47579396099
Status: Downloaded newer image for ncbi/sra-tools:latest
lookup :|-------------------------------------------------- 100%
merge : 16319508
join :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read : 14,965,183
reads read : 14,965,183
reads written : 14,965,183
Please note the following recommended options, which are used in the examples:
- creating a host volume to write to:
-v $PWD:/output:rw
- setting the container working directory to the host volume:
-w /output
Most tools write to the current working directory unless told otherwise, and you probably do not want the tools to write into the container's file system. So, please set the working directory to a host volume.
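The two options above can be wrapped in a small shell function so they do not have to be retyped for every tool. This is only a sketch: `sra_run` and the `SRA_DRY_RUN` variable are hypothetical names chosen for illustration and are not part of the SRA Toolkit or the Docker image.

```shell
# Hypothetical helper (not part of sra-tools): run any tool from the
# ncbi/sra-tools image with the current host directory mounted at /output
# and used as the container's working directory.
# Set SRA_DRY_RUN=1 to print the docker command instead of running it
# (the printed form is for inspection only; quoting is not preserved).
sra_run() {
    # Prepend the docker invocation to the tool name and its arguments.
    set -- docker run -t --rm -v "$PWD:/output:rw" -w /output ncbi/sra-tools "$@"
    if [ "${SRA_DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

With this function defined, `sra_run fasterq-dump -e 2 -p SRR10985476` should be equivalent to the full command shown at the top of this page.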
% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools prefetch SRR10985476
2020-06-23T18:07:35 prefetch.2.10.8: 1) Downloading 'SRR10985476'...
2020-06-23T18:07:35 prefetch.2.10.8: Downloading via HTTPS...
2020-06-23T18:07:45 prefetch.2.10.8: HTTPS download succeed
2020-06-23T18:07:45 prefetch.2.10.8: 'SRR10985476' is valid
2020-06-23T18:07:45 prefetch.2.10.8: 1) 'SRR10985476' was downloaded successfully
2020-06-23T18:08:27 prefetch.2.10.8: 'SRR10985476' has 454 unresolved dependencies
2020-06-23T18:08:27 prefetch.2.10.8: 2) Downloading 'ncbi-acc:NC_000001.11?vdb-ctx=refseq'...
2020-06-23T18:08:27 prefetch.2.10.8: Downloading via HTTPS...
2020-06-23T18:08:33 prefetch.2.10.8: HTTPS download succeed
2020-06-23T18:08:33 prefetch.2.10.8: 2) 'ncbi-acc:NC_000001.11?vdb-ctx=refseq' was downloaded successfully
...
2020-06-23T18:10:25 prefetch.2.10.8: 455) Downloading 'ncbi-acc:NW_004504305.1?vdb-ctx=refseq'...
2020-06-23T18:10:25 prefetch.2.10.8: Downloading via HTTPS...
2020-06-23T18:10:25 prefetch.2.10.8: HTTPS download succeed
2020-06-23T18:10:25 prefetch.2.10.8: 455) 'ncbi-acc:NW_004504305.1?vdb-ctx=refseq' was downloaded successfully
% docker run -t --rm -v $PWD:/output:rw -w /output ncbi/sra-tools fasterq-dump -p SRR10985476
lookup :|-------------------------------------------------- 100%
merge : 17976103
join :|-------------------------------------------------- 100%
concat :|-------------------------------------------------- 100%
spots read : 14,965,183
reads read : 14,965,183
reads written : 14,965,183
Please note that both commands use the same host volume as the working directory. This allows fasterq-dump to find the files that prefetch retrieved.
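The two-step workflow above could be scripted roughly as follows. This is a sketch under assumptions: `fetch_and_dump` is a hypothetical function name, and the `DOCKER` variable is a made-up override that lets the commands be previewed (for example with `DOCKER=echo`) without starting a container.

```shell
# Hypothetical helper: download an accession with prefetch, then convert it
# with fasterq-dump, mounting the SAME host directory in both containers.
# Usage: fetch_and_dump ACCESSION [HOST_DIR]   (HOST_DIR defaults to $PWD)
fetch_and_dump() {
    accession=$1
    workdir=${2:-$PWD}
    docker_bin=${DOCKER:-docker}   # assumption: override only for dry runs

    # Step 1: prefetch runs with /output (the host directory) as its
    # working directory. Stop here if the download fails.
    "$docker_bin" run -t --rm -v "$workdir:/output:rw" -w /output \
        ncbi/sra-tools prefetch "$accession" || return 1

    # Step 2: fasterq-dump runs in the same directory, so it can find
    # the files that prefetch retrieved.
    "$docker_bin" run -t --rm -v "$workdir:/output:rw" -w /output \
        ncbi/sra-tools fasterq-dump -p "$accession"
}
```

For example, `fetch_and_dump SRR10985476` would run both steps against the current directory, while `DOCKER=echo fetch_and_dump SRR10985476` only prints the arguments that would be passed to docker.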
We have seen TLS errors when running on AWS, like these:
2020-06-19T15:50:53 prefetch.2.10.7: Downloading via HTTPS...
2020-06-19T15:50:53 prefetch.2.10.7 sys: mbedtls_ssl_get_verify_result returned 0x8 ( !! The certificate is not correctly signed by the trusted CA )
2020-06-19T15:50:53 prefetch.2.10.7 int: connection failed while opening file within cryptographic module - Cannot KClientHttpRequestGET: /scratch/SRR5709848/SRR5709848.sra
2020-06-19T15:50:53 prefetch.2.10.7: HTTPS download failed
The solution is to make the host's certificates visible inside the container:
docker run -v /etc/pki:/etc/pki:ro -v /etc/ssl:/etc/ssl:ro ...
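Not every host has both of these directories, and `docker run -v` creates a missing host path as an empty directory. One possible way to mount only the certificate directories that actually exist is sketched below. `cert_mount_args` is a hypothetical helper name, and the approach assumes the directory paths contain no spaces.

```shell
# Hypothetical helper: print read-only -v options for each certificate
# directory that exists on the host, one word per line.
# With no arguments it checks the two directories named above.
cert_mount_args() {
    [ "$#" -gt 0 ] || set -- /etc/pki /etc/ssl
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "-v" "$dir:$dir:ro"
        fi
    done
}
```

The output is meant to be expanded unquoted so that the shell splits it into separate arguments, for example: `docker run -t --rm $(cert_mount_args) -v "$PWD:/output:rw" -w /output ncbi/sra-tools prefetch SRR10985476`.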