Using Storage

Storage is an interface that we use to abstract away various filesystems and cloud providers. You give it a layer path that includes the provider (for example, `gs://neuroglancer/removeme/read_write`), and then you can download or upload files relative to that path.

Storage provides (Python) multithreading to accelerate uploads and downloads over HTTP/1 connections. You can set the number of threads to use. 0 threads means run everything on the main program thread. If you use too many threads (somewhere between 64 and 128 on my machine), it will crash.
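To illustrate why threads help here even though Python threads share a single core: downloads are I/O-bound, so each thread spends most of its time waiting on the network. The sketch below is not Storage's implementation. It uses the standard library's `concurrent.futures.ThreadPoolExecutor`, and `fake_download` is a hypothetical stand-in that sleeps to simulate network latency. The `get_files` function here is likewise only an illustration that borrows the name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(path, latency=0.05):
    """Hypothetical stand-in for a network fetch: wait, then return some bytes."""
    time.sleep(latency)  # sleeping releases the GIL, much like waiting on a socket
    return path, b"some_string"


def get_files(paths, n_threads=0):
    """Fetch every path. n_threads=0 runs everything on the calling thread."""
    if n_threads == 0:
        return [fake_download(p) for p in paths]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # pool.map returns results in the same order as the input paths
        return list(pool.map(fake_download, paths))


def timed(paths, n_threads):
    """Return (elapsed seconds, results) for one run."""
    start = time.perf_counter()
    results = get_files(paths, n_threads)
    return time.perf_counter() - start, results


paths = ["info"] * 20
serial_time, serial_results = timed(paths, 0)
threaded_time, threaded_results = timed(paths, 8)
print("0 threads: %.3f s, 8 threads: %.3f s" % (serial_time, threaded_time))
```

With simulated latency the threaded run should finish well ahead of the serial one, which mirrors the trend in the cloud measurements on this page. Real results depend on the network and the provider.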

`get_files` Download Performance

We tested `get_files` on a 2014 dual-core 2.4 GHz MacBook Pro (NB: Python threads only use a single core) over a decent wireless connection. Each run downloads the same small file 50 times.

The version tested was commit 26b3606240ca66d7dbe6def33aab4dba7bb316be

Service	Threads	Time (sec)
file	0	0.0036
file	2	0.0039
file	4	0.0037
file	8	0.0053
file	16	0.0045
file	32	0.0058
file	64	0.0070
gs	0	27.8455
gs	1	10.5758
gs	2	4.9513
gs	4	2.5868
gs	8	1.4941
gs	16	0.9418
gs	32	0.7500
gs	64	0.6997
S3	0	10.0914
S3	1	1.6661
S3	2	0.9482
S3	4	0.6604
S3	8	0.5300
S3	16	0.2337
S3	32	0.2419
S3	64	0.4772
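As a rough way to read the table, the sketch below computes, for each service, the thread count with the lowest time and the speedup relative to 0 threads. The timings are copied from the table above; the `summarize` helper is a hypothetical name introduced for illustration only.

```python
# Timings in seconds, copied from the table: service -> {threads: time}
timings = {
    "file": {0: 0.0036, 2: 0.0039, 4: 0.0037, 8: 0.0053,
             16: 0.0045, 32: 0.0058, 64: 0.0070},
    "gs": {0: 27.8455, 1: 10.5758, 2: 4.9513, 4: 2.5868,
           8: 1.4941, 16: 0.9418, 32: 0.7500, 64: 0.6997},
    "S3": {0: 10.0914, 1: 1.6661, 2: 0.9482, 4: 0.6604,
           8: 0.5300, 16: 0.2337, 32: 0.2419, 64: 0.4772},
}


def summarize(by_threads):
    """Return (fastest thread count, speedup of that run over 0 threads)."""
    baseline = by_threads[0]
    best_threads = min(by_threads, key=by_threads.get)
    return best_threads, baseline / by_threads[best_threads]


for service, by_threads in timings.items():
    best_threads, speedup = summarize(by_threads)
    print("%-4s fastest at %2d threads, %.1fx faster than 0 threads"
          % (service, best_threads, speedup))
```

On these numbers, local files gain nothing from threading, gs is still improving at 64 threads, and S3 is fastest at 16 threads and slows down again by 64.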

The code used to generate these timings is listed below. The command to run the test is:

py.test -s -v python/test/test_storage.py

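import time

# Storage must also be imported from this project's package; the import is not shown here.

def test_performance():

    def run(url, num_threads):
        # Upload one small file so there is something to download.
        s = Storage(url, n_threads=num_threads)
        content = 'some_string'
        s.put_file('info', content, compress=False)
        s.wait_until_queue_empty()

        # Time 50 downloads of that same file.
        start = time.time()
        s.get_files([ 'info' for _ in range(50) ])
        end = time.time()

        # Shut down the worker threads before the next run.
        s._kill_threads()

        return end - start

    urls = [
        "file:///tmp/removeme/read_write",
        "gs://neuroglancer/removeme/read_write",
        "s3://neuroglancer/removeme/read_write"
    ]

    for url in urls:
        # 0 (main thread only), then 1, 2, 4, ..., 64 threads
        n_threads = [ 0 ] + [ 2 ** i for i in range(0, 7) ]
        for num in n_threads:
            delta = run(url, num)
            print("%s %s %s" % (url, num, delta))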
