Storing, retrieving and using files in S3 is a regular activity, so it should be easy. It should also:
- stream the data
- have an api that is python file-io like
- handle some of the deserialization and compression stuff, because why not
```bash
pip install s3-streaming
```
Opening and reading S3 objects is similar to regular Python IO. The only difference is that you need to provide a `boto3.session.Session` instance to handle the bucket access.
```python
import boto3
from s3streaming import s3_open

with s3_open('s3://bucket/key', boto_session=boto3.session.Session()) as f:
    for next_line in f:
        print(next_line)
```
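Any configured session works, so credentials, profiles and regions are handled the usual boto3 way. A minimal sketch, assuming a profile named `analytics` exists in your AWS config (the profile and region names here are placeholders):

```python
import boto3
from s3streaming import s3_open

# 'analytics' and 'eu-west-1' are placeholders; use whatever profile/region
# actually has access to the bucket.
session = boto3.session.Session(profile_name='analytics', region_name='eu-west-1')

with s3_open('s3://bucket/key', boto_session=session) as f:
    for next_line in f:
        print(next_line)
```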
Consider a file that is `gzip`-compressed and contains lines of `json`. There's some boilerplate in dealing with that, but why bother? Just handle it in the stream.
```python
import boto3
from s3streaming import s3_open, deserialize, compression

reader_settings = dict(
    boto_session=boto3.session.Session(),
    deserializer=deserialize.json_lines,
    compression=compression.gzip,
)

with s3_open('s3://bucket/key.gzip', **reader_settings) as f:
    for next_line in f:
        print(next_line.keys())    # because the file was decompressed ...
        print(next_line.values())  # ... and the json is now a loaded dict!
```
Other `deserialize` options include:
- `csv`
- `csv_as_dict`
- `tsv`
- `tsv_as_dict`
- `string`
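These plug into `s3_open` the same way as `json_lines`. A minimal sketch for a gzip-compressed TSV, assuming the `*_as_dict` readers take their keys from the file's header row (the path and column names below are placeholders):

```python
import boto3
from s3streaming import s3_open, deserialize, compression

# Placeholder path and columns; assumes the *_as_dict deserializers behave
# like csv.DictReader and use the header row for keys.
reader_settings = dict(
    boto_session=boto3.session.Session(),
    deserializer=deserialize.tsv_as_dict,
    compression=compression.gzip,
)

with s3_open('s3://bucket/table.tsv.gz', **reader_settings) as f:
    for row in f:
        print(row['user_id'], row['event'])  # each line arrives as a dict
```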