Provide HTTP server to retain and organize files.
Project originally created to provide file storage to ephemeral processes (ie containers).
Document provides a quickstart for application deployment and usage. Project documentation and commentary is retained within additional READMEs. To name a few:
Container
# docker config
image=http-bucket
container=bucket
exposed_port=80
local_storage=/tmp/bucket # host path
# application config
export ARCHIVE_DIR=/tmp/bucket # container path; consumed by bucket.py
export FLASK_ENV=development # load code changes without restarting server
# build image
docker build --rm -t "${image}" .
# start in foreground
# create volume for upload/archive directory
# create volume for local python source code
docker run --rm \
-e FLASK_ENV -e ARCHIVE_DIR \
-p ${exposed_port}:80 \
--volume ${local_storage}:${ARCHIVE_DIR} \
--volume $(pwd)/assets/src/:/app \
--name ${container} ${image}
Host
# (optional) setup virtual environment
virtualenv -p python3 ~/venv/bucket
. ~/venv/bucket/bin/activate
# satisfy dependencies
pip install --user -r ./assets/requirements.txt
# application config
export ARCHIVE_DIR=/tmp/bucket # disk storage; consumed by bucket.py
export FLASK_ENV=development # load codechanges without restarting server
# start flask application
python3 ./assets/src/api.py
Examples provided here present simplified upload instructions. Application can be configured to organize file system content by specifying tags during service deployment and providing matching headers during file upload. Additional instructions are captured within Config section.
/upload
supports:
GET
: HTML form which prompts user for uploadPOST
: Send file and filename viamultipart/form-data
Additional data can be conveyed to POST
operations via headers. Attributes dictate how content is retained (and replicated) server side. This config is covered within Replication section.
server=example.org
# POST using multipart/form-data
# upload individual file
curl -X POST -F '[email protected]' ${server}/upload
# upload directory structure
curl -X POST -F '[email protected]' ${server}/upload
curl -X POST -F '[email protected]' ${server}/upload
server=example.org
# retreive system config
curl ${server}/config
# check if content previously uploaded
artifact=results.xml
curl --head ${server}/checksum/$(md5sum ${artifact} | awk '{print $1}')
Upload content can be organized and retained in multiple locations via replication. This functionality requires:
- (server)
RECPLICATE_0
[1][2] environment variable set at time of deployment. Value is a relative directory path and must include${HEADER-NAME}
notation. - (client) Matching request headers [3] sent when calling
/upload
The combination of server config and client headers allows producers to organize content in directory names of their choosing. This mechansism allows isolated processes to retain content in a common location. Behind the scenes, content is linked against blob and archive storage to limit disk usage.
Notes:
- Multiple environment variables are supported:
REPLICATE_0
throughREPLICATE_9
REPLICATE_0
example:user/${USERNAME}/${TOPIC}
- Header example:
curl -H "Username: bob" -H "Topic: demo" -X POST ...
will translate to a replica atuser/bob/demo/<content>
Environment variables can control application behaviors
Variable | Default | Description | Notes |
---|---|---|---|
ARCHIVE_DIR |
/tmp/bucket/archive | Local storage path | When deploying via docker, path should be a mounted volume |
ARCHIVE_URI |
None | External location artifacts can be retrieved (ie web server, NFS path) | Path is leveraged within /upload API responses |
CHECKSUM_TYPE |
md5 | Hashing algorithm used to calculate file checksum | Supported dictated by hashlib |
MAX_CONTENT_LENGTH |
32mb | Max file size supported by /upload endpoint |
<int><unit> and <bytes> formatted supported |
REPLICATE_0 |
None | See (Replication)[#Replication] | Multiple environment variables supported, 0 through 10 |
Flask
Mime Types
Headers