bridge-ds
bridge-ds is a lightweight Python framework that provides a unified interface to deep learning datasets across different modalities. Perform global operations, aggregations, and queries with a Pandas-like experience, and handle individual samples and raw data through a class-based interface that works well with tab completion.
Browse
Browse through your datasets with ease using an intuitive interface.
Work with tables
View your data as tables.
Plot your data
Visualize your data quickly and effectively with the exposed Pandas Plotting API.
Assign, sort and filter
Perform common data operations like assigning new columns, sorting, and filtering with Pandas-like syntax.
Augment
Apply and visualize data augmentations directly within your workflow.
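To give a feel for the "Pandas-like" operations listed above, here is a minimal sketch written in plain pandas. It does not use the bridge-ds API; the table, its column names, and the area threshold are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical per-sample metadata table; plain pandas, not the bridge-ds API.
samples = pd.DataFrame(
    {
        "sample_id": [0, 1, 2, 3],
        "label": ["cat", "dog", "cat", "bird"],
        "width": [640, 320, 1024, 800],
        "height": [480, 240, 768, 600],
    }
)

# Assign: derive a new column from existing ones.
samples = samples.assign(area=lambda df: df["width"] * df["height"])

# Filter: keep only samples whose area is at least 300,000 pixels.
large = samples[samples["area"] >= 300_000]

# Sort: order the remaining samples from largest to smallest area.
large_sorted = large.sort_values("area", ascending=False)

# Aggregate: count samples per label over the full table.
label_counts = samples["label"].value_counts()

print(large_sorted[["sample_id", "label", "area"]])
print(label_counts)
```

See the official documentation for the equivalent calls on bridge-ds objects.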
You can install the latest version of Bridge from PyPI. It comes in a few flavors:
Core: The core package includes the basic functionality of Bridge.
$ pip install bridge-ds
Vision: The vision package includes the core package and additional (opinionated) functionality for working with image datasets.
$ pip install "bridge-ds[vision]"
- NOTE: to run the demo notebooks locally, you'll need the vision package.
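After installing, one way to confirm which version ended up in your environment is to query the package metadata from Python. This is a small sketch using only the standard library; the helper name `installed_version` is hypothetical, and the only thing taken from this README is the distribution name `bridge-ds`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "bridge-ds" is the PyPI distribution name used in the pip commands above.
    version = installed_version("bridge-ds")
    if version is None:
        print("bridge-ds is not installed in this environment")
    else:
        print(f"bridge-ds {version} is installed")
```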
To learn more about bridge-ds, please visit the official documentation.
$ git clone https://github.com/guybuk/bridge-ds.git
$ cd bridge-ds
$ pip install -e ".[dev]"
# Testing
$ pytest tests/core
# Building the docs
$ sudo apt install pandoc
$ cd docs
$ make html
bridge-ds is under active development and is currently in a pre-alpha stage.
The following is a rough roadmap of the planned features:
- Video Support
  - DataIO for video
  - DisplayEngine (video player)
  - DatasetProviders (for popular video datasets)
  - Transforms (clipping, sampling, augmentation)
- Text
  - DatasetProviders
  - DisplayEngine (adapt existing engine to work with classic text tasks: translation, Q&A, etc.)
- Core
  - DualDatasets (for tasks with two main elements, e.g. image-image, image-text, text-text)
  - Stress testing (currently have no capacity to test huge datasets)