doc touches (#14)
* moved pip installs from notebooks to requirements.txt

* added pip installs to the workflow

* quickfix the workflow
guybuk authored Sep 3, 2024
1 parent 14ff81e commit 08698cc
Showing 13 changed files with 249 additions and 353 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/python-package.yml
Expand Up @@ -16,8 +16,7 @@ jobs:
fail-fast: false
matrix:
python-version: ["3.10"]
dependencies: [ { "extras": "[dev]", "test_dir": "core" },{ "extras": "[dev,vision]", "test_dir": "vision" } ]
# dependencies: [ { "extras": "[dev,vision]", "test_dir": "vision" } ]
dependencies: [ { "extras": "[dev]", "test_dir": "core", "additional_packages": ""},{ "extras": "[dev,vision]", "test_dir": "vision" , "additional_packages": "pip install pycocotools torch"} ]

steps:
- name: Free Disk Space (Ubuntu)
Expand All @@ -44,6 +43,7 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install ".${{matrix.dependencies.extras}}"
${{matrix.dependencies.additional_packages}}
# - name: pre-commit
# run: |
# pre-commit install
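
Reviewer note on the install step: the new `additional_packages` matrix key holds a complete shell command (or an empty string) that is pasted as-is into the `run` script. The sketch below is only an illustration of how each matrix entry would expand into install commands; `MATRIX` and `install_commands` are hypothetical names, not part of the repository or of GitHub Actions.

```python
# Illustrative only: mirrors the two matrix entries from the workflow diff above.
# MATRIX and install_commands are hypothetical names, not part of bridge-ds.
MATRIX = [
    {"extras": "[dev]", "test_dir": "core", "additional_packages": ""},
    {
        "extras": "[dev,vision]",
        "test_dir": "vision",
        "additional_packages": "pip install pycocotools torch",
    },
]


def install_commands(entry):
    """Return the shell commands the install step would run for one matrix entry."""
    commands = [
        "python -m pip install --upgrade pip",
        f'pip install ".{entry["extras"]}"',
    ]
    # An empty additional_packages value expands to a blank line in the script,
    # so it contributes no command.
    extra = entry["additional_packages"].strip()
    if extra:
        commands.append(extra)
    return commands


if __name__ == "__main__":
    for entry in MATRIX:
        print(entry["test_dir"], "->", install_commands(entry))
```

Because the value is a whole command rather than a package list, the `core` entry can stay empty without producing a bare `pip install` with no arguments.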
82 changes: 50 additions & 32 deletions README.md
Expand Up @@ -34,68 +34,86 @@ class-based, tab-completion-ey interface.

**Browse**

Browse through your datasets with ease using an intuitive interface.

![Browse Datasets](docs/gifs/browse.gif)

**Work with tables**

View your data as tables.

![Table Interface](docs/gifs/tables.gif)

**Plot your data**

Visualize your data quickly and effectively with the exposed Pandas Plotting API.

![Plotting](docs/gifs/plot.gif)

**Assign, sort and filter**

Perform common data operations like assigning new columns, sorting, and filtering with Pandas-like syntax.

![Table Operations](docs/gifs/do_stuff.gif)

**Augment**

![Transforms](docs/gifs/transform.gif)
Apply and visualize data augmentations directly within your workflow.

* **Explore data in your notebook:**

* Browse your data directly in your notebook, without
intermediary web-apps.
* **Dataset as a table:**

* Give your deep learning dataset a DataFrame
interface; making
cumbersome operations such as selections, sorting and
aggregations - easy.
* **Agnostic to Deep Learning Engines:**
* Convert into a training-ready dataset
in your DL framework of choice.
* **Transform and Debug:**
* Maintain full visibility into your
preprocessing/augmentation pipeline. See exactly which
inputs enter your model.
* **Work with Arbitrary Sources**:
* Work with remote and local data together, seamlessly.
* **Keep your data to yourself**:
* No need to upload your data to third parties.
![Transforms](docs/gifs/transform.gif)

# Installation

You can install the latest version of Bridge's from PyPI. It comes in a few flavors:

*Core*: The core package includes the basic functionality of Bridge.

```bash
```console
$ pip install bridge-ds
```
*Vision*: The vision package includes the core package and additional functionality for working with image datasets.

*Vision*: The vision package includes the core package and additional (opinionated) functionality for working with image datasets.

```bash
```console
$ pip install bridge-ds[vision]
```
*Dev*: The dev package includes the core package and additional tools for development.

```bash
$ pip install bridge-ds[dev]
```
* _NOTE_: to run the demo notebooks locally, you'll need the `vision` package.

* _NOTE_: to run the notebooks you'll need both the `vision` and `dev` packages.
# Documentation

To learn more about bridge-ds, please visit the [official documentation](https://bridge-ds.readthedocs.io/).
To learn more about bridge-ds, please visit the [official documentation](https://bridge-ds.readthedocs.io/).

# Development

## Setup
```console
$ git clone https://github.com/guybuk/bridge-ds.git
$ cd bridge-ds
$ pip install -e ".[dev]"

# Testing
$ pytest tests/core

# Building the docs
$ sudo apt install pandoc
$ cd docs
$ make html
```

## Roadmap

bridge-ds is under active development, currently in a pre-alpha stage.

The following is a rough roadmap of the planned features:

- Video Support
- [ ] DataIO for video
- [ ] DisplayEngine (video player)
- [ ] DatasetProviders (for popular video datasets)
- [ ] Transforms (clipping, sampling, augmentation)
- Text
- [ ] DatasetProviders
- [ ] DisplayEngine (adapt existing engine to work with classic text tasks: translation, Q&A, etc.)
- Core
- [ ] DualDatasets (for tasks with two main elements e.g. image-image, image-text,text-text)
- [ ] Stress testing (currently have no capacity to test huge datasets)
4 changes: 3 additions & 1 deletion docs/requirements.txt
@@ -1 +1,3 @@
bridge-ds[dev,vision]
bridge-ds[dev,vision]
pycocotools
torch
8 changes: 1 addition & 7 deletions docs/source/getting_started.rst
Expand Up @@ -11,18 +11,12 @@ You can install the latest version of Bridge's from PyPI. It comes in a few flav
$ pip install bridge-ds
*Vision*: The vision package includes the core package and additional functionality for working with image datasets.
*Vision*: The vision package includes the core package and additional (opinionated) functionality for working with image datasets.

.. code-block:: console
$ pip install bridge-ds[vision]
*Dev*: The dev package includes the core package and additional tools for development.

.. code-block:: console
$ pip install bridge-ds[dev]
Key Concepts
------------
56 changes: 22 additions & 34 deletions docs/source/notebooks/vision/custom_data/dataset_provider.ipynb
Expand Up @@ -5,24 +5,12 @@
"id": "0",
"metadata": {},
"source": [
"# Preliminaries\n",
"## Installation\n",
"To be able to run this tutorial, please install the following libraries:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1",
"metadata": {},
"outputs": [],
"source": [
"!pip install bridge-ds"
"# Preliminaries"
]
},
{
"cell_type": "markdown",
"id": "2",
"id": "1",
"metadata": {},
"source": [
"## Downloading the demo dataset\n",
Expand All @@ -32,7 +20,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "3",
"id": "2",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -45,7 +33,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "4",
"id": "3",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -58,7 +46,7 @@
},
{
"cell_type": "markdown",
"id": "5",
"id": "4",
"metadata": {},
"source": [
"### File Tree\n",
Expand Down Expand Up @@ -91,7 +79,7 @@
},
{
"cell_type": "markdown",
"id": "6",
"id": "5",
"metadata": {},
"source": [
"# DatasetProvider\n",
Expand Down Expand Up @@ -127,7 +115,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "7",
"id": "6",
"metadata": {},
"outputs": [],
"source": [
Expand Down Expand Up @@ -160,7 +148,7 @@
},
{
"cell_type": "markdown",
"id": "8",
"id": "7",
"metadata": {},
"source": [
"Now we can instantiate this provider and verify that it points to the right directory:"
Expand All @@ -169,7 +157,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9",
"id": "8",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -180,7 +168,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "10",
"id": "9",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -189,7 +177,7 @@
},
{
"cell_type": "markdown",
"id": "11",
"id": "10",
"metadata": {},
"source": [
"The next step will be to implement `build_dataset()`, which will load the relevant metadata from this directory into a Bridge Dataset.\n",
Expand All @@ -200,7 +188,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "12",
"id": "11",
"metadata": {},
"outputs": [],
"source": [
Expand Down Expand Up @@ -262,7 +250,7 @@
{
"attachments": {},
"cell_type": "markdown",
"id": "13",
"id": "12",
"metadata": {},
"source": [
"There's quite a bit of code here, so let's break it down a little:\n",
Expand Down Expand Up @@ -316,7 +304,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "14",
"id": "13",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -331,7 +319,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "15",
"id": "14",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -341,7 +329,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "16",
"id": "15",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -351,7 +339,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "17",
"id": "16",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -361,7 +349,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "18",
"id": "17",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -372,7 +360,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "19",
"id": "18",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -381,7 +369,7 @@
},
{
"cell_type": "markdown",
"id": "20",
"id": "19",
"metadata": {},
"source": [
"We have an operational Bridge Dataset, which we can manipulate as we see fit. \n",
Expand All @@ -394,7 +382,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "21",
"id": "20",
"metadata": {},
"outputs": [],
"source": [
Expand All @@ -405,7 +393,7 @@
},
{
"cell_type": "markdown",
"id": "22",
"id": "21",
"metadata": {},
"source": [
"## In Summary\n",