This repository has been archived by the owner on Jan 2, 2025. It is now read-only.

📝 Update Dataset docs
falexwolf committed Nov 9, 2023
1 parent 5921955 commit 7397c7c
Showing 1 changed file with 8 additions and 3 deletions.
11 changes: 8 additions & 3 deletions lnschema_core/models.py
@@ -1866,8 +1866,10 @@ class Dataset(Registry, Data):
"""Datasets: collections of data batches.
Args:
- data: `DataLike` A data object (`DataFrame`, `AnnData`) to store.
+ data: `DataLike` An array (`DataFrame`, `AnnData`), a directory, or a list of `File` objects.
name: `str` A name.
+ meta: `Optional[DataLike]` An array (`DataFrame`, `AnnData`) or a `File`
+ object that defines metadata for a directory of objects.
description: `Optional[str] = None` A description.
version: `Optional[str] = None` A version string.
is_new_version_of: `Optional[Dataset] = None` An old version of the dataset.
@@ -1882,13 +1884,11 @@ class Dataset(Registry, Data):
The `File` & `Dataset` registries both
- track data batches of arbitrary format & size
- can validate & link features (the measured dimensions in a data batch)
Typically,
- a file stores a single batch of data
- a dataset stores a collection of data batches
Examples:
@@ -1931,6 +1931,11 @@ class Dataset(Registry, Data):
>>> dataset = ln.Dataset([file1, file2], name="My dataset")
>>> dataset.save()
+ Create a dataset from a directory of objects:
+ >>> dataset = ln.Dataset("s3://my-bucket/my-images/", name="My dataset", meta="s3://my-bucket/meta.parquet")
+ >>> dataset.save()
Make a new version of a dataset:
>>> # a non-versioned dataset
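
The updated docstring says `data` may now be an array (`DataFrame`, `AnnData`), a directory, or a list of `File` objects. The three cases can be told apart roughly as in the sketch below. This is only an illustration, not the code in `lnschema_core`: the `File` class is a hypothetical stand-in, `classify_data` is a made-up helper, and treating any string as a directory path and any object with a `shape` attribute as an array are assumptions.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class File:
    """Hypothetical stand-in for a `File` record; not the real registry class."""

    path: str


def classify_data(data: Any) -> str:
    """Return which of the three documented input kinds `data` looks like."""
    # Assumption: a directory is passed as a path string,
    # e.g. "s3://my-bucket/my-images/".
    if isinstance(data, str):
        return "directory"
    # A non-empty list whose items are all File objects.
    if isinstance(data, list) and data and all(isinstance(f, File) for f in data):
        return "files"
    # Assumption: array-like objects (DataFrame, AnnData) expose `shape`.
    if hasattr(data, "shape"):
        return "array"
    raise TypeError(f"Unsupported data type: {type(data).__name__}")
```

With this sketch, `classify_data("s3://my-bucket/my-images/")` yields `"directory"` and `classify_data([File("a"), File("b")])` yields `"files"`. Anything else without a `shape` attribute raises `TypeError`.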
