Commit

allow for private dataset creation (#403)
jean-lucas authored Nov 1, 2023
1 parent 3b0abe7 commit 317d2bf
Showing 4 changed files with 15 additions and 1 deletion.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -6,6 +6,14 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


## [0.16.6](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.16.6) - 2023-11-01

### Added
- Allow datasets to be created in "privacy mode". For example, `client.create_dataset('name', use_privacy_mode=True)`.
- Privacy Mode lets customers use Nucleus without sensitive raw data ever leaving their servers.
- When set to `True`, you can submit URLs to Nucleus that link to raw data assets like images or point clouds, instead of transferring that data to Scale. Access control is then completely in the hands of users: URLs may optionally be protected behind your corporate VPN or an IP whitelist. When you load a Nucleus web page, your browser will directly fetch the raw data from your servers without it ever being accessible to Scale.
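
As a rough illustration of the call described in this entry, the sketch below wraps it in a small helper. Only `create_dataset` and its `use_privacy_mode` keyword come from this commit; `create_private_dataset` and `RecordingClient` are hypothetical names introduced here so the snippet runs without an API key or network access.

```python
from typing import Any, List, Tuple


def create_private_dataset(client: Any, name: str) -> Any:
    """Create a dataset with privacy mode enabled.

    `client` is expected to expose `create_dataset(name, use_privacy_mode=...)`,
    as the Nucleus client does after this commit.
    """
    return client.create_dataset(name, use_privacy_mode=True)


class RecordingClient:
    """Hypothetical stand-in for the real client; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, bool]] = []

    def create_dataset(self, name: str, use_privacy_mode: bool = False) -> dict:
        self.calls.append((name, use_privacy_mode))
        return {"name": name, "use_privacy_mode": use_privacy_mode}


fake_client = RecordingClient()
dataset = create_private_dataset(fake_client, "name")
```

With the real library, the same helper would presumably be passed a `nucleus.NucleusClient(api_key=...)` instance instead of the stand-in.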


## [0.16.5](https://github.com/scaleapi/nucleus-python-client/releases/tag/v0.16.5) - 2023-10-30

### Added
5 changes: 5 additions & 0 deletions nucleus/__init__.py
@@ -79,6 +79,7 @@
AUTOTAGS_KEY,
DATASET_ID_KEY,
DATASET_IS_SCENE_KEY,
DATASET_PRIVACY_MODE_KEY,
DEFAULT_NETWORK_TIMEOUT_SEC,
EMBEDDING_DIMENSION_KEY,
EMBEDDINGS_URL_KEY,
@@ -429,6 +430,7 @@ def create_dataset(
self,
name: str,
is_scene: Optional[bool] = None,
use_privacy_mode: bool = False,
item_metadata_schema: Optional[Dict] = None,
annotation_metadata_schema: Optional[Dict] = None,
) -> Dataset:
@@ -443,6 +445,8 @@ def create_dataset(
is_scene: Whether the dataset contains strictly :class:`scenes
<LidarScene>` or :class:`items <DatasetItem>`. This value is immutable.
Default is False (dataset of items).
use_privacy_mode: Whether to create the dataset in privacy mode, in which raw data such as images is not
uploaded to Scale. If set to True, the customer will have to adjust their file access policy with Scale.
Default is False.
item_metadata_schema: Dict defining item-level metadata schema. See below.
annotation_metadata_schema: Dict defining annotation-level metadata schema.
@@ -473,6 +477,7 @@ def create_dataset(
{
NAME_KEY: name,
DATASET_IS_SCENE_KEY: is_scene,
DATASET_PRIVACY_MODE_KEY: use_privacy_mode,
ANNOTATION_METADATA_SCHEMA_KEY: annotation_metadata_schema,
ITEM_METADATA_SCHEMA_KEY: item_metadata_schema,
},
1 change: 1 addition & 0 deletions nucleus/constants.py
@@ -43,6 +43,7 @@
DATASET_LENGTH_KEY = "length"
DATASET_MODEL_RUNS_KEY = "model_run_ids"
DATASET_NAME_KEY = "name"
DATASET_PRIVACY_MODE_KEY = "use_privacy_mode"
DATASET_SLICES_KEY = "slice_ids"
DEFAULT_ANNOTATION_UPDATE_MODE = False
DEFAULT_NETWORK_TIMEOUT_SEC = 120
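
Taken together, the `create_dataset` and `constants.py` hunks suggest that the new flag is simply forwarded in the request payload under the `"use_privacy_mode"` key. A minimal sketch of that payload construction follows. The helper name `build_create_dataset_payload` is hypothetical, and every constant value other than `DATASET_PRIVACY_MODE_KEY` is an assumption, since the diff does not show those values.

```python
from typing import Any, Dict, Optional

# Value taken from nucleus/constants.py in this commit.
DATASET_PRIVACY_MODE_KEY = "use_privacy_mode"

# Assumed values: these constants appear in the diff, but their string
# values are not shown there.
NAME_KEY = "name"
DATASET_IS_SCENE_KEY = "is_scene"
ANNOTATION_METADATA_SCHEMA_KEY = "annotation_metadata_schema"
ITEM_METADATA_SCHEMA_KEY = "item_metadata_schema"


def build_create_dataset_payload(
    name: str,
    is_scene: Optional[bool] = None,
    use_privacy_mode: bool = False,
    item_metadata_schema: Optional[Dict] = None,
    annotation_metadata_schema: Optional[Dict] = None,
) -> Dict[str, Any]:
    """Mirror the dict built inside create_dataset (hypothetical helper)."""
    return {
        NAME_KEY: name,
        DATASET_IS_SCENE_KEY: is_scene,
        DATASET_PRIVACY_MODE_KEY: use_privacy_mode,
        ANNOTATION_METADATA_SCHEMA_KEY: annotation_metadata_schema,
        ITEM_METADATA_SCHEMA_KEY: item_metadata_schema,
    }


payload = build_create_dataset_payload("name", use_privacy_mode=True)
```

Because the parameter defaults to `False`, callers that do not pass it still send an explicit `"use_privacy_mode": False` in this sketch.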
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -25,7 +25,7 @@ ignore = ["E501", "E741", "E731", "F401"] # Easy ignore for getting it running

[tool.poetry]
name = "scale-nucleus"
-version = "0.16.5"
+version = "0.16.6"
description = "The official Python client library for Nucleus, the Data Platform for AI"
license = "MIT"
authors = ["Scale AI Nucleus Team <[email protected]>"]
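
Since the changelog ties privacy mode to release 0.16.6, a caller may want to check the client version before passing `use_privacy_mode`. The sketch below shows one way to do that. The names `parse_version` and `supports_privacy_mode` are hypothetical, and the parser assumes a plain dotted numeric version such as `0.16.6` with no pre-release suffix.

```python
from typing import Tuple

# First release that accepts use_privacy_mode, per the changelog entry.
PRIVACY_MODE_MIN_VERSION = (0, 16, 6)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version like '0.16.6' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supports_privacy_mode(version: str) -> bool:
    """Return True if the given client version is 0.16.6 or newer."""
    return parse_version(version) >= PRIVACY_MODE_MIN_VERSION
```

The installed version string could be obtained with `importlib.metadata.version("scale-nucleus")`, using the package name from `pyproject.toml`.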