This repository has been archived by the owner on Jan 18, 2024. It is now read-only.

Commit

Merge pull request #4 from ethho/dev-v0.1.0
PLAT-180: Release v0.1.0
ethho authored Jan 18, 2024
2 parents 24969e4 + 467ac0b commit a574e2b
Showing 79 changed files with 5,144 additions and 448 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/docs.yaml
@@ -0,0 +1,32 @@
name: Build GitHub Pages Documentation

on:
  push:
    branches:
      - main
    paths:
      - "docs/**.md"
      - "mkdocs.yml"
  workflow_dispatch:

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0
      - uses: actions/setup-python@v2
        with:
          python-version: 3.11
      - name: Install Python dependencies
        run: |
          pipx install poetry
          poetry config virtualenvs.create false
          poetry install --with docs --sync
      - name: Setup git
        run: |
          git config --global user.name 'github-actions[bot]'
          git config --global user.email 'github-actions[bot]@users.noreply.github.com'
      - name: Publish docs
        run: poetry run mkdocs gh-deploy
46 changes: 46 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,46 @@
name: Test

on:
  push:
    branches:
      - main
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - 'docsrc/**'

  pull_request:
    branches:
      - main
    paths-ignore:
      - '**.md'
      - 'docs/**'
      - 'docsrc/**'

jobs:
  test:
    name: Run unit tests
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version:
          - "3.7"
          - "3.8"
          - "3.9"
          - "3.10"
          - "3.11"
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install Python dependencies
        run: |
          pipx install poetry
          poetry install --with 'dev,docs' --sync
      - name: Run all pytests
        run: poetry run pytest --cov-report term-missing --cov=datajoint_file_validator tests
      - name: Check formatting with black
        run: |
          poetry run black --check datajoint_file_validator tests
35 changes: 0 additions & 35 deletions .github/workflows/tox.yaml

This file was deleted.

1 change: 1 addition & 0 deletions .gitignore
@@ -150,3 +150,4 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
!**/parts
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,6 @@
## Release Notes

<!-- ### Upcoming -->

### 0.1.1 -- March 1, 2024
<!-- - d -->
41 changes: 28 additions & 13 deletions README.md
@@ -1,13 +1,28 @@
# DataJoint File Validator

<p align="center">
<a href="https://github.com/ethho/datajoint-file-validator/actions/workflows/test.yaml" target="_blank">
<img src="https://github.com/ethho/datajoint-file-validator/actions/workflows/test.yaml/badge.svg" alt="Test">
</a>
<!-- <a href="https://github.com/ethho/datajoint-file-validator/actions?query=workflow%3APyPi" target="_blank">
<img src="https://github.com/ethho/datajoint-file-validator/workflows/PyPi/badge.svg" alt="Publish">
</a> -->
<!-- <a href="https://coverage-badge.samuelcolvin.workers.dev/redirect/ethho/datajoint-file-validator" target="_blank">
<img src="https://coverage-badge.samuelcolvin.workers.dev/ethho/datajoint-file-validator.svg" alt="Coverage">
</a> -->
<!-- <a href="https://pypi.org/project/datajoint-file-validator" target="_blank">
<img src="https://img.shields.io/pypi/v/datajoint-file-validator?color=%2334D058&label=pypi%20package" alt="Package version">
</a> -->
</p>

This repository contains a Python package that validates file sets for DataJoint pipelines.

## Installation

### Install Locally

```bash
-pip install datajoint_file_validator@git+https://github.com/ethho/datajoint-file-validator.git
+pip install git+https://github.com/ethho/datajoint-file-validator.git
```

### Dev Container
@@ -17,7 +32,7 @@ This repository includes a [devcontainer](https://code.visualstudio.com/docs/dev
1. Install the [Remote - Containers](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) extension in VS Code and open the repository in a container.
2. Open the devcontainer in GitHub Codespaces:

-![Launch devcontainer in GitHub Codespace instance](docs/img/codespace_launch.png)
+![Launch devcontainer in GitHub Codespace instance](docs/images/codespace_launch.png)

## Quick Start

@@ -27,18 +42,18 @@ Validate a fileset against an existing manifest:
from datajoint_file_validator import validate

my_dataset_path = 'tests/data/filesets/fileset0'
-manifest_path = 'datajoint_file_validator/manifests/demo_dlc_v0.1.yaml'
-success, report = validate(my_dataset_path, manifest_path, verbose=True, format='plain')
+manifest_path = 'datajoint_file_validator/manifests/demo_dlc/v0.1.yaml'
+success, report = validate(my_dataset_path, manifest_path, verbose=True, format='json')
# Validation failed with the following errors:
# [
-# {
-# 'rule': 'Min total files',
-# 'rule_description': 'Check that there are at least 6 files anywhere in the fileset',
-# 'constraint_id': 'count_min',
-# 'constraint_value': 6,
-# 'errors': 'constraint `count_min` failed: 4 < 6'
-# }
-# ]
+# {
+# "rule": "Min total files",
+# "rule_description": "Check that there are at least 6 files anywhere in the fileset",
+# "constraint_id": "count_min",
+# "constraint_value": 6,
+# "errors": "constraint `count_min` failed: 4 < 6"
+# }
+#]

print(success)
# False
@@ -47,7 +62,7 @@ print(success)
Alternatively, validate using the included command line interface:

```console
-$ datajoint-file-validator validate tests/data/filesets/fileset0 datajoint_file_validator/manifests/demo_dlc_v0.1.yaml
+$ datajoint-file-validator validate tests/data/filesets/fileset0 datajoint_file_validator/manifests/demo_dlc/v0.1.yaml
❌ Validation failed with 1 errors!
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ ┃ Rule ┃ ┃ Constraint ┃ ┃
2 changes: 2 additions & 0 deletions datajoint_file_validator/__init__.py
@@ -3,3 +3,5 @@
from .manifest import Manifest
from .result import ValidationResult
from .main import validate_snapshot, validate
from .log import logger
from .registry import find_manifest, list_manifests
148 changes: 148 additions & 0 deletions datajoint_file_validator/base_settings.py
@@ -0,0 +1,148 @@
import os
from typing import Optional, Dict, Any, Union, get_type_hints, get_args, get_origin
from dotenv import dotenv_values


class BaseSettings:
    """Settings class for an application. Mimics Pydantic's BaseSettings."""

    ENV_PATH = ".env"

    # Define settings attributes here
    # my_config_val: str = "default value"
    # my_optional_attr: Optional[str] = None

    # Setting MY_FLAG=1 in .env will set this to True
    # my_flag: bool = False

    # Setting env var MY_CASTED_ATTR='4' will try to cast this as int first, then str
    # my_casted_attr: Union[int, str] = 2

    @staticmethod
    def _cast_val(val: str, type_annot: Optional[Any]) -> Any:
        """
        Cast a string `val` to the type of `type_annot`.
        """
        if val is None:
            return None
        if type_annot is None:
            return val
        if type_annot is bool:
            if str(val).lower() in ["true", "1"]:
                return True
            elif str(val).lower() in ["false", "0"]:
                return False
            else:
                raise ValueError(f"Failed to parse '{val}' as bool.")

        # Handle generic types
        if get_origin(type_annot) is Union:  # includes Optional
            for constr in get_args(type_annot):
                try:
                    return constr(val)
                except (TypeError, ValueError):
                    continue
        elif get_origin(type_annot) is not None:
            raise TypeError(
                f"Cannot parse '{val}' as instance of '{type_annot.__name__}'."
            )
        else:
            constr = type_annot

        # Cast to type
        try:
            return constr(val)
        except (TypeError, ValueError) as e:
            raise TypeError(
                f"Failed to parse '{val}' as instance of '{type_annot.__name__}'."
            ) from e

    def _populate_from_dot_env(self, env_path: str):
        """
        Set attributes from a .env file at `env_path`.
        """
        d = dotenv_values(env_path)
        self._populate_from_dict(d, match_upper=True)

    def _populate_from_env_vars(self):
        """
        Set attributes from environment variables.
        """
        d = os.environ
        self._populate_from_dict(d, match_upper=True)

    def _populate_from_dict(self, d: Dict[str, Any], match_upper: bool = False):
        """
        Set attributes from a dictionary `d`.
        Skips setting attributes that are upper-cased, start with an underscore,
        are callable, have no type annotation, or are not class attributes.
        """
        attrs = {
            **get_type_hints(self),
            # Include attribute names that have no type annotation but a default value
            **self.__class__.__dict__,
        }
        for k in attrs:
            key_in_d = k.upper() if match_upper else k
            if (
                k.upper() == k
                or k.startswith("_")
                or callable(getattr(self, k, None))
                or key_in_d not in d
            ):
                continue
            val = d[key_in_d]

            type_annot = get_type_hints(self).get(k)
            try:
                setattr(self, k, self._cast_val(val, type_annot))
            except (TypeError, ValueError) as e:
                raise ValueError(
                    f"Error parsing {key_in_d}={val} as {type_annot}: {e}"
                ) from e

    def __init__(self, env_path: Optional[str] = None, **values):
        """
        Create a new settings object, which allows settings to be imported from
        environment variables, .env files, and keyword arguments, and accessed
        as attributes. Attributes will be set in the following order:
            1. From default values set in the class definition.
            2. From a .env file at `env_path` (default: `.env`)
            3. From environment variables. Environment variables are matched to
               attributes by upper-casing the attribute name. e.g. to set the
               attribute `my_attr`, set the environment variable `MY_ATTR`.
            4. From keyword arguments passed as `**values` to this constructor.
        Each successive step will overwrite any previously set attributes.
        Attribute names that are upper-cased in the class definition, start
        with an underscore, are callable, have no type annotation, or are not
        class attributes will be ignored.
        Parameters
        ----------
        env_path : str
            Path to a .env file. Will be set to `.env` by default.
        **values : Any
            Keyword arguments to set as attributes.
        """
        if not hasattr(self, "__annotations__"):
            self.__annotations__ = {}

        self._populate_from_dict(self.__class__.__dict__)  # From default values
        env_path = env_path or self.ENV_PATH
        if os.path.isfile(env_path):
            self._populate_from_dot_env(env_path)
        self._populate_from_env_vars()
        if values:
            self._populate_from_dict(values)

        # Check that all attributes have been set
        unset_attrs = []
        for k in get_type_hints(self):
            if k.upper() == k or k.startswith("_"):
                continue
            if not hasattr(self, k):
                unset_attrs.append(k)
        if unset_attrs:
            raise ValueError(f"Missing values for attributes: {unset_attrs}")
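
Reviewer note: the precedence described in the `BaseSettings.__init__` docstring (class defaults, then `.env`, then environment variables, then keyword arguments) can be illustrated with a small standalone sketch. This is not the code from this commit. `cast_value`, `DemoSettings`, and `resolve` are hypothetical names chosen for illustration, plain dictionaries stand in for the `.env` file and `os.environ`, and keyword overrides are applied without casting to keep the example short. It only approximates the type-directed casting in `_cast_val`.

```python
from typing import Any, Dict, Optional, Union, get_args, get_origin, get_type_hints


def cast_value(raw: str, annotation: Any) -> Any:
    """Cast the string `raw` according to `annotation` (hypothetical helper)."""
    if annotation is bool:
        lowered = raw.lower()
        if lowered in ("true", "1"):
            return True
        if lowered in ("false", "0"):
            return False
        raise ValueError(f"Failed to parse {raw!r} as bool.")
    # typing.get_origin and typing.get_args are available from Python 3.8 onward.
    if get_origin(annotation) is Union:  # also covers Optional[...]
        for candidate in get_args(annotation):
            try:
                return candidate(raw)  # first constructor that succeeds wins
            except (TypeError, ValueError):
                continue
        raise ValueError(f"Failed to parse {raw!r} as {annotation}.")
    return annotation(raw)


class DemoSettings:
    """Hypothetical settings class with three typed attributes."""

    n_workers: Union[int, str] = 2
    verbose: bool = False
    label: Optional[str] = None


def resolve(
    cls: type,
    dot_env: Dict[str, str],
    environ: Dict[str, str],
    overrides: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge sources in order: defaults < dot_env < environ < overrides."""
    hints = get_type_hints(cls)
    resolved = {name: getattr(cls, name) for name in hints}  # 1. class defaults
    for source in (dot_env, environ):  # 2. .env values, then 3. env vars
        for name, annotation in hints.items():
            key = name.upper()  # attribute `my_attr` is matched by `MY_ATTR`
            if key in source:
                resolved[name] = cast_value(source[key], annotation)
    resolved.update(overrides)  # 4. keyword arguments win (not cast here)
    return resolved


result = resolve(
    DemoSettings,
    dot_env={"N_WORKERS": "4", "VERBOSE": "1"},
    environ={"N_WORKERS": "8"},
    overrides={"label": "demo"},
)
print(result)  # {'n_workers': 8, 'verbose': True, 'label': 'demo'}
```

Here `N_WORKERS` appears in both the simulated `.env` and the simulated environment, and the environment value wins because it is applied later. `Union[int, str]` tries `int` first, so `"8"` becomes the integer `8`.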