Datopy is a package for simplifying the early stages of the data analysis workflow (getting data, modeling data, validating data, etc.). It is first and foremost a personal-use package; however, I prioritize extensibility and clear documentation, and I hope that other developers will find the package useful.
While I make no guarantees in the way of performance or functionality, datopy is now in more-or-less working order (parts of it, at least). Feel free to explore, sample, and extend.
This release includes some routines for data modeling (see :mod:`datopy.modeling`), ETL (Extract, Transform, Load; :mod:`datopy.etl`), and data inspection (:mod:`datopy.inspection`, :mod:`datopy.stylesheet`). Still to come: the :mod:`datopy.models` subpackage, which will include data models, validation, and processing routines for dealing with media metadata (:mod:`datopy.models.media`), animal data (:mod:`datopy.models.eco`), and global development indicators (:mod:`datopy.models.global`).
Here's a snapshot of what this release includes:
- Core data modeling/validation functionality and various workflow-related utilities
- Extensive type checking and doctesting
- Continual performance and coverage testing via tox and GitHub Actions
- Tested in Python 3.10 and Python 3.11 environments
- Improved type checking and examples #5
- Improved doctesting #6
- Improved environment management #7
- Better orchestration and unit testing suite #9
- Data validation schemes for retrieval and processing #14
- Generic Pydantic media model #28
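
To give a flavor of the type-hinted, doctested style mentioned in the list above, here is a minimal sketch using only the standard library. The function `normalize_title` is a hypothetical helper written for illustration; it is not part of datopy's API.

```python
import doctest


def normalize_title(title: str) -> str:
    """Collapse whitespace and title-case a media title.

    Hypothetical helper for illustration only; not part of datopy.

    >>> normalize_title("  the   matrix ")
    'The Matrix'
    >>> normalize_title("")
    Traceback (most recent call last):
        ...
    ValueError: title must not be empty
    """
    # Split on any whitespace and rejoin with single spaces.
    cleaned = " ".join(title.split())
    if not cleaned:
        raise ValueError("title must not be empty")
    return cleaned.title()


if __name__ == "__main__":
    # Run the examples embedded in the docstring above.
    doctest.testmod()
```

Running the file directly executes the docstring examples as tests, so the documentation and the checks cannot drift apart silently.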
Full Changelog: https://github.com/bainmatt/datopy/commits/v0.0.1