Project structure for Data Exploration.
- Python >= 3.5
- Cookiecutter Python package >= 1.4.0. Install it with pip or conda, depending on how you manage your Python packages:
$ pip install cookiecutter
or
$ conda config --add channels conda-forge
$ conda install cookiecutter
To start a new project, run:
$ cookiecutter https://github.com/drivendata/cookiecutter-data-science
The directory structure of your new project looks like this:
├── LICENSE
├── README.md <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
│
├── docs <- A default Sphinx project; see sphinx-doc.org for details
│
├── models <- Trained models, model predictions, or model summaries
│
├── notebooks <- Jupyter notebooks. Naming convention is a date (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `2020_06_01-initial-data-exploration`.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python package
│ │
│ ├── version.py <- Source code version
│
└── setup.cfg <- Setup configuration
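The notebook naming convention described in the tree can be generated rather than typed by hand. The following is a minimal sketch, not part of the template: `notebook_name` is a hypothetical helper, and the optional `initials` argument and its position between the date and the description are assumptions, since the example name in the tree does not show initials.

```python
from datetime import date
from typing import Optional


def notebook_name(day: date, description: str, initials: Optional[str] = None) -> str:
    """Build a notebook file stem such as 2020_06_01-initial-data-exploration."""
    # Date first, so notebooks sort chronologically in a directory listing.
    parts = [day.strftime("%Y_%m_%d")]
    if initials:
        # Assumed placement: the example in the tree does not include initials.
        parts.append(initials.lower())
    # Lowercase the description and join its words with hyphens.
    parts.append("-".join(description.lower().split()))
    return "-".join(parts)


print(notebook_name(date(2020, 6, 1), "Initial data exploration"))
# prints: 2020_06_01-initial-data-exploration
```

Putting the date first keeps the `notebooks` folder ordered by creation day without any extra tooling.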
$ pip install -e .
$ python setup.py test
- Coverage badge in pipeline
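
Code under `src` can address the `data` folders from the project tree by stage name instead of hard-coding paths. The following is a minimal sketch under stated assumptions: `data_dir` and `DATA_STAGES` are hypothetical names that the template does not provide, and the project root is passed in explicitly.

```python
from pathlib import Path

# Stage names taken from the data/ folder in the project tree.
DATA_STAGES = ("external", "interim", "processed", "raw")


def data_dir(project_root, stage):
    """Return the path of one data stage, e.g. <root>/data/raw."""
    if stage not in DATA_STAGES:
        raise ValueError("unknown data stage: {!r}".format(stage))
    return Path(project_root) / "data" / stage


# Hypothetical usage: read from raw, write results to processed.
root = Path(".")
print(data_dir(root, "raw"))
print(data_dir(root, "processed"))
```

Rejecting unknown stage names early helps keep `raw` immutable by convention: a typo raises an error instead of silently pointing at a new folder.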