ENH: Switch to version 1.0 of config file format, fix #685 #345 #748 #750

Merged
merged 183 commits into from
May 5, 2024
Commits
476b9da
WIP: Add src/vak/config/dataset.py
NickleDave May 1, 2024
a625bb7
Add module-level docstring + type annotations in src/vak/config/parse.py
NickleDave May 1, 2024
272f67d
WIP: Fix how cli.prep adds dataset path to toml config file
NickleDave May 1, 2024
fa35def
Change table names in src/vak/config/valid.toml
NickleDave May 1, 2024
988c150
Rename section -> table in config/parse.py
NickleDave May 1, 2024
72e11e8
In cli/prep change 'section' -> 'table' and lowercase table names
NickleDave May 1, 2024
7181f71
In config/config.py, change 'section' -> 'table' and lowercase table …
NickleDave May 1, 2024
9d4467c
Change '[PREP]' -> '[vak.prep]' in config/prep.py
NickleDave May 1, 2024
b5a6413
WIP: Change table names in config files in tests/data_for_tests/configs
NickleDave May 1, 2024
27f0596
Make tomlkit a dependency in pyproject.toml, drop toml
NickleDave May 1, 2024
62fe4f4
Change config/parse.py to use tomlkit
NickleDave May 1, 2024
3253a4e
Update example configs in doc/toml/
NickleDave May 1, 2024
fb70733
Add link to example config files in docs, in error messages in config…
NickleDave May 1, 2024
b653138
Remove 'spect_params' from REQUIRED_OPTIONS in config/parse.py, this …
NickleDave May 1, 2024
83eddf0
Rename 'config_toml' -> 'config_dict' in config/parse.py
NickleDave May 1, 2024
1160a6b
Fix function _validate_tables_arg_convert_list in config/parse.py
NickleDave May 1, 2024
9796624
Fix error message formatting in src/vak/config/validators.py
NickleDave May 1, 2024
60f7b02
Add ModelConfig class to config/model.py, add type annotations, fix c…
NickleDave May 1, 2024
910c5df
Fixup fixing config_from_toml_dict to look in specific section
NickleDave May 1, 2024
32995bf
Rewrite config/eval.py with 'modern' attrs
NickleDave May 1, 2024
f775749
Fixup rewrite config/eval with 'modern attrs
NickleDave May 1, 2024
060abaa
Rewrite config/learncurve.py with 'modern' attrs
NickleDave May 1, 2024
71e89bc
Rewrite config/predict.py with 'modern' attrs
NickleDave May 1, 2024
f6cf945
Rewrite config/prep.py with 'modern' attrs
NickleDave May 1, 2024
9798707
Rewrite config/train.py with 'modern' attrs
NickleDave May 1, 2024
9c0d301
Rename Dataset -> DatasetConfig in config/dataset.py
NickleDave May 1, 2024
0de12da
Add are_table_options_valid to config/validators.py, will be used by …
NickleDave May 1, 2024
2c272bc
WIP: Add from_config_dict classmethod to EvalConfig
NickleDave May 1, 2024
53447f5
WIP: Add tests/test_config/test_dataset.py
NickleDave May 1, 2024
a6cb674
Make fixes to ModelConfig class, fix circular imports in config/model…
NickleDave May 1, 2024
a039dd9
Write tests in tests/test_config/test_dataset.py
NickleDave May 1, 2024
d6588f0
Use tomlkit not toml in cli/prep.py
NickleDave May 2, 2024
53ef623
Use tomlkit in tests/fixtures/annot.py
NickleDave May 2, 2024
1f80c19
Use tomlkit in tests/scripts/vaktestdata/configs.py
NickleDave May 2, 2024
cd2d0dd
Use tomlkit in tests/scripts/vaktestdata/source_files.py
NickleDave May 2, 2024
ad1bbf3
Use tomlkit in tests/test_config/test_validators.py
NickleDave May 2, 2024
249cf16
Remove spect_params attribute from Config in config/config.py, fix cl…
NickleDave May 2, 2024
6654a06
Reorder attributes, fix typo in docstring of DatasetConfig
NickleDave May 2, 2024
b2c8652
Rewrite config/parse.py assuming config classes have from_config_dict…
NickleDave May 2, 2024
0680481
Rename `table` -> `table_name` in a couple validators in config/valid…
NickleDave May 2, 2024
9cbf58a
Remove use of config.model.config_from_toml_path in cli/eval.py
NickleDave May 2, 2024
eeef546
Remove use of config.model.config_from_toml_path in cli/learncurve.py
NickleDave May 2, 2024
0a5eb96
Remove use of config.model.config_from_toml_path in cli/predict.py
NickleDave May 2, 2024
f8ebc20
Remove use of config.model.config_from_toml_path in cli/train.py
NickleDave May 2, 2024
cebd8d8
Remove functions from config/model.py: config_from_toml_path and conf…
NickleDave May 2, 2024
7469f8d
Add `to_dict` method to ModelConfig
NickleDave May 2, 2024
c0c11e4
Use to_dict() method of ModelConfig class in cli functions
NickleDave May 2, 2024
e87b5ed
Fix how we get labelset from config in tests/fixtures/annot.py
NickleDave May 2, 2024
07dd40e
WIP: Clean up / rewrite tests/fixtures/config.py
NickleDave May 2, 2024
22aa552
Fix model tables in tests/data_for_tests/configs
NickleDave May 2, 2024
10823a5
Finish unit tests in tests/test_config/test_model.py
NickleDave May 2, 2024
2216f4b
Fix model tables in doc/toml
NickleDave May 2, 2024
4c38198
Rename data_for_tests/configs/invalid_option_config.toml -> invalid_k…
NickleDave May 2, 2024
9f8b222
Rename are_options_valid/are_table_options_valid -> are_keys_valid/ar…
NickleDave May 2, 2024
cdf7495
Rename two fixtures in fixtures/config.py: invalid_section_config_pat…
NickleDave May 2, 2024
90f401f
Fix validator names in config/parse.py, rename TABLE_CLASSES constant…
NickleDave May 2, 2024
67ec4a9
Rename config/valid.toml -> valid-version-1.0.toml, fix how model tab…
NickleDave May 2, 2024
7288933
Fix VALID_TOML_PATH in config/validators.py after renaming config/val…
NickleDave May 2, 2024
4f72b0c
Import config classes in vak/config/__init__.py
NickleDave May 2, 2024
bd0a055
Add _tomlkit_to_popo to tests/fixtures/config.py so we operate on dic…
NickleDave May 2, 2024
e359376
Add _tomlkit_to_popo to config/parse.py so we operate on dicts not to…
NickleDave May 2, 2024
0ff5860
Finish rewriting tests for tests/test_config/test_prep.py
NickleDave May 2, 2024
3e5f1d2
Rewrite EvalConfig with from_config_dict method
NickleDave May 2, 2024
aeee2b6
Rewrite LearncurveConfig with from_config_dict method
NickleDave May 2, 2024
6ab3a5f
Rewrite PredictConfig with from_config_dict method
NickleDave May 2, 2024
4b4fedf
Rewrite PrepConfig with from_config_dict method
NickleDave May 2, 2024
77e38e6
Rewrite TrainConfig with from_config_dict method
NickleDave May 2, 2024
f3e37ae
Remove functions from config/parse.py
NickleDave May 2, 2024
2b598d0
Rename config/parse.py -> config/load.py
NickleDave May 2, 2024
d175436
Make functions in config/parse.py into classmethods on Config class
NickleDave May 2, 2024
323c238
Use config.Config.from_toml_path everywhere instead of config.parse.f…
NickleDave May 2, 2024
500543e
Make fixes in Config classmethods
NickleDave May 3, 2024
7ebd92e
Change load._load_toml_from_path again so that it returns config_dict…
NickleDave May 3, 2024
acf4cc3
Add docstring to are_tables_valid in config/validators.py
NickleDave May 3, 2024
42e92c7
Lowercase config table names in tests/scripts/vaktestdata/configs.py
NickleDave May 3, 2024
61ea4e4
In tests/scripts/vaktestdata/source_files.py, change cfg.spect_params…
NickleDave May 3, 2024
dc19398
in test_cli/test_prep.py, call vak.config.load not vak.config.parse
NickleDave May 3, 2024
95e424c
Fix how we instantiate DatasetConfig and ModelConfig in EvalConfig.fr…
NickleDave May 3, 2024
5481ec8
Fix how we instantiate DatasetConfig and ModelConfig in PredictConfig…
NickleDave May 3, 2024
1733166
Fix how we instantiate DatasetConfig and ModelConfig in TrainConfig.f…
NickleDave May 3, 2024
d729abd
Fix how we instantiate DatasetConfig and ModelConfig in LearncurveCon…
NickleDave May 3, 2024
9185eb9
Remove brekapoint in src/vak/config/model.py
NickleDave May 3, 2024
3341b08
Fix wrong variable name so we save configs correctly in tests/scripts…
NickleDave May 3, 2024
b050989
Fix how we re-write configs, in tests/scripts/vaktestdata/configs.py
NickleDave May 3, 2024
f6fdec6
Add model and dataset tables to get those keys in top-level tables, i…
NickleDave May 3, 2024
8d49aef
Change cfg.table.dataset_path -> cfg.table.dataset.path in vak/cli mo…
NickleDave May 3, 2024
2798282
Get tests passing for tests/test_config/test_eval.py
NickleDave May 3, 2024
0f6cc5a
Clean up tests/test_config/test_eval.py
NickleDave May 3, 2024
311287f
Get tests passing in tests/test_config/test_predict.py
NickleDave May 3, 2024
5c88bb0
Fix how we access config_toml in tests/scripts/vaktestdata/configs.py…
NickleDave May 3, 2024
5db3d2f
Add pytest.mark.parametrize to tests/test_config/test_learncurve.py
NickleDave May 3, 2024
3180d48
Rewrite tests in tests/test_config/test_train.py
NickleDave May 3, 2024
7c2a1f0
Rewrite tests in tests/test_config/test_config.py
NickleDave May 3, 2024
18b6dc2
Add unit test to tests/test_config/test_model.py
NickleDave May 3, 2024
a4c06d5
Add unit test for exceptions in tests/test_config/test_eval.py
NickleDave May 3, 2024
05032d6
Fix 'cfg.spect_params' -> 'cfg.prep.spect_params' in src/vak/cli/pred…
NickleDave May 3, 2024
d96d782
Add unit test for exceptions in tests/test_config/test_learncurve.py
NickleDave May 3, 2024
3927c5b
Add unit test for exceptions in tests/test_config/test_train.py
NickleDave May 3, 2024
1996df2
Add more test cases to TestEvalConfig.test_from_config_dict_raises
NickleDave May 3, 2024
30a31f8
Add more test cases to TestLearncurveConfig.test_from_config_dict_raises
NickleDave May 3, 2024
836e8ed
Add unit test for exceptions in tests/test_config/test_predict.py
NickleDave May 3, 2024
8e2cc9f
Add two unit tests that PrepConfig raises expected exceptions
NickleDave May 3, 2024
35c3732
Fix/add unit tests in tests/test_config/test_config.py
NickleDave May 3, 2024
e0abf23
Change order of parameters for Config.from_config_dict, make toml_pat…
NickleDave May 3, 2024
e585091
Fix/add unit tests in tests/fixtures/config.py
NickleDave May 3, 2024
ffd0a4e
Fix/add unit tests in tests/fixtures/config.py
NickleDave May 3, 2024
63ee0ee
Rename test_config/test_parse.py -> test_load.py, fix/rewrite tests
NickleDave May 3, 2024
f081359
Fix tests in tests/test_config/test_spect_params.py
NickleDave May 3, 2024
dcc98a8
Make fixups in tests/test_config
NickleDave May 3, 2024
82dd4f3
Apply fixes from linter
NickleDave May 3, 2024
a03b3b1
Make more linting fixes
NickleDave May 3, 2024
9b31067
Speed up install in nox session 'lint', only install linting tools
NickleDave May 3, 2024
ee9c4e5
Change names 'section'/'option' -> 'table'/'key' in tests
NickleDave May 3, 2024
9466d6c
Fix tests in tests/test_cli/test_eval.py
NickleDave May 3, 2024
0206455
Finish fixing cli tests, fix renaming
NickleDave May 3, 2024
2ed3763
Fix how we get 'path' from 'dataset' table in configs, in tests/fixtu…
NickleDave May 3, 2024
fca4ba9
Fix how we get 'path' from 'dataset' table in configs, in tests/fixtu…
NickleDave May 3, 2024
d71d773
Change .dataset_path -> .dataset.path in tests/
NickleDave May 3, 2024
4a5122d
Fix how we get model config and rename config attribute .dataset_path…
NickleDave May 3, 2024
b8b902c
In tests/, fixup change .dataset_path -> .dataset.path, use model.nam…
NickleDave May 3, 2024
631e403
Fix fixture specific_config_toml_path in fixtures/config.py to handle…
NickleDave May 3, 2024
f717e00
Fix how we change ['dataset']['path'] value in tests/test_eval/test_f…
NickleDave May 3, 2024
ac1a7e7
Fix how we change ['dataset']['path'] value in config in several tests
NickleDave May 3, 2024
e4ac360
Use ModelConfig attribute name where needed in tests/test_learncurve/…
NickleDave May 3, 2024
f542c47
In tests, replace calls to vak.config.model.config_from_toml_path wit…
NickleDave May 3, 2024
0ceadfe
Change cfg.spect_params -> cfg.prep.spect_params in tests
NickleDave May 3, 2024
315294a
Fix cfg.predict -> cfg.predict.dataset.path in tests/test_predict/tes…
NickleDave May 3, 2024
7ad30b0
Fix constant LABELSET_NOTMAT in fixtures/annot.py so it is a list of …
NickleDave May 3, 2024
937192d
Fix cfg.learncurve -> cfg.learncurve.dataset.path in tests/test_prep/…
NickleDave May 3, 2024
ef9cd86
Fix cfg.learncurve -> cfg.learncurve.dataset.path in tests/test_prep/…
NickleDave May 4, 2024
ee4a9a9
Cast pathlib to str before adding to tomldoc, in tests/test_train/
NickleDave May 4, 2024
881f4f8
Change transform/dataset params keys in data_for_tests/configs to a d…
NickleDave May 4, 2024
eda99eb
Add `params` attribute to DatasetConfig
NickleDave May 4, 2024
513046f
Change transform/dataset params keys in doc/toml/ to a dataset table …
NickleDave May 4, 2024
5317dff
Rewrite vak/config/model.py method 'to_dict' as 'asdict', using attrs…
NickleDave May 4, 2024
a1f0e0a
Add asdict method to DatasetConfig class, like ModelConfig.asdict
NickleDave May 4, 2024
bd0d050
Fix calls to model.to_dict() -> model.asdict()
NickleDave May 4, 2024
ba788c6
Add unit tests for DatasetConfig.asdict
NickleDave May 4, 2024
0738b77
Add unit tests for ModelConfig.asdict
NickleDave May 4, 2024
3df6a5c
Add an assertion in tests/test_config/test_dataset.py
NickleDave May 4, 2024
9e2dab9
Remove transform params and dataset_params from EvalConfig, will just…
NickleDave May 4, 2024
1b70d54
Remove dataset/transform_params key-value pairs in valid-version-1.0.…
NickleDave May 4, 2024
48830d4
Remove train/val/dataset/transform_params from TrainConfig, will use …
NickleDave May 4, 2024
ad05a98
Remove train/val/dataset/transform_params from PredictConfig, will us…
NickleDave May 4, 2024
7f93b74
Revise transforms.defaults.frame_classification.TrainItemTransform an…
NickleDave May 4, 2024
7152657
Make vak.table.dataset.params into an in-line table in toml files in …
NickleDave May 4, 2024
149d6eb
Fix attribute name in frame_classification.TrainItemTransform.__init_…
NickleDave May 4, 2024
21f7971
Rewrite datasets.frame_classification.WindowDataset to require item_t…
NickleDave May 4, 2024
be8e693
Rewrite datasets.frame_classification.FramesDataset to make item_tran…
NickleDave May 4, 2024
558bb14
Rewrite src/vak/train/frame_classification.py: remove params model_na…
NickleDave May 4, 2024
b54be05
Rewrite src/vak/train/_train.py: remove params model_name, train/val_…
NickleDave May 4, 2024
88f8b49
Rewrite vak/cli/train.py to call train._train.train with just model_c…
NickleDave May 4, 2024
a8721cc
Fix how we unpack batch in training_step method of FrameClassificatio…
NickleDave May 4, 2024
ad5d98b
Change transform_kwargs parameter of transforms.defaults.parametric_u…
NickleDave May 5, 2024
43c525f
Change transform_kwargs parameter of transforms.defaults.frame_classi…
NickleDave May 5, 2024
9e003fc
Change DatasetConfig.params attribute to default to empty dict, so we…
NickleDave May 5, 2024
0b5bb27
Fix DatasetConfig.from_config_dict method to not use dict.get method,…
NickleDave May 5, 2024
b7ca332
Modify transforms.defaults.get so that transform_kwargs is None by de…
NickleDave May 5, 2024
bebaa3f
Rewrite src/vak/train/parametric_umap.py to use model_config and data…
NickleDave May 5, 2024
8f6ef37
Rewrite vak/eval/frame_classification.py to use model_config and data…
NickleDave May 5, 2024
9768b5d
Rewrite vak/eval/parametric_umap.py to use model_config and dataset_c…
NickleDave May 5, 2024
519f7d4
Rewrite vak/eval/eval_.py to use model_config and dataset_config para…
NickleDave May 5, 2024
e68ad49
Rewrite cli.eval to pass model_config and dataset_config into eval_mo…
NickleDave May 5, 2024
dca625f
Unpack dataset_config[params] with ** inside trak/frame_classificatio…
NickleDave May 5, 2024
43d1a99
Rewrite vak/learncurve/frame_classification.py to use model_config an…
NickleDave May 5, 2024
af971a0
Rewrite vak/learncurve/learncurve.py to use model_config and dataset_…
NickleDave May 5, 2024
2257144
Rewrite cli.learncurve to pass model_config and dataset_config into l…
NickleDave May 5, 2024
cd60eab
Rewrite vak/predict/frame_classification.py to use model_config and d…
NickleDave May 5, 2024
571a9c6
Rewrite vak/predict/parametric_umap.py to use model_config and datase…
NickleDave May 5, 2024
aba3353
Rewrite vak/predict/predict.py to use model_config and dataset_config…
NickleDave May 5, 2024
3e237b4
Fix dataset_path -> dataset_config[path] and add missing variable mod…
NickleDave May 5, 2024
9eb7c0c
Fix dataset_path -> dataset_config[path] and add missing variable mod…
NickleDave May 5, 2024
32d9604
Rewrite vak/cli/predict.py to use model_config and dataset_config par…
NickleDave May 5, 2024
87e1a79
Remove non-existent dataset_params variable in vak/predict/frame_clas…
NickleDave May 5, 2024
bfb4a9b
Fix unit tests for DatasetConfig to test 'params' attribute gets hand…
NickleDave May 5, 2024
d70886a
Remove train/val_dataset_params and train/val_transform_params from t…
NickleDave May 5, 2024
401f1a3
Use DatasetConfig.params attribute where we need to in tests/test_dat…
NickleDave May 5, 2024
0ae33a8
Fix method name ModelConfig.to_dict -> asdict in tests/
NickleDave May 5, 2024
e033b25
In tests for eval/learncurve/predict/train, use model_config and data…
NickleDave May 5, 2024
9d7ec3e
Fix use of default transform and dataset.params attribute in test_mod…
NickleDave May 5, 2024
33f578b
Fix config snippets in docs
NickleDave May 5, 2024
84c316b
Apply linting to src/
NickleDave May 5, 2024
bc0b487
Raise 'from e' with errors in eval/predict/train/frame_classification…
NickleDave May 5, 2024
78 changes: 39 additions & 39 deletions doc/get_started/autoannotate.md
@@ -20,10 +20,10 @@ Below is an example of some annotated Bengalese finch song, which is what we'll

:::{hint}
`vak` has built-in support for widely-used annotation formats.
Even if your data is not annotated with one of these formats,
you can use `vak` by converting your annotations to a simple `.csv` format
Even if your data is not annotated with one of these formats,
you can use `vak` by converting your annotations to a simple `.csv` format
that is easy to create with Python libraries like `pandas`.
For more information, please see:
For more information, please see:
{ref}`howto-user-annot`
:::

@@ -42,39 +42,39 @@ Before going through this tutorial, you'll need to:
or [notepad++](https://notepad-plus-plus.org/)
3. Download example data from this dataset: <https://figshare.com/articles/Bengalese_Finch_song_repository/4805749>

- one day of birdsong, for training data (click to download)
- one day of birdsong, for training data (click to download)
{download}`https://figshare.com/ndownloader/files/41668980`
- another day, to use to predict annotations (click to download)
{download}`https://figshare.com/ndownloader/files/41668983`
- Be sure to extract the files from these archives!
Please use the program "tar" to extract the archives,
- Be sure to extract the files from these archives!
Please use the program "tar" to extract the archives,
on either macOS/Linux or Windows.
Using other programs like WinZIP on Windows
Using other programs like WinZIP on Windows
can corrupt the files when extracting them,
causing confusing errors.
Tar should be available on newer Windows systems
(as described
(as described
[here](https://learn.microsoft.com/en-us/virtualization/community/team-blog/2017/20171219-tar-and-curl-come-to-windows)).
- Alternatively you can copy the following command and then
paste it into a terminal to run a Python script
that will download and extract the files for you.
- Alternatively you can copy the following command and then
paste it into a terminal to run a Python script
that will download and extract the files for you.

:::{eval-rst}

.. tabs::

.. code-tab:: shell macOS / Linux

curl -sSL https://raw.githubusercontent.com/vocalpy/vak/main/src/scripts/download_autoannotate_data.py | python3 -

.. code-tab:: shell Windows

(Invoke-WebRequest -Uri https://raw.githubusercontent.com/vocalpy/vak/main/src/scripts/download_autoannotate_data.py -UseBasicParsing).Content | py -
:::

4. Download the corresponding configuration files (click to download):
{download}`gy6or6_train.toml <../toml/gy6or6_train.toml>`,
{download}`gy6or6_eval.toml <../toml/gy6or6_eval.toml>`,
{download}`gy6or6_eval.toml <../toml/gy6or6_eval.toml>`,
and {download}`gy6or6_predict.toml <../toml/gy6or6_predict.toml>`

## Overview
@@ -181,7 +181,7 @@ Change the part of the path in capital letters to the actual location
on your computer:

```toml
[PREP]
[vak.prep]
dataset_type = "frame classification"
input_type = "spect"
# we change the next line
@@ -230,11 +230,11 @@ When you run `prep`, `vak` converts the data from `data_dir` into a special data
automatically adds the path to that file to the `[TRAIN]` section of the `config.toml` file, as the option
`csv_path`.

You have now prepared a dataset for training a model!
You'll probably have more questions about
how to do this later,
when you start to work with your own data.
When that time comes, please see the how-to page:
You have now prepared a dataset for training a model!
You'll probably have more questions about
how to do this later,
when you start to work with your own data.
When that time comes, please see the how-to page:
{ref}`howto-prep-annotate`.
For now, let's move on to training a neural network with this dataset.

@@ -294,7 +294,7 @@ from that checkpoint later when we predict annotations for new data.

(prepare-prediction-dataset)=

An important step when using neural network models is to evaluate the model's performance
An important step when using neural network models is to evaluate the model's performance
on a held-out dataset that has never been used during training, often called the "test" set.

Here we show you how to evaluate the model we just trained.
@@ -356,33 +356,33 @@ This file will also be found in the root `results_{timestamp}` directory.
spect_scaler = "/home/users/You/Data/vak_tutorial_data/vak_output/results_{timestamp}/SpectScaler"
```

The last path you need is actually in the TOML file that we used
The last path you need is actually in the TOML file that we used
to train the neural network: `dataset_path`.
You should copy that `dataset_path` option exactly as it is
and then paste it at the bottom of the `[EVAL]` table
You should copy that `dataset_path` option exactly as it is
and then paste it at the bottom of the `[EVAL]` table
in the configuration file for evaluation.
We do this instead of preparing another dataset,
because we already created a test split when we ran
We do this instead of preparing another dataset,
because we already created a test split when we ran
`vak prep` with the training configuration.
This is a good practice, because it helps ensure
This is a good practice, because it helps ensure
that we do not mix the training data with the test data;
`vak` makes sure that the data from the `data_dir` option
`vak` makes sure that the data from the `data_dir` option
is placed in two separate splits, the train and test splits.

Once you have prepared the configuration file as described,
Once you have prepared the configuration file as described,
you can run the following in the terminal:

```shell
vak eval gy6or6_eval.toml
```

You will see output to the console as the network is evaluated.
Notice that for this model we evaluate it *with* and *without*
post-processing transforms that clean up the predictions
You will see output to the console as the network is evaluated.
Notice that for this model we evaluate it *with* and *without*
post-processing transforms that clean up the predictions
of the model.
The parameters of the post-processing transform are specified
The parameters of the post-processing transform are specified
with the `post_tfm_kwargs` option in the configuration file.
You may find this helpful to understand factors affecting
You may find this helpful to understand factors affecting
the performance of your own model.

## 4. Preparing a prediction dataset
@@ -400,7 +400,7 @@ Just like before, you're going to modify the `data_dir` option of the
This time you'll change it to the path to the directory with the other day of data we downloaded.

```toml
[PREP]
[vak.prep]
data_dir = "/home/users/You/Data/vak_tutorial_data/032312"
```

@@ -428,7 +428,7 @@ and then add the path to that file as the option `csv_path` in the `[PREDICT]` s
Finally you will use the trained network to predict annotations.
This is the part that requires you to find paths to files saved by `vak`.

There's three you need. These are the exact same paths we used above
There's three you need. These are the exact same paths we used above
in the configuration file for evaluation, so you can copy them from that file.
We explain them again here for completeness.
All three paths will be in the `results` directory
22 changes: 7 additions & 15 deletions doc/reference/config.md
@@ -19,7 +19,7 @@ for each class.
## Valid section names

Following is the set of valid section names:
`{PREP, SPECT_PARAMS, DATALOADER, TRAIN, PREDICT, LEARNCURVE}`.
`{eval, learncurve, predict, prep, train}`.
In the code, these names correspond to attributes
of the main `Config` class, as shown below.

@@ -43,50 +43,42 @@ that are considered valid.
Valid options for each section are presented below.

(ref-config-prep)=
### `[PREP]` section
### `[vak.prep]` section

```{eval-rst}
.. autoclass:: vak.config.prep.PrepConfig
```

(ref-config-spect-params)=
### `[SPECT_PARAMS]` section
### `[vak.prep.spect_params]` section

```{eval-rst}
.. autoclass:: vak.config.spect_params.SpectParamsConfig
```

(ref-config-dataloader)=
### `[DATALOADER]` section

```{eval-rst}
.. autoclass:: vak.config.dataloader.DataLoaderConfig

```

(ref-config-train)=
### `[TRAIN]` section
### `[vak.train]` section

```{eval-rst}
.. autoclass:: vak.config.train.TrainConfig
```

(ref-config-eval)=
### `[EVAL]` section
### `[vak.eval]` section

```{eval-rst}
.. autoclass:: vak.config.eval.EvalConfig
```

(ref-config-predict)=
### `[PREDICT]` section
### `[vak.predict]` section

```{eval-rst}
.. autoclass:: vak.config.predict.PredictConfig
```

(ref-config-learncurve)=
### `[LEARNCURVE]` section
### `[vak.learncurve]` section

```{eval-rst}
.. autoclass:: vak.config.learncurve.LearncurveConfig
22 changes: 10 additions & 12 deletions doc/toml/gy6or6_eval.toml
@@ -1,4 +1,4 @@
[PREP]
[vak.prep]
# dataset_type: corresponds to the model family such as "frame classification" or "parametric umap"
dataset_type = "frame classification"
# input_type: input to model, either audio ("audio") or spectrogram ("spect")
@@ -19,16 +19,15 @@ train_dur = 50
val_dur = 15

# SPECT_PARAMS: parameters for computing spectrograms
[SPECT_PARAMS]
[vak.prep.spect_params]
# fft_size: size of window used for Fast Fourier Transform, in number of samples
fft_size = 512
# step_size: size of step to take when computing spectra with FFT for spectrogram
# also known as hop size
step_size = 64

# EVAL: options for evaluating a trained model. This is done using the "test" split.
[EVAL]
model = "TweetyNet"
[vak.eval]
# checkpoint_path: path to saved model checkpoint
checkpoint_path = "/PATH/TO/FOLDER/results/train/RESULTS_TIMESTAMP/TweetyNet/checkpoints/max-val-acc-checkpoint.pt"
# labelmap_path: path to file that maps from outputs of model (integers) to text labels in annotations;
@@ -51,7 +50,7 @@ output_dir = "/PATH/TO/FOLDER/results/eval"
# ADD THE dataset_path OPTION FROM THE TRAIN FILE HERE (we already created a test split when we ran `vak prep` with that config)

# EVAL.post_tfm_kwargs: options for post-processing
[EVAL.post_tfm_kwargs]
[vak.eval.post_tfm_kwargs]
# both these transforms require that there is an "unlabeled" label,
# and they will only be applied to segments that are bordered on both sides
# by the "unlabeled" label.
@@ -65,12 +64,11 @@ majority_vote = true
# Only applied if this option is specified.
min_segment_dur = 0.02

# transform_params: parameters used when transforming data
# for a frame classification model, we use FrameDataset with the eval_item_transform,
# that reshapes batches into consecutive adjacent windows with a specific `window_size`
[EVAL.transform_params]
# dataset.params = parameters used for datasets
# for a frame classification model, we use dataset classes with a specific `window_size`
[vak.eval.dataset.params]
window_size = 176

# Note we do not specify any options for the network, and just use the defaults
# We need to put this "dummy" table here though for the config to parse correctly
[TweetyNet]
# Note we do not specify any options for the model, and just use the defaults
# We need to put this table here though so we know which model we are using
[vak.eval.model.TweetyNet]
19 changes: 8 additions & 11 deletions doc/toml/gy6or6_predict.toml
@@ -1,5 +1,5 @@
# PREP: options for preparing dataset
[PREP]
[vak.prep]
# dataset_type: corresponds to the model family such as "frame classification" or "parametric umap"
dataset_type = "frame classification"
# input_type: input to model, either audio ("audio") or spectrogram ("spect")
@@ -15,17 +15,15 @@ audio_format = "wav"
# all data found in `data_dir` will be assigned to a "predict split" instead

# SPECT_PARAMS: parameters for computing spectrograms
[SPECT_PARAMS]
[vak.prep.spect_params]
# fft_size: size of window used for Fast Fourier Transform, in number of samples
fft_size = 512
# step_size: size of step to take when computing spectra with FFT for spectrogram
# also known as hop size
step_size = 64

# PREDICT: options for generating predictions with a trained model
[PREDICT]
# model: the string name of the model. must be a name within `vak.models` or added e.g. with `vak.model.decorators.model`
model = "TweetyNet"
[vak.predict]
# checkpoint_path: path to saved model checkpoint
checkpoint_path = "/PATH/TO/FOLDER/results/train/RESULTS_TIMESTAMP/TweetyNet/checkpoints/max-val-acc-checkpoint.pt"
# labelmap_path: path to file that maps from outputs of model (integers) to text labels in annotations;
@@ -61,12 +59,11 @@ majority_vote = true
min_segment_dur = 0.01
# dataset_path : path to dataset created by prep. This will be added when you run `vak prep`, you don't have to add it

# transform_params: parameters used when transforming data
# for a frame classification model, we use FrameDataset with the eval_item_transform,
# that reshapes batches into consecutive adjacent windows with a specific `window_size`
[PREDICT.transform_params]
# dataset.params = parameters used for datasets
# for a frame classification model, we use dataset classes with a specific `window_size`
[vak.predict.dataset.params]
window_size = 176

# Note we do not specify any options for the network, and just use the defaults
# We need to put this "dummy" table here though for the config to parse correctly
[TweetyNet]
# We need to put this table here though, to indicate which model we are using.
[vak.predict.model.TweetyNet]
29 changes: 12 additions & 17 deletions doc/toml/gy6or6_train.toml
@@ -1,5 +1,5 @@
# PREP: options for preparing dataset
[PREP]
[vak.prep]
# dataset_type: corresponds to the model family such as "frame classification" or "parametric umap"
dataset_type = "frame classification"
# input_type: input to model, either audio ("audio") or spectrogram ("spect")
@@ -22,17 +22,15 @@ val_dur = 15
test_dur = 30

# SPECT_PARAMS: parameters for computing spectrograms
[SPECT_PARAMS]
[vak.prep.spect_params]
# fft_size: size of window used for Fast Fourier Transform, in number of samples
fft_size = 512
# step_size: size of step to take when computing spectra with FFT for spectrogram
# also known as hop size
step_size = 64

# TRAIN: options for training model
[TRAIN]
# model: the string name of the model. must be a name within `vak.models` or added e.g. with `vak.model.decorators.model`
model = "TweetyNet"
[vak.train]
# root_results_dir: directory where results should be saved, as a sub-directory within `root_results_dir`
root_results_dir = "/PATH/TO/FOLDER/results/train"
# batch_size: number of samples from dataset per batch fed into network
@@ -58,23 +56,20 @@ num_workers = 4
device = "cuda"
# dataset_path : path to dataset created by prep. This will be added when you run `vak prep`, you don't have to add it

# train_dataset_params: parameters used when loading training dataset
# for a frame classification model, we use a WindowDataset with a specific `window_size`
[TRAIN.train_dataset_params]
# dataset.params = parameters used for datasets
# for a frame classification model, we use dataset classes with a specific `window_size`
[vak.train.dataset.params]
window_size = 176

# val_transform_params: parameters used when transforming validation data
# for a frame classification model, we use FrameDataset with the eval_item_transform,
# that reshapes batches into consecutive adjacent windows with a specific `window_size`
[TRAIN.val_transform_params]
window_size = 176

# TweetyNet.optimizer: we specify options for the model's optimizer in this table
[TweetyNet.optimizer]
# To indicate the model to train, we use a "dotted key" with `model` followed by the string name of the model.
# This name must be a name within `vak.models` or added e.g. with `vak.model.decorators.model`
# We use another dotted key to indicate options for configuring the model, e.g. `TweetyNet.optimizer`
[vak.train.model.TweetyNet.optimizer]
# vak.train.model.TweetyNet.optimizer: we specify options for the model's optimizer in this table
# lr: the learning rate
lr = 0.001

# TweetyNet.network: we specify options for the model's network in this table
[TweetyNet.network]
[vak.train.model.TweetyNet.network]
# hidden_size: the number of elements in the hidden state in the recurrent layer of the network
hidden_size = 256