From aad30d5d6adfe2aa3a88ce46de524deb22d649d7 Mon Sep 17 00:00:00 2001
From: tapadipti <32855442+tapadipti@users.noreply.github.com>
Date: Thu, 24 Aug 2023 15:39:42 +0530
Subject: [PATCH 1/4] Studio update images (#4758)
* Update text and image for project action buttons
* Studio: Explore exps - Update screenshots and adjust text accordingly
* Document nested branches and explain when the commits in branch filter is useful
* Address PR comment
* Address PR comment
* Small clarification
* Address Oded's comments in the PR and make 'nested branches' and 'commits on branch' filter clearer
* Committing one of the changed files after yarn fix-all
* Revert "Committing one of the changed files after yarn fix-all"
This reverts commit 5479bd76d38d466e4800b7126fc010d6605bdddb.
---
.../explore-ml-experiments.md | 162 ++++++++++++------
1 file changed, 111 insertions(+), 51 deletions(-)
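
Reviewer note (sits between the diffstat and the first `diff --git` line, so
it is not part of the commit): the "nested branches" rule this patch documents
can be read as a subset check on commits. Below is a minimal sketch, assuming
each branch is represented by the set of commit hashes reachable from it. The
names `is_nested` and `unique_commits` and the sample hashes are hypothetical,
and this is not Iterative Studio's actual implementation.

```python
def unique_commits(branch_commits, target_commits):
    """Commits reachable from the branch that the target branch lacks."""
    return set(branch_commits) - set(target_commits)


def is_nested(branch_commits, target_commits):
    """A branch counts as nested when it adds no unique commits."""
    return not unique_commits(branch_commits, target_commits)


# Hypothetical commit hashes, for illustration only.
main = {"a1", "b2", "c3", "d4"}
inactive_branch = {"a1", "b2", "c3"}      # fully merged, no new commits
active_branch = {"a1", "b2", "c3", "e5"}  # merged, then one more commit

nested_inactive = is_nested(inactive_branch, main)
nested_active = is_nested(active_branch, main)
```

Under this reading, the inactive branch would be folded into `main`, while the
active branch would be listed separately with only its unique commit.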
diff --git a/content/docs/studio/user-guide/projects-and-experiments/explore-ml-experiments.md b/content/docs/studio/user-guide/projects-and-experiments/explore-ml-experiments.md
index af6e3c2472..329f6bf71a 100644
--- a/content/docs/studio/user-guide/projects-and-experiments/explore-ml-experiments.md
+++ b/content/docs/studio/user-guide/projects-and-experiments/explore-ml-experiments.md
@@ -1,34 +1,24 @@
# Explore ML Experiments
-The projects dashboard in Iterative Studio contains all your projects. Open a
-project by clicking on its name. An experiments table for the project will be
-generated as shown below. This includes metrics, hyperparameters, and
-information about datasets and models.
+The projects dashboard in Iterative Studio contains all your projects. Click on
+a project name to open the project table, which contains:
-![](https://static.iterative.ai/img/studio/view_components.png)
-
-The major components of a project table are:
-
-- [Git history and live experiments](#git-history-and-live-metrics) that show
- you the complete experimentation history as well as live metrics of running
- experiments.
-- [Display preferences](#display-preferences) that let you show/hide branches,
- commits and columns, and re-arrange the table.
+- [Git history and live experiments](#git-history-and-live-metrics) of the
+ project
+- [Display preferences](#display-preferences)
- Buttons to
[visualize, compare, and run experiments](#visualize-compare-and-run-experiments).
- Button to [export project data](#export-project-data).
## Git history and live experiments
-The branches and commits in your Git repository are displayed along with the
+Branches and commits in your Git repository are displayed along with the
corresponding models, metrics, hyperparameters, and DVC-tracked files.
-[New experiments submitted from Iterative Studio][run experiments] appear as
-experiment commits, which are eventually pushed to Git. Experiments that you
-push using the `dvc exp push` command as well as any live experiments that you
-send using [DVCLive] are displayed in a special experiment row nested under the
-parent Git commit. More details of how live experiments are displayed can be
-found
+Experiments that you push using the `dvc exp push` command as well as any live
+experiments that you send using [DVCLive] are displayed in a special experiment
+row nested under the parent Git commit. More details of how live experiments are
+displayed can be found
[here](/doc/studio/user-guide/projects-and-experiments/live-metrics-and-plots#view-live-metrics-and-plots).
To manually check for updates in your repository, use the `Reload` button 🔄
@@ -38,43 +28,95 @@ located above the project table.
![](https://static.iterative.ai/img/studio/view_components_1.gif)
+### Nested branches
+
+When a Git branch (e.g., `feature-branch-1`) is merged into another branch
+(e.g., `main`), two possibilities exist:
+
+- `feature-branch-1` is still active. That is, the user continues to push more
+ commits to this branch. Since the branch now contains new unique commits, the
+ project table will display both `main` and `feature-branch-1` separately.
+ `feature-branch-1` will show the new commits that are not part of `main` while
+ all the merged commits will be shown inside `main`.
+
+- `feature-branch-1` is inactive. That is, the user does NOT push any more
+ commits to this branch. Since the branch does not contain any new unique
+ commits, Iterative Studio considers `feature-branch-1` as **"nested"** within
+ `main` and does not display it as a separate branch. This helps to keep the
+ project table concise and reduce clutter that can accumulate over time when
+ inactive branches are not cleaned from the Git repository. After all, those
+ inactive branches usually carry no new information for the purpose of managing
+ experiments. If you would like to display all commits of such an inactive
+ branch, use the
+ [`Commits on branch = feature-branch-1` display filter](#filters).
+
## Display preferences
The table contains buttons to specify filters and other preferences regarding
which commits and columns to display.
-![](https://static.iterative.ai/img/studio/view_components_2.gif)
-
### Filters:
-You can filter the commits that you want to display by the following fields:
-
-- **Branch:** The Git branch
-- **Tag:** The Git tag
-- **Author:** Author of the Git commit
-- **Metric:** Values of different metrics. For instance, you can display only
- those experiments for which the value of `avg_prec` is greater than `0.9`.
-- **Metric delta:** Change in the value of the metric. For instance, you can use
- this filter to only display those experiments for which the value of
- `avg_prec` changed by more than `0.1` compared to the baseline experiment.
-- **Param:** Values of different parameters
-- **File size:** Size of the data, model and other files corresponding to your
- experiments
-- **File changed:** Whether or not any given file changed in the experiment
+Click on the `Filters` button to specify which rows you want to show in the
+project table.
+
+![Project filters](https://static.iterative.ai/img/studio/project_filters.png)
+
+There are two types of filters:
+
+- **Quick filters** (highlighted in orange above): Use the quick filter buttons
+ to
+
+ - Show only DVC experiments
+ - Show only selected experiments
+ - Toggle hidden commits (include or exclude hidden commits in the project
+ table)
+
+- **Custom filters** (highlighted in purple above): Filter commits by one or
+ more of the following fields:
+
+ - Column values (values of metrics, hyperparameters, etc.) and their deltas
+ - Git related fields such as Git branch, commit message, tag and author
+
+
+
+ The `Branch` filter displays only the specified branch and its commits.
+
+ On the other hand, the `Commits on branch` filter will also display branches
+ [inside which the specified branch is nested](#nested-branches).
+
+ When a Git branch is nested inside another branch, the project table
+ [does not display the nested branch](#nested-branches). If
+ `feature-branch-1` is nested within `main`, `feature-branch-1` is NOT
+ displayed in the project table even if you apply the
+ `Branch = feature-branch-1` filter.
+
+ In this case, if you would like to filter for commits in `feature-branch-1`,
+ you should use the `Commits on branch = feature-branch-1` filter. This will
+ display the `main` branch with commits that were merged from
+ `feature-branch-1` into `main`. A hint is present to indicate that even
+ though the commits appear inside `main`, they are part of the nested branch
+ `feature-branch-1`.
+
+ ![Result of commits on branch filter](https://static.iterative.ai/img/studio/commits_on_branch_filter.png)
+
+
### Columns:
Select the columns you want to display and hide the rest.
![Showing and hiding columns](https://static.iterative.ai/img/studio/show_hide_columns.gif)
-You can also click and drag the columns in the table to rearrange them.
-
If your project is missing some required columns or includes columns that you do
not want, refer to the following troubleshooting sections:
- [Project does not contain the columns that I want](/doc/studio/troubleshooting#project-does-not-contain-the-columns-that-i-want)
- [Project contains columns that I did not import](/doc/studio/troubleshooting#project-contains-columns-that-i-did-not-import)
+To reorder the columns, click and drag them in the table or from the Columns
+dropdown.
+![Reordering columns](https://static.iterative.ai/img/studio/reorder_columns.gif)
+
### Hide commits:
Commits can be hidden from the project table in the following ways:
@@ -98,39 +140,56 @@ Commits can be hidden from the project table in the following ways:
commits that do not add much value in your project. To hide a commit or
branch, click on the 3-dot menu next to the commit or branch name and click on
`Hide commit` or `Hide branch`.
+
+ ![Hide commit](https://static.iterative.ai/img/studio/hide_commit.png)
+
- **Unhide commits:** You can unhide commits as needed, so that you don't lose
any experimentation history. To display all hidden commits, click on the
- `Show hidden commits` toggle (refer [the above gif](#display-preferences)).
- This will display all hidden commits, with a `hidden` (closed eye) indicator.
+ `Show hidden commits` toggle (refer to [filters](#filters)). This will display
+ all hidden commits, with a `hidden` (closed eye) indicator.
+
+ ![Hidden commit indicator](https://static.iterative.ai/img/studio/hidden_commit_indicator.png)
+
To unhide any commit, click on the 3-dot menu for that commit and click on
`Show commit`.
-### Selected only:
+ ![Show hidden commit](https://static.iterative.ai/img/studio/show_hidden_commit.png)
-Toggle between showing and hiding experiments that you have not selected.
+### Delta mode
-### Delta mode:
+For metrics, models and files columns with numeric values, you can display
+either the absolute values or their delta (difference) from the baseline row. To
+toggle between these two options, use the `Delta mode` button.
-Toggle between absolute values and difference from the baseline row.
+![Delta mode](https://static.iterative.ai/img/studio/delta_mode.png)
### Save changes:
-Save your filters or column display preferences so that these preferences remain
-intact even after you log out of Iterative Studio and log back in later.
+Whenever you make any changes to your project's columns, commits or filters, a
+notification to save or discard your changes is displayed at the top of the
+project table. Saved changes remain intact even after you log out of Iterative
+Studio and log back in later.
+
+![Save or discard changes](https://static.iterative.ai/img/studio/save_discard_changes.png)
## Visualize, compare and run experiments
Use the following buttons to visualize, compare and run experiments:
-- **Show plots:** Open the `Plots` pane and [display plots] for the selected
- commits.
+- **Plots:** Open the `Plots` pane and [display plots] for the selected commits.
+- **Trends:** [Generate trend charts] to see how the metrics have changed over
+ time.
- **Compare:** [Compare experiments] side by side.
- **Run:** [Run experiments] and [track results in real
time][live-metrics-and-plots].
-- **Trends:** [Generate trend charts] to see how the metrics have changed over
- time.
-![](https://static.iterative.ai/img/studio/view_components_3.gif)
+These buttons appear above your project table as shown below.
+![Project action buttons on a large screen](https://static.iterative.ai/img/studio/project_action_buttons_big_screen.png)
+
+On smaller screens, the buttons might appear without text labels, as shown
+below.
+
+![Project action buttons on a small screen](https://static.iterative.ai/img/studio/project_action_buttons_small_screen.png)
## Export project data
@@ -143,6 +202,7 @@ Below is an example of the downloaded CSV file.
![example export to csv](https://static.iterative.ai/img/studio/project_export_to_csv_example.png)
+[DVCLive]: /doc/dvclive
[display plots]:
/doc/studio/user-guide/projects-and-experiments/visualize-and-compare#display-plots-and-images
[Compare experiments]:
From 9b8b48d30e3f7563b4f547fa33062a2c8d5766f7 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
<41898282+github-actions[bot]@users.noreply.github.com>
Date: Thu, 24 Aug 2023 11:21:55 -0700
Subject: [PATCH 2/4] dvc 3.16.0 (#4797)
Co-authored-by: Olivaw[bot]
---
src/components/DownloadButton/index.tsx | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/components/DownloadButton/index.tsx b/src/components/DownloadButton/index.tsx
index 7fb6a42cdd..85c37e494e 100644
--- a/src/components/DownloadButton/index.tsx
+++ b/src/components/DownloadButton/index.tsx
@@ -9,7 +9,7 @@ import { logEvent } from '@dvcorg/gatsby-theme-iterative/src/utils/front/plausib
import * as styles from './styles.module.css'
import { OS, useUserOS } from '../../utils/front/useUserOS'
-const VERSION = `3.15.3`
+const VERSION = `3.16.0`
const dropdownItems = [
OS.UNKNOWN,
From b07857f304cfcf0fe3e0fde38c25fd9df7ccd885 Mon Sep 17 00:00:00 2001
From: Dave Berenbaum
Date: Thu, 24 Aug 2023 18:41:13 -0400
Subject: [PATCH 3/4] dvclive-first metrics and plots (#4795)
* dvclive-first metrics and plots
* fix linting issues
* make clear that adding metrics/plots outs is optional
* minor updates
---
.../docs/command-reference/metrics/diff.md | 51 ++++++-----
.../docs/command-reference/metrics/index.md | 84 +++++++++----------
content/docs/dvclive/how-it-works.md | 13 +--
content/docs/dvclive/index.md | 18 ++++
.../docs/user-guide/integrations/sagemaker.md | 4 +-
.../pipelines/defining-pipelines.md | 26 ++----
.../project-structure/dvcyaml-files.md | 17 +++-
7 files changed, 116 insertions(+), 97 deletions(-)
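
Reviewer note (sits between the diffstat and the first `diff --git` line, so
it is not part of the commit): the `Change` column in the `dvc metrics diff`
examples below is the new value minus the old one. Here is a minimal sketch of
that arithmetic, assuming flat JSON metrics like those in
`dvclive/metrics.json`. The helper `metrics_change` is hypothetical and is not
DVC's implementation; the sample values are copied from the
`metrics/index.md` example in this patch.

```python
import json


def metrics_change(old, new):
    """Return new minus old for each numeric metric present in both dicts."""
    changes = {}
    for name, new_value in new.items():
        old_value = old.get(name)
        both_numeric = isinstance(old_value, (int, float)) and isinstance(
            new_value, (int, float)
        )
        if both_numeric:
            # Round to 5 decimals so float noise does not obscure the result.
            changes[name] = round(new_value - old_value, 5)
    return changes


# Values from the HEAD and workspace columns of the example table.
head = json.loads('{"AUC": 0.65115, "error": 0.1666, "TP": 528}')
workspace = json.loads('{"AUC": 0.66729, "error": 0.16982, "TP": 516}')

changes = metrics_change(head, workspace)
```

Metrics that are missing on one side or that hold strings are skipped in this
sketch; how DVC itself reports those cases is not covered here.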
diff --git a/content/docs/command-reference/metrics/diff.md b/content/docs/command-reference/metrics/diff.md
index deee9d71b0..328d8d4ebb 100644
--- a/content/docs/command-reference/metrics/diff.md
+++ b/content/docs/command-reference/metrics/diff.md
@@ -88,31 +88,38 @@ all the current metrics (without comparisons).
## Examples
-Start by creating a metrics file and commit it (see the `-M` option of
-`dvc stage add` for more details):
+Start with a simple Python script to generate metrics:
-```cli
-$ dvc stage add -n eval -M metrics.json \
- 'echo {"AUC": 0.9643, "TP": 527} > metrics.json'
+```python
+# train.py
+import random
+from dvclive import Live
-$ dvc repro
+with Live() as live:
+ live.log_metric("AUC", random.random())
+ live.log_metric("TP", random.randint(0, 1000))
+```
-$ cat metrics.json
-{"AUC": 0.9643, "TP": 527}
+Run the script and commit it:
-$ git add dvc.* metrics.json
-$ git commit -m "Add metrics file"
+```cli
+$ python train.py
+$ git add train.py dvclive
+$ git commit -m "Add metrics"
```
Now let's simulate a change in our AUC metric:
```cli
-$ echo '{"AUC":0.9671, "TP":531}' > metrics.json
-
-$ git diff
-...
--{"AUC":0.9643, "TP":527}
-+{"AUC":0.9671, "TP":531}
+$ python train.py
+
+$ git diff -- dvclive/metrics.json
+ {
+- "AUC": 0.7891189181402177,
+- "TP": 215
++ "AUC": 0.18113944203594523,
++ "TP": 768
+ }
```
To see the change, let's run `dvc metrics diff`. This compares our current
@@ -121,9 +128,9 @@ had in the latest commit (`HEAD`):
```cli
$ dvc metrics diff
-Path Metric HEAD workspace Change
-metrics.json AUC 0.9643 0.9671 0.0028
-metrics.json TP 527 531 4
+Path Metric HEAD workspace Change
+dvclive/metrics.json AUC 0.78912 0.18114 -0.60798
+dvclive/metrics.json TP 215 768 553
```
## Example: compare metrics among specific versions
@@ -133,7 +140,7 @@ two [revisions](https://git-scm.com/docs/revisions)):
```cli
$ dvc metrics diff --targets metrics.json -- 305fb8b c7bef55
-Path Metric 305fb8b c7bef55 Change
-metrics.json AUC 0.9643 0.9743 0.0100
-metrics.json TP 527 516 -11
+Path Metric 305fb8b c7bef55 Change
+dvclive/metrics.json AUC 0.9643 0.9743 0.0100
+dvclive/metrics.json TP 527 516 -11
```
diff --git a/content/docs/command-reference/metrics/index.md b/content/docs/command-reference/metrics/index.md
index 87b26b69b8..ea7d743b78 100644
--- a/content/docs/command-reference/metrics/index.md
+++ b/content/docs/command-reference/metrics/index.md
@@ -18,16 +18,13 @@ positional arguments:
## Description
In order to follow the performance of machine learning experiments, DVC has the
-ability to mark stage outputs or other files as metrics. These
-metrics are project-specific floating-point or integer values e.g. AUC, ROC,
-false positives, etc.
+ability to mark [structured files](#supported-file-formats) containing key/value
+pairs as metrics. These metrics are project-specific floating-point, integer, or
+string values e.g. AUC, ROC, false positives, etc.
-In pipelines, metrics files are typically generated by user data
-processing code, and are tracked using the `-m` (`--metrics`) and `-M`
-(`--metrics-no-cache`) options of `dvc stage add`. If using
-[DVCLive](/doc/dvclive/live/log_metric), the files are generated and tracked
-automatically. Metrics files may also may be manually added to
-[`dvc.yaml`](/doc/user-guide/project-structure/dvcyaml-files).
+If using [DVCLive](/doc/dvclive/live/log_metric), the files are generated and
+metrics are configured automatically. Metrics files may also be manually added
+to [`dvc.yaml`](/doc/user-guide/project-structure/dvcyaml-files).
In contrast to `dvc plots`, these metrics should be stored in hierarchical
files. Unlike its `dvc plots` counterpart, `dvc metrics diff` can report the
@@ -42,26 +39,14 @@ metrics.json AUC 0.763981 0.801807 0.037826
`dvc metrics` subcommands can be used on any
[valid metrics files](#supported-file-formats). By default they use the ones
-specified in `dvc.yaml` (if any), for example `summary.json` below:
+specified in `dvc.yaml` (if any), including those added automatically by
+DVCLive. For example, `summary.json` below:
```yaml
-stages:
- train:
- cmd: python train.py
- deps:
- - users.csv
- outs:
- - model.pkl
- metrics:
- - summary.json:
- cache: false
+metrics:
+ - summary.json
```
-> `cache: false` above specifies that `summary.json` is not tracked or
-> cached by DVC (`-M` option of `dvc stage add`). These metrics
-> files are normally committed with Git instead. See `dvc.yaml` for more
-> information on the file format above.
-
### Supported file formats
Metrics can be organized as tree hierarchies in JSON, TOML 1.0, or YAML 1.2
@@ -96,29 +81,44 @@ to compare and pick the best performing experiment.
## Examples
-First, let's imagine we have a simple [stage](/doc/command-reference/run) that
-produces an `eval.json` metrics file:
+First, let's imagine we have a simple Python script using DVCLive to log some
+metrics:
-```cli
-$ dvc stage add -n evaluate -d code/evaluate.py -M eval.json \
- python code/evaluate.py
+```python
+from dvclive import Live
-$ dvc repro
+with Live() as live:
+ ...
+ live.log_metric("AUC", auc)
+ live.log_metric("error", error)
+ live.log_metric("TP", tp)
```
-> `-M` (`--metrics-no-cache`) tells DVC to mark `eval.json` as a metrics file,
-> without tracking it directly (You can track it with Git). See `dvc stage add`
-> for more info.
+This will generate some log files, including `dvclive/metrics.json`, which looks
+like:
+
+```json
+{
+ "AUC": 0.66729,
+ "error": 0.16982,
+ "TP": 516
+}
+```
+
+It will also generate `dvclive/dvc.yaml`, which includes:
+
+```yaml
+metrics:
+ - metrics.json
+```
Now let's print metrics values that we are tracking in this
project, using `dvc metrics show`:
```cli
$ dvc metrics show
- eval.json:
- AUC: 0.66729
- error: 0.16982
- TP: 516
+Path AUC TP error
+dvclive/metrics.json 0.66729 516 0.16982
```
When there are metrics file changes (before committing them with Git), the
@@ -127,8 +127,8 @@ When there are metrics file changes (before committing them with Git), the
```cli
$ dvc metrics diff
-Path Metric HEAD workspace Change
-eval.json AUC 0.65115 0.66729 0.01614
-eval.json error 0.1666 0.16982 0.00322
-eval.json TP 528 516 -12
+Path Metric HEAD workspace Change
+dvclive/metrics.json AUC 0.65115 0.66729 0.01614
+dvclive/metrics.json error 0.1666 0.16982 0.00322
+dvclive/metrics.json TP 528 516 -12
```
diff --git a/content/docs/dvclive/how-it-works.md b/content/docs/dvclive/how-it-works.md
index 9c3146c8dc..0cc41557c9 100644
--- a/content/docs/dvclive/how-it-works.md
+++ b/content/docs/dvclive/how-it-works.md
@@ -107,19 +107,12 @@ Using `Live.log_image()` to log multiple images may also grow too large to track
with Git, in which case you can use
[`Live(cache_images=True)`](/doc/dvclive/live#parameters) to cache them.
-## Run with DVC
-
-Experimenting in Python interactively (like in notebooks) is great for
-exploration, but eventually you may need a more structured way to run
-reproducible experiments (for example, running a multi-step pipeline or queueing
-multiple experiments). By configuring DVC [pipelines], you can
-[run experiments](/doc/user-guide/experiment-management/running-experiments)
-with `dvc exp run`. This will track the inputs and outputs of your code, and
-also enable features like queuing, parameter tuning, and grid searches.
+## Setup to Run with DVC
DVCLive by default [generates] its own `dvc.yaml` file to configure the
experiment results, but you can create your own `dvc.yaml` file at the base of
-your repository (or elsewhere) to define a [pipeline](#run-with-dvc) or
+your repository (or elsewhere) to define a [pipeline](#setup-to-run-with-dvc) to
+run experiments with DVC or
[customize plots](/doc/user-guide/experiment-management/visualizing-plots#defining-plots).
Do not reuse the DVCLive `dvc.yaml` file since it gets overwritten during each
experiment run. A pipeline stage for model training might look like:
diff --git a/content/docs/dvclive/index.md b/content/docs/dvclive/index.md
index 175ccd1b0a..f7018df03b 100644
--- a/content/docs/dvclive/index.md
+++ b/content/docs/dvclive/index.md
@@ -154,3 +154,21 @@ with Live(save_dvc_exp=True) as live:
After you run your training code, all the logged data will be stored in the
`dvclive` directory. Check the [DVCLive outputs](/doc/dvclive/how-it-works) page
for more details.
+
+## Run with DVC
+
+Experimenting in Python interactively (like in notebooks) is great for
+exploration, but eventually you may need a more structured way to run
+reproducible experiments. By configuring DVC [pipelines], you can [run
+experiments] with `dvc exp run`. This will track the inputs and outputs of code,
+and enable more advanced workflows like multi-step pipelines and queueing
+multiple experiments or even an entire grid search. See examples of how to [add
+DVCLive to a pipeline] or [add a pipeline to DVCLive code], or get more
+information about how to [set up a pipeline][setup a pipeline] to work with DVCLive.
+
+[run experiments]:
+ /doc/user-guide/experiment-management/running-experiments
+[pipelines]: /doc/user-guide/pipelines
+[add DVCLive to a pipeline]: /doc/start/data-management/metrics-parameters-plots
+[add a pipeline to DVCLive code]: /doc/start/experiments/experiment-pipelines
+[setup a pipeline]: /doc/dvclive/how-it-works#setup-to-run-with-dvc
diff --git a/content/docs/user-guide/integrations/sagemaker.md b/content/docs/user-guide/integrations/sagemaker.md
index 6dd697f49f..ca2c90081a 100644
--- a/content/docs/user-guide/integrations/sagemaker.md
+++ b/content/docs/user-guide/integrations/sagemaker.md
@@ -64,7 +64,9 @@ modified easily. The DVC pipeline stage is defined in `dvc.yaml` like this:
```yaml
prepare:
cmd:
- - wget https://sagemaker-sample-data-us-west-2.s3-us-west-2.amazonaws.com/autopilot/direct_marketing/bank-additional.zip -O bank-additional.zip
+ - wget
+ https://sagemaker-sample-data-us-west-2.s3-us-west-2.amazonaws.com/autopilot/direct_marketing/bank-additional.zip
+ -O bank-additional.zip
- python sm_prepare.py --bucket ${bucket} --prefix ${prefix}
deps:
- sm_prepare.py
diff --git a/content/docs/user-guide/pipelines/defining-pipelines.md b/content/docs/user-guide/pipelines/defining-pipelines.md
index f6bfdcbccc..eb24ea4615 100644
--- a/content/docs/user-guide/pipelines/defining-pipelines.md
+++ b/content/docs/user-guide/pipelines/defining-pipelines.md
@@ -209,31 +209,17 @@ Use `dvc params diff` to compare parameters across project versions.
## Outputs
Stage outputs are files (or directories) written by pipelines, for
-example machine learning models, intermediate artifacts, as well as data [plots]
-and performance [metrics]. These files are cached by DVC
-automatically, and tracked with the help of `dvc.lock` files (or `.dvc` files,
-see `dvc add`).
+example machine learning models and intermediate artifacts. These files are
+cached by DVC automatically, and tracked with the help of
+`dvc.lock` files (or `.dvc` files, see `dvc add`).
Outputs can be dependencies of subsequent stages (as explained earlier). So when
they change, DVC may need to reproduce downstream stages as well (handled
automatically).
-The types of outputs are:
-
-- Files and directories: Typically data to feed to intermediate stages, as well
- as the final results of a pipeline (e.g. a dataset or an ML model).
-
-- [Metrics]: DVC supports small text files that usually contain model
- performance metrics from the evaluation, validation, or testing phases of the
- ML lifecycle. DVC allows to compare produced metrics with one another using
- `dvc metrics diff` and presents the results as a table with `dvc metrics show`
- or `dvc exp show`.
-
-- [Plots]: Different kinds of data that can be visually graphed. For example
- contrast ML performance statistics or continuous metrics from multiple
- experiments. `dvc plots show` can generate charts for certain data files or
- render custom image files for you, or you can compare different ones with
- `dvc plots diff`.
+DVC can also track [metrics] and [plots] files, which can optionally be added as
+stage outputs, or even added with `cache: false` in `dvc.yaml` since they are
+often small enough to store in Git.
diff --git a/content/docs/user-guide/project-structure/dvcyaml-files.md b/content/docs/user-guide/project-structure/dvcyaml-files.md
index bcb8c1e4fa..c833c65c9a 100644
--- a/content/docs/user-guide/project-structure/dvcyaml-files.md
+++ b/content/docs/user-guide/project-structure/dvcyaml-files.md
@@ -57,7 +57,7 @@ metrics:
Metrics are key/value pairs saved in structured files that map a metric name to
a numeric value. See `dvc metrics` for more information and how to compare among
-experiments.
+experiments, or [DVCLive] for a helper to log metrics.
## Params
@@ -90,7 +90,8 @@ DVC will create separate rendering for each type.
-Refer to [Visualizing Plots] and `dvc plots show` for more examples.
+Refer to [Visualizing Plots] and `dvc plots show` for more examples, and refer
+to [DVCLive] for a helper to log plots.
[visualizing plots]: /doc/user-guide/experiment-management/visualizing-plots
@@ -353,6 +354,16 @@ See also `dvc params diff` to compare params across project version.
### Metrics and Plots outputs
+
+
+Metrics and plots outputs described below come from earlier versions of DVC and
+remain as a convenience. You can instead define metrics and plots separately from
+your pipeline with [DVCLive] or add "top-level" [metrics](#metrics) and
+[plots](#plots). You can optionally include them as regular `outs` in the
+pipeline.
+
+
+
Like common outputs, metrics and plots files are
produced by the stage `cmd`. However, their purpose is different. Typically they
contain metadata to evaluate pipeline processes. Example:
@@ -898,3 +909,5 @@ Full parameter dependencies (both key and value) are listed too
`dvc.lock` (no `${}` expression). As for [`foreach` stages](#foreach-stages) and
[`matrix` stages](#matrix-stages), individual stages are expanded (no `foreach`
or `matrix` structures are preserved).
+
+[DVCLive]: /doc/dvclive
From 9ad0b97ad07d1baab74cabb70be221955004f85b Mon Sep 17 00:00:00 2001
From: Dave Berenbaum
Date: Fri, 25 Aug 2023 10:10:54 -0400
Subject: [PATCH 4/4] drop metrics/plots stage outputs (#4798)
---
content/docs/command-reference/plots/diff.md | 3 -
content/docs/command-reference/plots/index.md | 7 +-
.../docs/command-reference/plots/modify.md | 167 ----------------
content/docs/command-reference/plots/show.md | 9 +-
content/docs/sidebar.json | 4 -
.../visualizing-plots.md | 47 -----
.../project-structure/dvcyaml-files.md | 187 ++++++------------
7 files changed, 66 insertions(+), 358 deletions(-)
delete mode 100644 content/docs/command-reference/plots/modify.md
diff --git a/content/docs/command-reference/plots/diff.md b/content/docs/command-reference/plots/diff.md
index 5c99f3750c..5bc0023102 100644
--- a/content/docs/command-reference/plots/diff.md
+++ b/content/docs/command-reference/plots/diff.md
@@ -41,9 +41,6 @@ specified with the `--targets` option (any valid plots file is accepted).
The plot style can be customized with [plot templates], using the `--template`
option. See `dvc plots` to learn more about plots files and templates.
-> Note that the default behavior of this command can be modified per metrics
-> file with `dvc plots modify`.
-
Another way to display plots is the `dvc plots show` command, which just lists
all the current plots, without comparisons.
diff --git a/content/docs/command-reference/plots/index.md b/content/docs/command-reference/plots/index.md
index 6e2e436de9..30c581f083 100644
--- a/content/docs/command-reference/plots/index.md
+++ b/content/docs/command-reference/plots/index.md
@@ -2,14 +2,13 @@
A set of commands to visualize and compare data series or images from ML
projects: [show](/doc/command-reference/plots/show),
-[diff](/doc/command-reference/plots/diff),
-[modify](/doc/command-reference/plots/modify) and
+[diff](/doc/command-reference/plots/diff), and
[templates](/doc/command-reference/plots/templates).
## Synopsis
```usage
-usage: dvc plots [-h] [-q | -v] {show,diff,modify,templates} ...
+usage: dvc plots [-h] [-q | -v] {show,diff,templates} ...
positional arguments:
COMMAND
@@ -17,8 +16,6 @@ positional arguments:
definitions in `dvc.yaml`.
diff Show multiple versions of a plot by overlaying them
in a single image.
- modify Modify display properties of data-series plots
- defined in stages (has no effect on image plots).
templates List built-in plots templates or show JSON
specification for one.
```
diff --git a/content/docs/command-reference/plots/modify.md b/content/docs/command-reference/plots/modify.md
deleted file mode 100644
index 1fe72071e6..0000000000
--- a/content/docs/command-reference/plots/modify.md
+++ /dev/null
@@ -1,167 +0,0 @@
-# plots modify
-
-Modify display properties of data-series [plots](/doc/command-reference/plots)
-defined in stages.
-
-> ⚠️ Note that this command can modify only data-series plots. It has no effect
-> on image-type plots or any [top-level plot] definitions.
-
-[top-level plot]: /doc/user-guide/project-structure/dvcyaml-files#plots
-
-## Synopsis
-
-```usage
-usage: dvc plots modify [-h] [-q | -v] [-t ] [-x ]
- [-y ] [--no-header] [--title ]
- [--x-label ] [--y-label ]
- [--unset [ [ ...]]]
- target
-
-positional arguments:
- target Plots file to set properties for
- (defined at the stage level)
-```
-
-## Description
-
-It might be not convenient for users or automation systems to specify all the
-_display properties_ (such as `y-label`, `template`, `title`, etc.) each time
-plots are generated with `dvc plots show` or `dvc plots diff`. This command sets
-(or unsets) default display properties for a specific plots file.
-
-The path to the plots file `target` is required. It must be listed in a
-`dvc.yaml` file (see the `--plots` option of `dvc stage add`).
-`dvc plots modify` adds the display properties to `dvc.yaml`.
-
-Property names are passed as [options](#options) to this command (prefixed with
-`--`). These are based on the [Vega-Lite](https://vega.github.io/vega-lite/)
-specification.
-
-Note that a secondary use of this command is to convert output or simple
-`dvc metrics` file into a plots file (see an
-[example](#example-convert-any-output-into-a-plot)).
-
-## Options
-
-- `-t , --template ` - set a default
- [plot template](/doc/user-guide/experiment-management/visualizing-plots#plot-templates-data-series-only).
-
-- `-x ` - set a default field or column name (or number) from which the X
- axis data comes from.
-
-- `-y ` - set a default field or column name (or number) from which the Y
- axis data comes from.
-
-- `--x-label ` - set a default title for the X axis.
-
-- `--y-label ` - set a default title for the Y axis.
-
-- `--title ` - set a default plot title.
-
-- `--unset [ [ ...]]` - unset one or more display
- properties. Use the property name(s) without `--` in the argument sent to this
- option.
-
-- `--no-header` - lets DVC know that the `target` CSV or TSV does not have a
- header. A 0-based numeric index can be used to identify each column instead of
- names.
-
-- `-h`, `--help` - prints the usage/help message, and exit.
-
-- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
- problems arise, otherwise 1.
-
-- `-v`, `--verbose` - displays detailed tracing information.
-
-## Examples
-
-The initial plot was showing the last column of CSV file by default which is
-_loss_ metrics while _accuracy_ is expected as Y axis:
-
-```
-epoch,accuracy,loss
-0,0.9403833150863647,0.2019129991531372
-1,0.9733833074569702,0.08973673731088638
-2,0.9815833568572998,0.06529958546161652
-3,0.9861999750137329,0.04984375461935997
-4,0.9882333278656006,0.041892342269420624
-```
-
-```cli
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_loss.svg)
-
-Changing the y-axis to _accuracy_:
-
-```cli
-$ dvc plots modify logs.csv -y accuracy
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_acc.svg)
-
-Note that a new field _y_ was added to `dvc.yaml` file for the plot. Make sure
-to commit the change in Git if the modification needs to be preserved.
-
-```yaml
-plots:
-  - logs.csv:
-      cache: false
-      y: accuracy
-```
-
-Changing the plot `title` and `x-label`:
-
-```cli
-$ dvc plots modify logs.csv --title Accuracy -x epoch --x-label Epoch
-$ dvc plots show logs.csv
-file:///Users/usr/src/myclassifier/logs.html
-```
-
-![](/img/plots_mod_acc_titles.svg)
-
-Two new fields were added to `dvc.yaml`: `x-label` and `title`:
-
-```yaml
-plots:
-  - plots.csv:
-      cache: false
-      y: accuracy
-      x_label: epoch
-      title: Accuracy
-```
-
-## Example: Template change
-
-Something like `dvc stage add --plots file.csv ...` assigns the default
-template, which needs to be changed in many cases. This command can do so:
-
-```cli
-$ dvc plots modify classes.csv --template confusion
-```
-
-## Example: Convert any output into a plot
-
-Let's take an example `evaluate` stage which has `logs.csv` as an output. We can
-use `dvc plots modify` to convert the `logs.csv` output file into a plots file,
-and then confirm the changes that happened in `dvc.yaml`:
-
-```cli
-$ dvc plots modify logs.csv
-```
-
-```git
- evaluate:
- cmd: python src/evaluate.py
- deps:
- - src/evaluate.py
-- outs:
-- - logs.csv
- plots:
- - scores.json
-+ - logs.csv
-```
diff --git a/content/docs/command-reference/plots/show.md b/content/docs/command-reference/plots/show.md
index defee7b828..099bb44dcf 100644
--- a/content/docs/command-reference/plots/show.md
+++ b/content/docs/command-reference/plots/show.md
@@ -30,13 +30,6 @@ All plots defined in `dvc.yaml` are used by default, but you can specify any
The plot style can be customized with [plot templates], using the `--template`
option. To learn more about plots file formats and templates, see `dvc plots`.
-
-
-The default behavior of this command can be modified per [stage plot] file with
-`dvc plots modify`.
-
-
-
[certain data]:
/doc/user-guide/experiment-management/visualizing-plots#supported-plot-file-formats
[plot templates]:
@@ -205,7 +198,7 @@ $ dvc plots show --no-header logs.csv -y 2
file:///Users/usr/src/dvc_plots/index.html
```
-## Example: Top-level plots
+## Example: `dvc.yaml` plots
### Simple plot definition
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
index 5cbd5b3928..fda47b7801 100644
--- a/content/docs/sidebar.json
+++ b/content/docs/sidebar.json
@@ -429,10 +429,6 @@
"label": "plots diff",
"slug": "diff"
},
- {
- "label": "plots modify",
- "slug": "modify"
- },
{
"label": "plots templates",
"slug": "templates"
diff --git a/content/docs/user-guide/experiment-management/visualizing-plots.md b/content/docs/user-guide/experiment-management/visualizing-plots.md
index c5a724bc8c..817e31a426 100644
--- a/content/docs/user-guide/experiment-management/visualizing-plots.md
+++ b/content/docs/user-guide/experiment-management/visualizing-plots.md
@@ -188,53 +188,6 @@ Refer to the [full format specification] and `dvc plots show` for more details.
-### Plot outputs
-
-Plots can use any file defined in the project, including outputs of
-[pipelines]:
-
-```yaml
-plots:
-  - logs.csv:
-      x: epoch
-      y: loss
-stages:
-  build:
-    cmd: python train.py
-    outs:
-      - logs.csv
-    ...
-```
-
-Alternatively, when defining [pipelines], some outputs (both files
-and directories) can be placed under a `plots` list for the corresponding stage
-in `dvc.yaml`. This will tell DVC that they are intended for visualization.
-
-
-
-When using `dvc stage add`, use `--plots/--plots-no-cache` instead of
-`--outs/--outs-no-cache`.
-
-
-
-```yaml
-stages:
-  build:
-    cmd: python train.py
-    plots:
-      - logs.csv:
-          x: epoch
-          y: loss
-    ...
-```
-
-Marking stage outputs as plots is convenient for working with plots at the stage
-level, without having to write top-level `plots` definitions in `dvc.yaml`.
-However, stage-level plots do not support custom plot IDs or multiple data
-sources.
-
-[pipelines]: /doc/start/data-management/data-pipelines
-
## Plot templates (data-series only)
DVC uses [Vega-Lite](https://vega.github.io/vega-lite/) JSON specifications to
diff --git a/content/docs/user-guide/project-structure/dvcyaml-files.md b/content/docs/user-guide/project-structure/dvcyaml-files.md
index c833c65c9a..20880c399d 100644
--- a/content/docs/user-guide/project-structure/dvcyaml-files.md
+++ b/content/docs/user-guide/project-structure/dvcyaml-files.md
@@ -82,12 +82,6 @@ directory path (relative to the location of `dvc.yaml`) or an arbitrary string.
If the ID is an arbitrary string, a file path must be provided in the `y` field
(`x` file path is always optional and cannot be the only path provided).
-In addition to these "top-level plots," users can mark specific stage
-outputs as [plot outputs](#metrics-and-plots-outputs). DVC will
-collect both types and display everything conforming to each plot configuration.
-If any stage plot files or directories are also used in a top-level definition,
-DVC will create separate rendering for each type.
-
Refer to [Visualizing Plots] and `dvc plots show` for more examples, and refer
@@ -99,75 +93,66 @@ to [DVCLive] for a helper to log plots.
### Available configuration fields
-- `y` - source for the Y axis data:
-
-  - **Top-level plots** (_string, list, dict_):
-
-    If plot ID is a path, one or more column/field names is expected. For
-    example:
-
-    ```yaml
-    plots:
-      - regression_hist.csv:
-          y: mean_squared_error
-      - classifier_hist.csv:
-          y: [acc, loss]
-    ```
-
-    If plot ID is an arbitrary string, a dictionary of file paths mapped to
-    column/field names is expected. For example:
-
-    ```yaml
-    plots:
-      - train_val_test:
-          y:
-            train.csv: [train_acc, val_acc]
-            test.csv: test_acc
-    ```
-
-  - **Plot outputs** (_string_): one column/field name.
-
-- `x` - source for the X axis data. An auto-generated _step_ field is used by
-  default.
-
-  - **Top-level plots** (_string, dict_):
-
-    If plot ID is a path, one column/field name is expected. For example:
-
-    ```yaml
-    plots:
-      - classifier_hist.csv:
-          y: [acc, loss]
-          x: epoch
-    ```
-
-    If plot ID is an arbitrary string, `x` may either be one column/field name,
-    or a dictionary of file paths each mapped to one column/field name (the
-    number of column/field names must match the number in `y`).
-
-    ```yaml
-    plots:
-      - train_val_test: # single x
-          y:
-            train.csv: [train_acc, val_acc]
-            test.csv: test_acc
-          x: epoch
-      - roc_vs_prc: # x dict
-          y:
-            precision_recall.json: precision
-            roc.json: tpr
-          x:
-            precision_recall.json: recall
-            roc.json: fpr
-      - confusion: # different x and y paths
-          y:
-            dir/preds.csv: predicted
-          x:
-            dir/actual.csv: actual
-          template: confusion
-    ```
-
-  - **Plot outputs** (_string_): one column/field name.
+- `y` (_string, list, dict_) - source for the Y axis data:
+
+  If plot ID is a path, one or more column/field names are expected. For example:
+
+  ```yaml
+  plots:
+    - regression_hist.csv:
+        y: mean_squared_error
+    - classifier_hist.csv:
+        y: [acc, loss]
+  ```
+
+  If plot ID is an arbitrary string, a dictionary of file paths mapped to
+  column/field names is expected. For example:
+
+  ```yaml
+  plots:
+    - train_val_test:
+        y:
+          train.csv: [train_acc, val_acc]
+          test.csv: test_acc
+  ```
+
+- `x` (_string, dict_) - source for the X axis data. An auto-generated _step_
+  field is used by default.
+
+  If plot ID is a path, one column/field name is expected. For example:
+
+  ```yaml
+  plots:
+    - classifier_hist.csv:
+        y: [acc, loss]
+        x: epoch
+  ```
+
+  If plot ID is an arbitrary string, `x` may either be one column/field name, or
+  a dictionary of file paths each mapped to one column/field name (the number of
+  column/field names must match the number in `y`).
+
+  ```yaml
+  plots:
+    - train_val_test: # single x
+        y:
+          train.csv: [train_acc, val_acc]
+          test.csv: test_acc
+        x: epoch
+    - roc_vs_prc: # x dict
+        y:
+          precision_recall.json: precision
+          roc.json: tpr
+        x:
+          precision_recall.json: recall
+          roc.json: fpr
+    - confusion: # different x and y paths
+        y:
+          dir/preds.csv: predicted
+        x:
+          dir/actual.csv: actual
+        template: confusion
+  ```
- `y_label` (_string_) - Y axis label. If all `y` data sources have the same
field name, that will be the default. Otherwise, it's "y".
@@ -175,10 +160,8 @@ to [DVCLive] for a helper to log plots.
- `x_label` (_string_) - X axis label. If all `y` data sources have the same
field name, that will be the default. Otherwise, it's "x".
-- `title` (_string_) - header for the plot(s). Defaults:
-
- - **Top-level plots**: `path/to/dvc.yaml::plot_id`
- - **Plot outputs**: `path/to/data.csv`
+- `title` (_string_) - header for the plot(s). Defaults to
+ `path/to/dvc.yaml::plot_id`.
- `template` (_string_) - [plot template]. Defaults to `linear`.
@@ -235,7 +218,7 @@ them).
-Output files may be viable data sources for [top-level plots](#plots).
+Output files may be viable data sources for [plots](#plots).
@@ -352,48 +335,6 @@ See also `dvc params diff` to compare params across project version.
-### Metrics and Plots outputs
-
-
-
-Metrics and plots outputs described below come from earlier versions of DVC and
-remain as a convenience. You can instead define metrics and plots separate from
-your pipeline with [DVCLive] or add "top-level" [metrics](#metrics) and
-[plots](#plots). You can optionally include them as regular `outs` in the
-pipeline.
-
-
-
-Like common outputs, metrics and plots files are
-produced by the stage `cmd`. However, their purpose is different. Typically they
-contain metadata to evaluate pipeline processes. Example:
-
-```yaml
-stages:
-  build:
-    cmd: python train.py
-    deps:
-      - features.csv
-    outs:
-      - model.pt
-    metrics:
-      - accuracy.json:
-          cache: false
-    plots:
-      - auc.json:
-          cache: false
-```
-
-
-
-`cache: false` is typical here, since they're small enough for Git to store
-directly.
-
-
-
-The commands in `dvc metrics` and `dvc plots` help you display and compare
-metrics and plots.
-
## Stage entries
These are the fields that are accepted in each stage:
@@ -405,8 +346,6 @@ These are the fields that are accepted in each stage:
| `deps` | List of dependency paths (relative to `wdir`). |
| `outs` | List of output paths (relative to `wdir`). These can contain certain optional [subfields](#output-subfields). |
| `params` | List of parameter dependency keys (field names) to track from `params.yaml` (in `wdir`). The list may also contain other parameters file names, with a sub-list of the param names to track in them. |
-| `metrics` | List of [metrics files](/doc/command-reference/metrics), and optionally, whether or not this metrics file is cached (`true` by default). See the `--metrics-no-cache` (`-M`) option of `dvc stage add`. |
-| `plots` | List of [plot metrics](/doc/command-reference/plots), and optionally, their default configuration (subfields matching the options of `dvc plots modify`), and whether or not this plots file is cached ( `true` by default). See the `--plots-no-cache` option of `dvc stage add`. |
| `frozen` | Whether or not this stage is frozen (prevented from execution during reproduction) |
| `always_changed` | Causes this stage to be always considered as [changed] by commands such as `dvc status` and `dvc repro`. `false` by default |
| `meta` | (Optional) arbitrary metadata can be added manually with this field. Any YAML content is supported. `meta` contents are ignored by DVC, but they can be meaningful for user processes that read or write `.dvc` files directly. |