Complete the setup described in the setup doc, then launch a training run with opt-baselines:
opt-baselines \
-n 2 -g 8 \
-p test_v0 \
--model-size 125m \
--azure \
--checkpoints-dir "INSERT_YOUR_CHECKPOINT_DIR" \
--no-save-dir # Remove this if you want to print out full save-dir path
To monitor training with TensorBoard:
tensorboard serve --logdir="INSERT_TENSORBOARD_LOGDIR" --bind_all --port=6018
To monitor training with Aim:
aim up --repo /path/to/log/dir --port 43800
Enable Aim to log metaseq training runs for snappy visualization and in-depth comparison.
To do so, pass the extra flag --aim-repo /path/to/log/dir to the train command.
Example train command with Aim enabled:
metaseq-train --task streaming_language_modeling \
data-bin/pile-00 \
--vocab-filename data-bin/pile-00/bpe-vocab.json \
--merges-filename data-bin/pile-00/bpe-merges.txt \
--criterion cross_entropy \
--batch-size 8 \
--save-dir /checkpoints/lm_transformer_pile-00 \
--arch transformer_lm --share-decoder-input-output-embed \
--dropout 0.1 \
--optimizer adam --weight-decay 0.01 --clip-norm 0.0 \
--lr 0.0005 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
--tokens-per-sample 1024 --sample-break-mode none --fp16 \
--aim-repo /path/to/log/dir
Once training has started, open the Aim UI to see the results:
aim up --repo /path/to/log/dir --port 43800
The following message is printed once the Aim UI is up and running:
Running Aim UI on repo `<Repo#-5930451821203570655 path=/.aim read_only=None>`
Press Ctrl+C to exit
Open your browser and navigate to the host and port the Aim UI is running on (port 43800 in the command above) to explore tracked runs.
The host and port may differ depending on your setup.
Navigate to "Metrics Explorer", select the metrics to explore from the "+ Metrics" dropdown at the top left, and click "Search".
- metaseq Aim arguments:
Arguments | Description |
aim_repo | Defines the path where collected training logs are stored. If set to ".", logs are stored in the current working directory. |
aim_run_hash | Training run hash. If omitted, a run is created or continued based on "save_dir". Otherwise, training metadata is stored in the specified run. |
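As a rough illustration of how the two arguments above combine, the sketch below builds the Aim-related part of a metaseq-train command line. The helper name `aim_flags` is hypothetical, and the flag spelling `--aim-run-hash` is an assumption inferred from `--aim-repo`; only `--aim-repo` is confirmed by this document. The sketch only prints flags and does not start training.

```shell
# Hypothetical helper: prints the Aim flags to append to metaseq-train.
# ASSUMPTION: aim_run_hash is exposed as --aim-run-hash (by analogy with
# --aim-repo); check `metaseq-train --help` before relying on it.
aim_flags() {
    repo="$1"       # path where Aim logs are stored (aim_repo)
    run_hash="$2"   # optional hash of an existing Aim run (aim_run_hash)

    flags="--aim-repo $repo"

    # Without a hash, the run is created or continued based on save_dir,
    # so the second flag is only added when a hash is supplied.
    if [ -n "$run_hash" ]; then
        flags="$flags --aim-run-hash $run_hash"
    fi

    printf '%s\n' "$flags"
}

# Log to a repo and let the run be chosen from save_dir.
aim_flags /path/to/log/dir ""

# Log to a repo and store metadata in a specific run (placeholder hash).
aim_flags /path/to/log/dir "RUN_HASH_PLACEHOLDER"
```

The printed flags can be appended to the example train command. Paths containing spaces would need extra quoting that this sketch does not handle.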
- aim up arguments:
Arguments | Description |
--repo <repo_path> | Path to the stored logs, that is, the parent directory of the .aim repo. Defaults to the current working directory. |
-h, --host <host> | Specify the host address to run the UI on. |
-p, --port <port> | Specify the port to listen on. |
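To show how the options above fit together, here is a minimal sketch that assembles an `aim up` invocation and the matching browser address from the same host and port. The helper names `aim_up_command` and `aim_ui_url` and the host value are placeholders for illustration, and the sketch assumes the UI is served over plain HTTP. It prints the command instead of running it, so Aim does not need to be installed to try it.

```shell
# Placeholder values; adjust to your setup.
AIM_REPO="/path/to/log/dir"
AIM_HOST="127.0.0.1"
AIM_PORT="43800"

# Hypothetical helper: prints the aim up command for a repo, host and port.
aim_up_command() {
    printf 'aim up --repo %s --host %s --port %s\n' "$1" "$2" "$3"
}

# Hypothetical helper: prints the address to open in the browser.
# ASSUMPTION: the UI is reachable over http at the same host and port.
aim_ui_url() {
    printf 'http://%s:%s\n' "$1" "$2"
}

aim_up_command "$AIM_REPO" "$AIM_HOST" "$AIM_PORT"
aim_ui_url "$AIM_HOST" "$AIM_PORT"
```

Keeping the host and port in variables means the command and the browser address cannot drift apart when the port is changed.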
See the full Aim documentation for more details.