(TG) TG model perf tests #467

Manually triggered November 25, 2024 07:05
Status Cancelled
Total duration 44m 50s
Artifacts 2

tg-model-perf-tests.yaml

on: workflow_dispatch
build-artifact-profiler / build-docker-image / build-docker-image (22s)
Matrix: build-artifact-profiler / build-artifact
Matrix: tg-model-perf-tests / tg-model-perf-tests

Annotations

9 errors, 6 warnings, and 4 notices
tg-model-perf-tests / TG CNN model perf tests
The runner has received a shutdown signal. This can happen when the runner service is stopped, or a manually started runner is canceled.
pcie-cards-are-being-used-cleanup
Tenstorrent cards seem to be in use. Killing PIDs and exiting unsuccessfully. This can happen if a test hung and is normally an issue with the test, rather than infra.
tg-model-perf-tests / TG CNN model perf tests
The operation was canceled.
tg-model-perf-tests / TG CNN model perf tests
Process completed with exit code 1.
tg-model-perf-tests / TG LLM model perf tests
The run was canceled by @Aswinmcw.
pcie-cards-are-being-used-cleanup
Tenstorrent cards seem to be in use. Killing PIDs and exiting unsuccessfully. This can happen if a test hung and is normally an issue with the test, rather than infra.
tg-model-perf-tests / TG LLM model perf tests
The operation was canceled.
tg-model-perf-tests / t3k CCL all_gather perf tests
The run was canceled by @Aswinmcw.
tg-model-perf-tests / t3k CCL all_gather perf tests
Process completed with exit code 1.
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
tg-model-perf-tests / TG LLM model perf tests
Runner g14cs01 did not respond to a cancelation request with 00:05:00.
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
tg-model-perf-tests / t3k CCL all_gather perf tests
Runner g14cs03 did not respond to a cancelation request with 00:05:00.
printing-out-smi-info-cleanup
Touching and printing out SMI info
attempting-reset-cleanup
Attempting to reset card(s).
printing-out-smi-info-cleanup
Touching and printing out SMI info
printing-out-smi-info-cleanup
Touching and printing out SMI info

Artifacts

Produced during runtime
Name                                  Size
TTMetal_build_wormhole_b0_profiler    305 MB
perf-report-csv-CNN-wormhole_b0-      348 Bytes