(TG) TG model perf tests #475
tg-model-perf-tests.yaml
on: schedule
build-artifact-profiler / ... / build-docker-image: 1m 10s
Matrix: build-artifact-profiler / build-artifact
Matrix: tg-model-perf-tests / tg-model-perf-tests
Annotations
5 errors, 9 warnings, and 11 notices
tg-model-perf-tests / TG CNN model perf tests
Process completed with exit code 1.
|
pcie-cards-are-being-used-cleanup
Tenstorrent cards seem to be in use. Killing PIDs and exiting unsuccessfully. This can happen if a test hung and is normally an issue with the test, rather than infra.
|
tg-model-perf-tests / TG LLM model perf tests
Process completed with exit code 1.
|
tg-model-perf-tests / TG LLM model perf tests
Process completed with exit code 1.
|
tg-model-perf-tests / TG LLM model perf tests
The action 'Run model perf regression tests' has timed out after 60 minutes.
|
tg-model-perf-tests / t3k CCL all_gather perf tests
Failed to download action 'https://api.github.com/repos/tenstorrent/tt-metal/tarball/0390e0cf2c4b29d59fcc5bd3c61ca06caa69de7e'. Error: Resource temporarily unavailable (api.github.com:443)
|
tg-model-perf-tests / t3k CCL all_gather perf tests
Back off 27.062 seconds before retry.
|
tg-model-perf-tests / t3k CCL all_gather perf tests
Failed to download action 'https://api.github.com/repos/getsentry/action-setup-venv/tarball/a133e6fd5fa6abd3f590a1c106abda344f5df69f'. Error: Resource temporarily unavailable (api.github.com:443)
|
tg-model-perf-tests / t3k CCL all_gather perf tests
Back off 26.216 seconds before retry.
|
tg-model-perf-tests / TG LLM model perf tests
Failed to restore: getCacheEntry failed: Request timeout: /kE5pH1GYM3Yhxzhzfofu0B5IIUz8dzneBRSYuWcoacd9fWqJEP/_apis/artifactcache/cache?keys=setup-venv-Linux-py-3.8.18-%2Fhome%2Fubuntu%2Factions-runner%2F_work%2F_tool%2FPython%2F3.8.18%2Fx64%2Fbin%2Fpython-6e53e915dc6cae7bc216bca21416e65c2c37d74d62bc7e916a52ccd90b584ee7-.%2Fcreate_venv.sh&version=0f2a4d78a25b8dc6a98c7870cee2871c84b54ade7e9a0c38e3b80906041e7a71
|
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
|
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
|
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
|
unsuccessful-reset-attempt-cleanup
Unsuccessful board reset, trying again in 1 minute ...
|
printing-out-smi-info-cleanup
Touching and printing out SMI info
|
printing-out-smi-info-cleanup
Touching and printing out SMI info
|
successful-reset-cleanup
tt-smi reset was successful
|
reset-successful-cleanup
tt-smi reset was successful
|
printing-out-smi-info-cleanup
Touching and printing out SMI info
|
successful-reset-cleanup
tt-smi reset was successful
|
reset-successful-cleanup
tt-smi reset was successful
|
printing-out-smi-info-cleanup
Touching and printing out SMI info
|
attempting-reset-cleanup
Attempting to reset card(s).
|
successful-reset-cleanup
tt-smi reset was successful
|
reset-successful-cleanup
tt-smi reset was successful
|
Artifacts
Produced during runtime
Name | Size
---|---
TTMetal_build_wormhole_b0_profiler | 306 MB
perf-report-csv--wormhole_b0--bare-metal | 1.51 KB
perf-report-csv-CNN-wormhole_b0- | 521 Bytes