I'm trying to enable watcher on all non-perf pipelines so that device-side issues reported by watcher can be caught sooner.
On my branch where I enable watcher, I see this error when running the post-commit action:
https://github.com/tenstorrent-metal/tt-metal/actions/runs/8429687483/job/23084407648
2024-03-26T02:04:38.0322916Z tests/tt_eager/python_api_testing/unit_testing/misc/test_optimized_conv_v2.py::test_optimized_conv_v2[pack_l1-LoFi-activations_BFLOAT16-weights_BFLOAT8_B-8-128-128-28-28-3-3-1-1-1-1-True-True-False-False] Metal | INFO | Initializing device 0
2024-03-26T02:04:38.0684146Z Metal | INFO | AI CLK for device 0 is: 800 MHz
2024-03-26T02:04:38.0746331Z LLRuntime | INFO | Watcher log file: /home/ubuntu/actions-runner/_work/tt-metal/tt-metal/generated/watcher/watcher.log
2024-03-26T02:04:38.0749693Z LLRuntime | INFO | Watcher attached device 0
2024-03-26T02:04:38.0751749Z LLRuntime | INFO | Watcher thread watching...
2024-03-26T02:04:38.1134412Z 2024-03-26 02:04:38.113 | INFO | tests.tt_eager.python_api_testing.unit_testing.misc.test_optimized_conv_v2:test_optimized_conv_v2:160 - Conv output shape - [8, 28, 28, 128]
2024-03-26T02:05:38.0751917Z LLRuntime | INFO | Watcher checking device 0
2024-03-26T02:05:38.0954477Z terminate called after throwing an instance of 'std::runtime_error'
2024-03-26T02:05:38.0956670Z what(): Read 0xffffffff from ARC scratch[6]: auto-reset succeeded.
2024-03-26T02:05:38.0957520Z Fatal Python error: Aborted
2024-03-26T02:05:38.0965755Z
2024-03-26T02:05:38.0993090Z Thread 0x00007f2d1214e740 (most recent call first):
2024-03-26T02:05:38.0995270Z File "/home/ubuntu/actions-runner/_work/tt-metal/tt-metal/tt_eager/tt_dnn/op_library/sliding_window_op_infra/tt_py_composite_conv.py", line 1104 in copy_output_from_device
2024-03-26T02:05:38.0997252Z File "/home/ubuntu/actions-runner/_work/tt-metal/tt-metal/tests/tt_eager/python_api_testing/unit_testing/misc/test_optimized_conv_v2.py", line 233 in test_optimized_conv_v2
2024-03-26T02:05:38.0998926Z File "/home/ubuntu/python_env/lib/python3.8/site-packages/_pytest/python.py", line 195 in pytest_pyfunc_call
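For anyone triaging similar runs, here is a rough sketch of how the three useful facts in a log like the one above (the failing test ID, the `what():` message, and the innermost Python frame) could be pulled out of a raw Actions log. The function names `strip_ansi` and `summarize_abort` are hypothetical, not part of tt-metal, and the regular expressions are assumptions based only on the excerpt in this issue, so they may need adjusting for other logs.

```python
import re

# ANSI color sequences such as "\x1b[38;2;000;128;000m". The escape byte is
# sometimes replaced by U+FFFD when a log is copied, so both are accepted.
ANSI_RE = re.compile(r"[\x1b\ufffd]\[[0-9;]*m")

# Assumed shapes, taken from the log excerpt in this issue.
TEST_ID_RE = re.compile(r"(tests/\S+::\S+)")
WHAT_RE = re.compile(r"what\(\):\s*(.+?)(?=\s*\d{4}-\d{2}-\d{2}T|\n|$)")
FRAME_RE = re.compile(r'File "([^"]+)", line (\d+) in (\w+)')


def strip_ansi(text):
    """Remove terminal color codes so the log is readable as plain text."""
    return ANSI_RE.sub("", text)


def summarize_abort(log_text):
    """Return the first test ID, exception message, and Python frame found.

    Each value is None when the corresponding pattern is not present.
    """
    clean = strip_ansi(log_text)
    test = TEST_ID_RE.search(clean)
    what = WHAT_RE.search(clean)
    frame = FRAME_RE.search(clean)
    return {
        "test": test.group(1) if test else None,
        "error": what.group(1) if what else None,
        # (file path, line number, function name) of the innermost frame
        "frame": (frame.group(1), int(frame.group(2)), frame.group(3))
        if frame
        else None,
    }
```

Calling `summarize_abort` on the text of a downloaded job log returns a small dictionary that can be pasted into a comment, which makes it easier to compare failed runs.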
@jliangTT I need some help figuring out who should own this bug, as I don't see a clear "owner" for this file.
Another failed run: https://github.com/tenstorrent-metal/tt-metal/actions/runs/8461487472/job/23181436986
tests/tt_eager/python_api_testing/unit_testing/misc/test_optimized_conv_v2.py
@tt-nshanker, is this test case related to the 2.0 development?
@TT-billteng Can we close this issue?