[New Thread 0x7f37c96c5000 (LWP 38397)]
[New Thread 0x7f37a12e0000 (LWP 38398)]
[Detaching after fork from child process 38399]
[Thread 0x7f37c96c5000 (LWP 38397) exited]
Thread 1 "kernel.testapp" received signal SIGURG, Urgent I/O condition.
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 0, warp 0, lane 0]
0x00007f37a2fae0a0 in kernel(unsigned short*, unsigned short*, unsigned short*, char, short, int, long long)<<<(1,1,1),(32,1,1)>>> ()
[DEBUG] gdb_init out.
Cuda api initialized and attached!
got API
Device "NVIDIA A100-SXM4-80GB":
index: 0
type: "GA100GL-A"
SM type: "sm_80"
lanes: 32
predicates 8
registers: 255
SMs: 108
warps: 64
checkpointing kernel with name: "_Z6kernelPtS_S_csix"
stack-size: 336, param-addr: 352, param-size: 40, param-num: 7
SM 0: 1 - 0000000000000000000000000000000000000000000000000000000000000001
03/12/23 10:42:43.888585 DEBUG: relative 6a0, virtual 7f37a2fae0a0 in /cricket-cr.c(670)
SM 0 warp 0 (active): 55555555 - 01010101010101010101010101010101
SM 0 warp 0 (valid): ffffffff - 11111111111111111111111111111111
03/12/23 10:42:43.888637 DEBUG: function "_Z6kernelPtS_S_csix" has no room (0 slots) in /cricket-cr.c(831)
03/12/23 10:42:43.888647 ERROR: There is no room in the top level function (i.e. the kernel). This kernel can thus never be restored! in /cricket-cr.c(835)
cricket-checkpoint: could not make checkpointable.
Thread 1 "kernel.testapp" received signal SIGURG, Urgent I/O condition.
0x00007ffed552baea in clock_gettime ()
[Inferior 1 (process 38378) detached]
This is related to GPU kernel checkpointing, which is currently not working; we are working on repairing it (see #19).
Note that you most likely don't need this: the code in the cpu subdirectory can also do checkpoints.
When I use the test script for checkpoint/restore (c/r), I get the output shown above.