Ubuntu 22.04
on main branch with this commit: 3aaf87b
installed via docker
I was trying out the advanced_worm.py example and it was all working fine. At some point I stopped the script halfway through. When I tried to run it again, I got this error:
Traceback (most recent call last):
File "/workspaces/genesis/examples/tutorials/advanced_worm.py", line 9, in <module>
scene = gs.Scene(
^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/genesis/utils/misc.py", line 27, in new_init
original_init(self, *args, **kwargs)
File "/opt/conda/lib/python3.11/site-packages/genesis/engine/scene.py", line 133, in __init__
self._sim = Simulator(
^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/genesis/engine/simulator.py", line 94, in __init__
self.rigid_solver = RigidSolver(self.scene, self, self.rigid_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/genesis/engine/solvers/rigid/rigid_solver_decomp.py", line 24, in __init__
super().__init__(scene, sim, options)
File "/opt/conda/lib/python3.11/site-packages/genesis/engine/solvers/base_solver.py", line 18, in __init__
self._gravity.from_numpy(np.array(options.gravity, dtype=gs.np_float))
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/util.py", line 351, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/matrix.py", line 1353, in from_numpy
self._from_external_arr(arr)
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/util.py", line 351, in wrapped
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/matrix.py", line 1337, in _from_external_arr
ext_arr_to_matrix(arr, self, as_vector)
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/kernel_impl.py", line 1113, in wrapped
return primal(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/kernel_impl.py", line 1043, in __call__
key = self.ensure_compiled(*args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/kernel_impl.py", line 1011, in ensure_compiled
self.materialize(key=key, args=args, arg_features=arg_features)
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/kernel_impl.py", line 637, in materialize
self.runtime.materialize()
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/impl.py", line 471, in materialize
self.materialize_root_fb(not self.materialized)
File "/opt/conda/lib/python3.11/site-packages/taichi/lang/impl.py", line 406, in materialize_root_fb
root.finalize(raise_warning=not is_first_call)
File "/opt/conda/lib/python3.11/site-packages/taichi/_snode/fields_builder.py", line 170, in finalize
return self._finalize(raise_warning, compile_only=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/taichi/_snode/fields_builder.py", line 182, in _finalize
return SNodeTree(_ti_core.finalize_snode_tree(_snode_registry, self.ptr, impl.get_runtime().prog, compile_only))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: [cuda_driver.h:operator()@92] CUDA Error CUDA_ERROR_NOT_SUPPORTED: operation not supported while calling malloc_async_impl (cuMemAllocAsync)
[Genesis] [21:28:17] [INFO] 💤 Exiting Genesis and caching compiled kernels...
I figured I had probably run out of memory, or that some GPU resource was busy, so I restarted the machine and was able to run the example again. I then tried running the differentiable_push.py example and got the same error. I tried restarting again, but this time that didn't fix it.
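One way to check the "some GPU resource was busy" guess is to ask `nvidia-smi` which compute processes still hold GPU memory. The following is a minimal sketch, not something I ran for this report: the helper names `parse_compute_apps` and `list_gpu_processes` are made up for illustration, and it assumes `nvidia-smi` is on the PATH. Inside a docker container the listing may be empty or incomplete, because processes from other PID namespaces are not always visible.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows into (pid, MiB or None) tuples.

    Rows that do not start with a numeric pid are skipped. Memory values
    that are not plain integers (e.g. '[N/A]') are reported as None.
    """
    apps = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, mem = parts
        apps.append((int(pid), int(mem) if mem.isdigit() else None))
    return apps


def list_gpu_processes():
    """Hypothetical helper: query nvidia-smi for running compute processes."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

Calling `list_gpu_processes()` right after the error appears would show whether a killed run is still listed as holding memory. An empty list would suggest the problem lies elsewhere.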
I've tried deleting the docker image, restarting, etc., but that didn't seem to help. After a long time I was able to get the examples running again, and found that if I got the error and waited a couple of minutes, it would go away. At the moment I am getting the error again and it does not seem to be going away anymore.
After waiting a bit more, the examples work again. It seems like some async process takes a while to release resources, even if I force-kill the Genesis code. I tried restarting the code every 3 minutes or so, and after about 10 minutes the example stopped throwing the above error and ran.
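The "rerun every few minutes until it works" approach described above can be sketched as a small retry loop. This is only an illustration under stated assumptions: `retry_on_runtime_error` and `run_script` are hypothetical names, the 180-second delay mirrors the roughly 3-minute interval I used by hand, and each attempt launches the script in a fresh process because that is what I actually did (I have not checked whether retrying inside one Python process would behave the same).

```python
import subprocess
import sys
import time


def retry_on_runtime_error(fn, attempts=5, delay_s=180.0, sleep=time.sleep):
    """Call fn() until it succeeds or the attempts are used up.

    Only RuntimeError is retried. The last error is re-raised if every
    attempt fails. `sleep` is injectable so the loop can be tested
    without actually waiting.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except RuntimeError as err:
            last_error = err
            print(f"attempt {attempt}/{attempts} failed: {err}")
            if attempt < attempts:
                sleep(delay_s)
    raise last_error


def run_script(path):
    """Hypothetical helper: run a script in a new interpreter process."""
    result = subprocess.run([sys.executable, path])
    if result.returncode != 0:
        raise RuntimeError(f"{path} exited with code {result.returncode}")
    return result.returncode
```

For example, `retry_on_runtime_error(lambda: run_script("examples/tutorials/advanced_worm.py"))` would rerun the example up to five times, three minutes apart. This only works around the symptom and does not explain why the resources take so long to be released.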
This seems related:
taichi-dev/taichi#8395
My current theory is that I stopped execution in the middle of compiling the kernels, and now some cache is dirty.
Do you know what this error is about?
Thanks.