Loading snapshot RuntimeError: Can't set params because CPU buffer has the wrong size. #502
iordachelivia asked this question in Q&A · Unanswered · Replies: 1 comment, 2 replies
Reply:
Hi there, if you use […] It would of course be great if the snapshot functionality would automate away this conversion (permitting FullyFusedMLP to be used with snapshots from CutlassMLP), but alas that's not supported yet.
I am trying to load, on a local PC, a snapshot that was trained on a remote PC (the two machines have completely different specs: different CPUs, different GPUs, everything).
I train the network on the remote PC with
python scripts/run.py --mode nerf --scene data/nerf/fox --save_snapshot saved/fox_10k.msgpack --train --n_steps 10000
And load the snapshot on the local PC
python3 scripts/run.py --mode nerf --load_snapshot saved/fox.msgpack --gui
I receive the following error:
RuntimeError: Can't set params because CPU buffer has the wrong size.
The error comes from the set_params method in dependencies/tiny-cuda-nn/include/tiny-cuda-nn/trainer.h (called by deserialize).
I receive the same error whether I train on PC A and load on PC B or whether I train on PC B and load on PC A.
I found the dump_parameters_as_images method in src/testbed.cu and printed the layer size values (first and second) from it for snapshots trained on each PC with the train command above. There is a difference at layer 4, but I don't understand where exactly it comes from. Is there some dynamic parameter in the training that would make the same training command produce different layer sizes on different PCs?
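One plausible source of such a mismatch (an assumption, not confirmed against tiny-cuda-nn's actual implementation) is that different MLP backends pad layer widths differently before serializing parameters, so the same nominal architecture can yield different total parameter counts. A purely illustrative sketch, with hypothetical padding rules:

```python
# Hypothetical illustration (NOT tiny-cuda-nn's actual code): two backends
# that round the output layer width up to different multiples end up with
# different total weight counts for the "same" network.

def n_params(widths, pad_output_to):
    """Total weight count for an MLP with layer widths `widths`, where the
    final output width is rounded up to a multiple of `pad_output_to`."""
    padded = widths[:-1] + [-(-widths[-1] // pad_output_to) * pad_output_to]
    return sum(a * b for a, b in zip(padded, padded[1:]))

widths = [32, 64, 64, 3]      # input, two hidden layers, RGB output
print(n_params(widths, 16))   # output padded to 16 -> 7168
print(n_params(widths, 1))    # no output padding   -> 6336
```

If the two backends serialize different padded sizes, set_params would see a CPU buffer whose size does not match what the deserializing backend expects.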
I found a similar issue here: NVlabs/tiny-cuda-nn#6. I do receive a warning on the local PC:
Warning: FullyFusedMLP is not supported for the selected architecture 61. Falling back to CutlassMLP. For maximum performance, raise the target GPU architecture to 75+.
But I'm not sure how to fix this. Is there some modification I can make so that a snapshot trained on a different PC with different specs can be loaded locally?
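Since the local GPU (compute capability 61) always falls back to CutlassMLP, one workaround consistent with the reply above (a sketch, not verified for this exact issue) is to force CutlassMLP on the training machine as well, so both sides serialize the same parameter layout. In instant-ngp the network architecture comes from a JSON config (e.g. under configs/nerf/), where the MLP backend is selected by the "otype" field; a config fragment might look like:

```json
{
  "network": {
    "otype": "CutlassMLP",
    "activation": "ReLU",
    "output_activation": "None",
    "n_neurons": 64,
    "n_hidden_layers": 2
  }
}
```

The exact field values above are assumptions based on the repository's default NeRF config; the key change is replacing every "otype": "FullyFusedMLP" with "otype": "CutlassMLP" in a copy of the config, then pointing training at that copy (scripts/run.py accepts a --network path for the network config) before saving the snapshot.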