Loading Python Exported Model into TorchSharp #585
Replies: 3 comments
-
Hi @jimquittenton, Passing 'strict=false' is not a solution here. Even if it didn't throw an exception (which looks like a bug to me), you would just end up with a partially loaded model, or nothing loaded at all. A particular network architecture can be constructed in many ways, and the names given to the various layers depend on the details of how it is constructed. You will have to match the Python construction logic exactly in the TorchSharp code, or the weights won't have the same names, as you have discovered.
The example code is tricky, because it is used to construct a number of different ResNet architectures and builds them with loops and such. Another complication is that it is written to serve as an example and uses the "toy" CIFAR data set. The input image size and the number of classes may not match what your ResNet18 model was trained on (if I'm stating the obvious, please excuse me).
The exception could be because the C# tensors have different dimensions than the serialized ones; I would have to take a look at that. If that is indeed the case, we may need to change the serialization format to catch issues like that.
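To see exactly which names disagree, it can help to dump the key names from both sides and diff them. The following is only a minimal sketch in plain Python: the helper `diff_state_names` is hypothetical, and apart from 'conv1.weight' (taken from the error message) the names in both lists are made up for illustration. In practice you would paste in the keys of the Python `state_dict()` and the names reported by the TorchSharp model.

```python
def diff_state_names(saved_names, target_names):
    """Return (missing_in_target, unused_in_target) as sorted lists."""
    saved, target = set(saved_names), set(target_names)
    return sorted(saved - target), sorted(target - saved)


# Names as saved from Python. Only 'conv1.weight' comes from the thread;
# the others are illustrative placeholders.
python_names = ["conv1.weight", "bn1.weight", "bn1.bias", "fc.weight", "fc.bias"]

# Hypothetical names as reported by the C# model's state_dict().
csharp_names = [
    "layers.conv1.weight",
    "layers.bn1.weight",
    "layers.bn1.bias",
    "fc.weight",
    "fc.bias",
]

missing, unused = diff_state_names(python_names, csharp_names)
for name in missing:
    print("saved, but no such name in target:", name)
for name in unused:
    print("in target, but never saved:", name)
```

Every name in the first group is one a strict load would reject; every name in the second group is a parameter that would be left at its initial value.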
-
BTW, I released 0.96.6 on Saturday morning. It has the correct ResNetNN (and a few other) architectures.
-
Thank you Niklas, I'll try this out.
On Mon, May 16, 2022 at 9:24 PM Niklas Gustafsson wrote:
BTW, I released 0.96.6 on Saturday morning. It has the correct ResNetNN (and a few others) architectures.
-
Hi,
I'm new to TorchSharp and am having trouble loading a Python-trained ResNet18 model. I've been following this article: https://github.com/dotnet/TorchSharp/blob/main/docfx/articles/saveload.md and have exported my Python model using the 'save_state_dict' function in this script: https://github.com/dotnet/TorchSharp/blob/main/src/Python/exportsd.py .
In TorchSharp I have copied the ResNet model from https://github.com/dotnet/TorchSharpExamples/blob/main/src/CSharp/Models/ResNet.cs and then call the following:
The load() line throws an exception with message "Mismatched module state names: the target modules does not have a submodule or buffer named 'conv1.weight'".
If I examine the state_dict from 'myModel' prior to load(), it contains entries like:
whereas the corresponding entries prior to saving from Python are:
I tried amending the ResNet.cs code to reflect the Python names, but could not get them to match exactly.
I also tried calling load() with strict=false: myModel.load(mPath, false);. This seemed to get past the mismatched names exception, but throws another exception with message "Too many bytes in what should have been a 7 bit encoded Int32."
I've been struggling with this for a couple of days now so would really appreciate any help you guys could offer.
Thanks
Jim
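A note on the second exception: its wording is the one .NET's BinaryReader.Read7BitEncodedInt produces when it sees more than five bytes with the continuation bit set. One possible reading (an assumption, not confirmed in this thread) is that the reader ended up at the wrong position in the file and tried to interpret raw tensor bytes as a length prefix. The sketch below illustrates 7-bit integer encoding in plain Python; `encode_7bit` and `decode_7bit` are hypothetical helpers written for this illustration, not TorchSharp or exportsd.py code.

```python
def encode_7bit(value):
    """Encode a non-negative int using 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_7bit(buf, offset=0, max_bytes=5):
    """Decode one value starting at offset; return (value, next_offset)."""
    result = 0
    for i in range(max_bytes):
        byte = buf[offset + i]
        result |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return result, offset + i + 1
    # A 32-bit value never needs more than 5 bytes in this encoding.
    raise ValueError("too many bytes in what should have been a 7-bit encoded int32")


# A well-formed length prefix round-trips.
prefix = encode_7bit(300)
print(prefix.hex(), decode_7bit(prefix))

# Stand-in for raw tensor data read from a misaligned position:
# every byte has its high bit set, so the decoder never finds an end.
raw_payload = b"\xff" * 8
try:
    decode_7bit(raw_payload)
except ValueError as err:
    print("decode failed:", err)
```

If this reading is right, the fix is still the one suggested in the replies: make the names and shapes on the C# side match what was saved, rather than relaxing the strict check.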