saving/loading models #75

Open
smissan opened this issue Feb 24, 2017 · 8 comments

@smissan

smissan commented Feb 24, 2017

Guys,

Could you point me to an example of how to save and restore model state (the weights, not the structure)?

Thanks

@peakmeans

alexnet.cpp and charRNN.cpp both have save/load code, but I get an error when I use it.
Can you tell me how to do this?
Thanks, all.

@mz24cn
Contributor

mz24cn commented Feb 27, 2017

What is the error in charRNN.cpp?

@peakmeans

I ran this test:
1. I wrote an MLP Symbol.

2. Before saving the parameters, I wrote this:
/////////////////////////////////////////////////////////////
std::map<std::string, NDArray> args_map;
args_map["data"] = NDArray(Shape(batch_size, 1, 10, 10), ctx_dev, false);
args_map["data_label"] = NDArray(Shape(batch_size), ctx_dev, false);
testSymbol.InferArgsMap(ctx_dev, &args_map, args_map);
///////////////// Then I defined a function based on charRNN.cpp //////////////
void SaveParams(const string filepath, Executor* exe)
{
    map<string, NDArray> params;
    // Skip names containing "init" and names ending in "data" or "label".
    for (auto iter : exe->arg_dict())
        if (iter.first.find("init") == string::npos
            && iter.first.rfind("data") != iter.first.length() - 4
            && iter.first.rfind("label") != iter.first.length() - 5)
            params.insert({ "arg:" + iter.first, iter.second });
    for (auto iter : exe->aux_dict())
        params.insert({ "aux:" + iter.first, iter.second });
    NDArray::Save(filepath, params);
}
///////////////// After training, I save the parameters like this //////////////
string save_path_param = "./test.params";
SaveParams(save_path_param, exe);

3. I load the parameters like this:
/////////////////////////////////////////////////////////////////////////////////////////
SymbolHandle testSymbolhandle;
MXSymbolCreateFromFile("test.json", &testSymbolhandle);
Symbol testSymbol(testSymbolhandle);
map<string, NDArray> args_map;
Context ctx_dev(DeviceType::kGPU, 0);
args_map["data"] = NDArray(Shape(1, 1, 10, 10), ctx_dev, false);
args_map["data_label"] = NDArray(Shape(1), ctx_dev, false);
Executor* exe = testSymbol.SimpleBind(ctx_dev, args_map);
LoadToMapFromFile("test.params", exe, ctx_dev);
///////////////// Then I defined a function based on charRNN.cpp //////////////
void LoadToMapFromFile(const string filepath, Executor* exe, Context ctx_dev)
{
    /*map<std::string, NDArray> params = NDArray::LoadToMap(filepath);
    for (auto iter : params)
    {// this version also fails
        string type = iter.first.substr(0, 4);
        string name = iter.first.substr(4);
        NDArray target;
        if (type == "arg:")
            target = exe->arg_dict()[name];
        else if (type == "aux:")
            target = exe->aux_dict()[name];
        else
            continue;
        iter.second.CopyTo(&target);
    }
    */

    map<string, NDArray> parameters;
    NDArray::Load(filepath, 0, &parameters);
    for (const auto &k : parameters)
    {
        if (k.first.substr(0, 4) == "aux:")
        {
            auto name = k.first.substr(4, k.first.size() - 4);
            exe->arg_dict()[name] = k.second.Copy(ctx_dev);
        }
        if (k.first.substr(0, 4) == "arg:")
        {
            auto name = k.first.substr(4, k.first.size() - 4);
            exe->arg_dict()[name] = k.second.Copy(ctx_dev);
        }
    }
    /* WaitAll is needed when copying data between the GPU and main memory */
    NDArray::WaitAll();

}

4. The error looks like this:
[21:55:46]d:\xxxx\dmlc/logging.h:300: [21:55:46] d:\xxx\ndarray.cc:687:check failed:dshape.Size()==(100 vs. 400) Memory size do not match

I am sure the JSON for the network structure has no errors, but the weight parameters do not load correctly, and I do not know why.
Can you tell me how to do this, or where my mistake is? Thank you very much!
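The failed check in the log compares two sizes, so one way to narrow this down is to compare the size of every loaded entry against its target before copying, and to report the names that do not match. The following is only a sketch under stated assumptions: `std::vector<float>` stands in for `NDArray`, and `CopyCheckedParams` is a hypothetical helper, not part of mxnet-cpp. With real arrays the comparison would be on their shapes.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for an array: only the element count matters for this sketch.
using Array = std::vector<float>;
using ArrayMap = std::map<std::string, Array>;

// Hypothetical helper: copies every "arg:"/"aux:" entry of `loaded` into the
// matching target map. Returns the keys that could not be copied because the
// target is missing or has a different element count.
std::vector<std::string> CopyCheckedParams(const ArrayMap& loaded,
                                           ArrayMap& args, ArrayMap& auxs) {
    std::vector<std::string> problems;
    for (const auto& kv : loaded) {
        if (kv.first.size() < 4) continue;  // too short to carry a prefix
        const std::string type = kv.first.substr(0, 4);
        const std::string name = kv.first.substr(4);
        ArrayMap* targets = nullptr;
        if (type == "arg:") targets = &args;
        else if (type == "aux:") targets = &auxs;
        else continue;  // unknown prefix: ignore the entry
        auto it = targets->find(name);
        if (it == targets->end() || it->second.size() != kv.second.size()) {
            problems.push_back(kv.first);  // report instead of copying
            continue;
        }
        it->second = kv.second;
    }
    return problems;
}
```

Printing the returned keys shows which parameter was saved with one size and bound with another, for example when the batch size used at save time differs from the one used when binding for prediction.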

@peakmeans

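@mz24cn Are you Chinese? My English is not very good. It feels a bit odd for two Chinese people to work through a problem in English on home soil, haha.
My first guess about the problem above is that the saving step is at fault.
charRNN's SaveCheckpoint can save the parameters, but I do not understand what the numbers 4 and 5 in it are for. Parameters saved by SaveCheckpoint cannot be used with MXPredCreate() in the predict API. Perhaps SaveCheckpoint is implemented differently from the real parameter-saving code, but I do not know what the difference is.
I also copied the complete save/load pair from charRNN, and it still does not run; I get the error above. A matched save and load should work in principle, but in practice something goes wrong.

Implementing several important network architectures with the MXNet C++ interface is great work!! ResNet defined through prototxt config files in Caffe is really... Your network structures, written as program logic, are concise and clear. And there is an RNN too.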

@peakmeans

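@mz24cn Some suggestions:
1. The header #include "mxnet-cpp/MxNetCpp.h" currently works only when it is included from a single file; if another file includes it as well, an error is raised. I do not know what causes this. Can it be fixed? It probably affects many people, especially those using C++ to run experiments on this platform.
2. Could you provide C++ implementations of some of the reinforcement learning examples that exist in Python? That would save us some time when porting between languages.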

@mz24cn
Contributor

mz24cn commented Feb 28, 2017

charRNN's Load/Save is only a demo for that example. It does not work for other examples, nor with the Python predict API, unless the network model and parameter names are the same as in the C++ code.

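1. charRNN's Load/Save is only a demo for that example; it is not general enough for other examples or for model data from Python.
2. Load/Save obviously only needs to save parameters such as weights and biases. data/label do not need to be saved and must be excluded (including the RNN init state values). By default MXNet uses naming conventions to tell whether an array is a parameter (name ends in weight) or data (for example, name ends in label). That is what the numbers 4 and 5 are for: they are the lengths of the suffixes "data" and "label".
3. charRNN's Load/Save has been tested and works for saving and then loading its own runs. Pay attention to the Note in the help output.
4. I do not quite understand the error you describe when another file includes this header. mxnetcpp.h is guarded with MXNETCPP_H_ to prevent redefinition, so in principle there should be no error.
5. Other C++ examples are waiting for contributions from everyone~~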
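The naming convention in point 2 can be written as a small standalone filter. This is a minimal sketch that assumes the same rules as the charRNN demo (skip names containing "init", skip names ending in "data" or "label"); `EndsWith` and `IsSavedParameter` are hypothetical helper names, not part of mxnet-cpp.

```cpp
#include <string>

// Hypothetical helper: true when `name` ends with `suffix`.
bool EndsWith(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical filter: decides whether an argument should be written to the
// parameter file, based only on its name.
bool IsSavedParameter(const std::string& name) {
    if (name.find("init") != std::string::npos) return false;  // RNN init states
    if (EndsWith(name, "data")) return false;   // 4 == length of "data"
    if (EndsWith(name, "label")) return false;  // 5 == length of "label"
    return true;
}
```

Because the "init" test is a plain substring match, a weight whose name happens to contain "init" would also be skipped, which is one reason this convention only suits networks whose names follow it.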

@smissan
Author

smissan commented Feb 28, 2017

If you can point us to the MXNet C API that achieves the same result, that would be helpful too.

@peakmeans

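@smissan In src/c_api/c_predict_api.cc, line 115, I found the code that loads parameters; it can load parameters that were saved from Python after training. But I cannot find a saving step implemented in C/C++.
@mz24cn Thanks. Points 1-3: understood. For loading general params files I will use the existing implementation (the one in c_predict_api.cc). For the matching save, I will try again based on what you said; it would be best if you could provide an example save implementation.
4. My setup is a single main.cpp entry point with several .h and .cpp files, and several of the .h files include mxnetcpp.h. The error reported is a duplicate definition. This is not very urgent; what is urgent is loading and saving the parameters. When I have time I will check whether including the mxnetcpp.h header really is the problem.
5. If you find it too much trouble, I plan to port the reinforcement learning Python code you provide to C/C++. My schedule is uncertain and this is purely a hobby. If I get it working, I may upload it to GitHub as a backup.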
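On point 4, the exact error message is not quoted in the thread, so the following is only one common cause and may not apply here: if the duplicate definition is reported by the linker, an include guard does not help, because the guard only prevents double inclusion within a single .cpp file. A function defined in a header then needs `inline`. The header name `my_utils.h` and the function `MakeParamKey` below are hypothetical.

```cpp
// my_utils.h (hypothetical header shared by several .cpp files)
#ifndef MY_UTILS_H_
#define MY_UTILS_H_

#include <string>

// Without `inline`, every .cpp file that includes this header would emit its
// own definition of MakeParamKey, and the linker would report a duplicate.
// With `inline`, the identical definitions are merged into one.
inline std::string MakeParamKey(const std::string& prefix,
                                const std::string& name) {
    return prefix + ":" + name;
}

#endif  // MY_UTILS_H_
```

Whether this matches the mxnetcpp.h situation can be checked by looking at whether the error appears while compiling one file or only when linking several object files together.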
