Training error with deep auto encoder #158

sinaakhtar · 2018-02-21T08:30:19Z

Hi, I am training a deep auto encoder with DSSTNE and am having issues with the training. The first epoch reports a nonzero average training error, but the second epoch reports exactly 0:

NNNetwork::Train: Epoch 1, average error 42600148992.000000, average training error 16.304724, average regularization error 42600148992.000000, elapsed time 461.341889s
NNNetwork::Train: Epoch 2, average error 42742050816.000000, average training error 0.000000, average regularization error 42742050816.000000, elapsed time 462.598641s

Any clues as to why this might be happening? One of the configs I tried is:

{
"Version" : 0.8,
"Name" : "AE",
"Kind" : "FeedForward",

"SparsenessPenalty" : {
    "p" : 0.5,
    "beta" : 2.0
},

"ShuffleIndices" : true,

"Denoising" : {
    "p" : 0.2
},

"ScaledMarginalCrossEntropy" : {
    "oneTarget" : 1.0,
    "zeroTarget" : 0.0,
    "oneScale" : 1.0,
    "zeroScale" : 1.0
},
"Layers" : [
    { "Name" : "Input", "Kind" : "Input", "N" : "auto", "DataSet" : "gl_input", "Sparse" : true },
    { "Name" : "Hidden1", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 512, "Activation" : "ScaledExponentialLinear", "Sparse" : false },
    { "Name" : "Hidden2", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 512, "Activation" : "ScaledExponentialLinear", "Sparse" : false },
    { "Name" : "Hidden3", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 1024, "Activation" : "ScaledExponentialLinear", "Sparse" : false, "pDropout" : 0.8 },
    { "Name" : "Hidden4", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 512, "Activation" : "ScaledExponentialLinear", "Sparse" : false },
    { "Name" : "Hidden5", "Kind" : "Hidden", "Type" : "FullyConnected", "N" : 512, "Activation" : "ScaledExponentialLinear", "Sparse" : false },
    { "Name" : "Output", "Kind" : "Output", "Type" : "FullyConnected", "DataSet" : "gl_output", "N" : "auto", "Activation" : "Sigmoid", "Sparse" : true }
],

"ErrorFunction" : "ScaledMarginalCrossEntropy"

}

ekandrotA9 · 2018-05-11T22:27:12Z

Actually, I have seen something like this recently while adding Batch Norm to DSSTNE. I had the sign of the learning rate backwards (positive instead of negative) when using the results from cuDNN. Flipping the sign of the learning rate then caused it to converge. Until I did that, I got results just like yours.

So I'm guessing this is similar: the values grow without bound, cap out, and then cannot change any more, so the average training error drops to zero (i.e., because they cannot change). How are you invoking DSSTNE? Do you have your own main.cpp, or are you using the encoder built from main.cpp in utils?
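To make the sign issue concrete, here is a minimal Python sketch of the general effect. It is not DSSTNE code: the quadratic loss, the `descend` helper, and its parameters are all illustrative assumptions. It only shows that the same update rule converges with one sign and grows without bound with the other.

```python
def descend(lr_sign, steps=50, lr=0.1, w0=1.0):
    """Gradient descent on the toy loss f(w) = w**2 (hypothetical example).

    lr_sign = -1.0 steps against the gradient (correct);
    lr_sign = +1.0 steps along the gradient (the sign bug).
    Returns the loss after each step.
    """
    w = w0
    losses = []
    for _ in range(steps):
        grad = 2.0 * w                 # df/dw for f(w) = w**2
        w = w + lr_sign * lr * grad    # update with the chosen sign
        losses.append(w * w)
    return losses


correct = descend(lr_sign=-1.0)   # loss shrinks every step
flipped = descend(lr_sign=+1.0)   # loss grows every step

print("correct sign, final loss:", correct[-1])
print("flipped sign, final loss:", flipped[-1])
```

With the flipped sign the loss keeps growing; in a real network with bounded activations or limited floating-point range, the values would eventually saturate, which is the "caps out" stage guessed at above.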

scottlegrand · 2018-06-12T18:58:27Z

I would turn on verbose mode and watch per-minibatch training error to see when it blows up.
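As a rough aid for that kind of inspection, the Python sketch below scans training output for the first point where the training error becomes exactly zero or non-finite. The per-minibatch verbose format is not shown in this thread, so the sketch assumes the epoch-level line format quoted in the original report; `parse_epochs` and `first_suspect_epoch` are hypothetical helper names, and the regular expression would need adjusting for other output.

```python
import math
import re

# Matches the epoch summary lines quoted in the original report.
LINE_RE = re.compile(
    r"Epoch (?P<epoch>\d+), average error (?P<total>[^,\s]+), "
    r"average training error (?P<train>[^,\s]+), "
    r"average regularization error (?P<reg>[^,\s]+),"
)


def parse_epochs(log_text):
    """Return one dict per epoch summary line found in log_text."""
    return [
        {
            "epoch": int(m.group("epoch")),
            "total": float(m.group("total")),
            "train": float(m.group("train")),
            "reg": float(m.group("reg")),
        }
        for m in LINE_RE.finditer(log_text)
    ]


def first_suspect_epoch(rows):
    """Return the first epoch whose training error is 0 or non-finite, else None."""
    for row in rows:
        if row["train"] == 0.0 or not math.isfinite(row["train"]):
            return row["epoch"]
    return None


# Sample input: the two lines from the original report.
LOG = """\
NNNetwork::Train: Epoch 1, average error 42600148992.000000, average training error 16.304724, average regularization error 42600148992.000000, elapsed time 461.341889s
NNNetwork::Train: Epoch 2, average error 42742050816.000000, average training error 0.000000, average regularization error 42742050816.000000, elapsed time 462.598641s
"""

rows = parse_epochs(LOG)
print("first suspect epoch:", first_suspect_epoch(rows))
```

In the two quoted lines, the total error and the regularization error print identically, so the regularization term may also be worth watching alongside the training error.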

spacelover1 · 2020-03-24T09:54:03Z

Hello @scottlegrand,

I would like to know whether I can use other error functions, such as RMSE, in the config.json file. I couldn't find anything about the error function here.
Also, could you please tell me where I can turn on verbose mode?
