
question about parameter 'biascoef' #20

Open
kklots opened this issue Jun 5, 2015 · 3 comments
@kklots

kklots commented Jun 5, 2015

Hi Sergey,

A new problem has come up in my work.
I found that the parameter 'biascoef' may not work as described in the README file.

This is my params and structure:

params.batchsize = 128;
params.epochs = 1;
params.alpha = 0.1;
params.momentum = 0.9;
params.lossfun = 'logreg';
params.shuffle = 1;
params.seed = 0;
dropout = 0.5;

layers = {
struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3)) %32
struct('type', 'c', 'filtersize', [3 3], 'outputmaps', 32, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0) %30
struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout) %15
struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0) %14
struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout) %7
struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 64, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0) %6
struct('type', 's', 'scale', [2 2], 'function', 'max', 'stride', [2 2], 'dropout', dropout) %3
struct('type', 'c', 'filtersize', [2 2], 'outputmaps', 128, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0) %2
struct('type', 'f', 'length', 256, 'function', 'sigm', 'dropout', dropout, 'biascoef', 0)
struct('type', 'f', 'length', kOutputs, 'function', 'soft', 'biascoef', 0)
};

I have set the 'biascoef' of every convolutional and fully connected layer to 0, but the loss and test prediction accuracy still change after each training epoch.

As far as I understood, the network weights should not change when all 'biascoef' params are set to 0, but that seems inconsistent with what I observe here.

Yours,
Xuan Li.

@sdemyanov
Owner

Hi, Xuan Li,

'biascoef' is just a coefficient that scales the learning rate for the biases. If you
set it to 0, the biases simply remain 0 the whole time; the other weights still train. That's all it does.

Regards,
Sergey.
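To make the behavior Sergey describes concrete: here is a minimal NumPy sketch (not the toolbox's actual code; `sgd_step` and the dummy gradients are illustrative assumptions) showing why setting the bias coefficient to 0 does not stop training. The weight step uses `alpha`, while the bias step uses `alpha * biascoef`, so a zero coefficient only freezes the biases at their initial value of 0.

```python
import numpy as np

def sgd_step(W, b, dW, db, alpha, biascoef):
    """One SGD step where the effective bias learning rate is alpha * biascoef."""
    W_new = W - alpha * dW
    b_new = b - alpha * biascoef * db
    return W_new, b_new

# With biascoef = 0 the biases stay at 0, but the weights keep
# moving -- so the loss and accuracy still change every epoch.
W, b = np.ones((2, 2)), np.zeros(2)
dW, db = np.full((2, 2), 0.5), np.full(2, 0.5)
W2, b2 = sgd_step(W, b, dW, db, alpha=0.1, biascoef=0.0)
```

After this step `b2` is still all zeros while `W2` has moved away from `W`, matching the behavior reported in the issue.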

PhD candidate,
Computing and Information Systems,
The University of Melbourne.

Sergey Demyanov http://www.demyanov.net/

@kklots
Author

kklots commented Jun 5, 2015

Thank you for your explanation.
Is there any way to set a separate learning rate for each layer?

@sdemyanov
Owner

No, but it should be very easy to do. Just take a look at how 'biascoef' works,
and introduce a similar per-layer coefficient for the other weights.
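The extension Sergey suggests can be sketched as follows. This is not the toolbox's API; the per-layer `lrcoef` field is a hypothetical parameter, analogous to 'biascoef', that multiplies the global `alpha` for that layer's weights.

```python
import numpy as np

# Hypothetical per-layer 'lrcoef' field: the effective learning
# rate for a layer's weights becomes alpha * lrcoef.
layers = [
    {"W": np.ones((4, 4)), "lrcoef": 1.0},   # trained at the full rate
    {"W": np.ones((4, 4)), "lrcoef": 0.1},   # trained 10x slower
    {"W": np.ones((4, 4)), "lrcoef": 0.0},   # frozen, like biascoef = 0
]

alpha = 0.1
for layer in layers:
    dW = np.full_like(layer["W"], 0.5)       # dummy gradient for illustration
    layer["W"] -= alpha * layer["lrcoef"] * dW
```

Setting `lrcoef` to 0 freezes a layer entirely, while values between 0 and 1 slow it down relative to the rest of the network.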

