
All weights set to NaN #23

Open
ghost opened this issue Jul 14, 2015 · 6 comments


ghost commented Jul 14, 2015

Hello,

First of all, thanks a lot for sharing your work, it is really interesting.

About my problem: I'm trying to train a CNN using RGB images as input. My training set has a size of kXSize = 33x33x3x55000. I use a typical I,C,S,C,S,C,S,F,F architecture with no modifications to any hyperparameter, and I define the input layer as:
struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3))
The input matrices are normalized and correctly defined.
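For reference, the full I,C,S,C,S,C,S,F,F stack might look something like the sketch below. Only 'type', 'mapsize', and 'outputmaps' are taken from the definition above; the remaining field names, kernel sizes, and map counts are hypothetical (as is the use of a cell array), chosen only so the 33x33 spatial dimensions divide evenly through the stack:

```matlab
layers = {
    struct('type', 'i', 'mapsize', kXSize(1:2), 'outputmaps', kXSize(3))  % 33x33x3 input
    struct('type', 'c', 'kernelsize', [4 4], 'outputmaps', 16)  % 33 -> 30
    struct('type', 's', 'scale', [2 2])                         % 30 -> 15
    struct('type', 'c', 'kernelsize', [4 4], 'outputmaps', 32)  % 15 -> 12
    struct('type', 's', 'scale', [2 2])                         % 12 -> 6
    struct('type', 'c', 'kernelsize', [3 3], 'outputmaps', 64)  % 6 -> 4
    struct('type', 's', 'scale', [2 2])                         % 4 -> 2
    struct('type', 'f', 'length', 64)   % fully connected (hypothetical size)
    struct('type', 'f', 'length', 1)    % single output, as described
};
```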

The problem is that after the first epoch of training, I keep getting a vector of NaN weights. Would you have any idea why the weights are not computed properly? It seems I'm not specifying the input correctly, but I can't find where, as your program runs 'normally' (one epoch of training takes about 1500 s).

Thanks for your attention,

Quentin

@sdemyanov
Owner

Hi Quentin,

I suppose the reason might be a too-large learning rate, or the weight
initialization. Try decreasing the rate and increasing/decreasing the weight
variance (the 'initstd' parameter). It should help.

Regards,
Sergey.
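A sketch of the kind of adjustment meant here: 'initstd' is the parameter named above, and 'alpha'/'momentum' are training parameters mentioned elsewhere in this thread, but the struct layout and the values themselves are illustrative, not taken from the toolbox's documentation:

```matlab
% illustrative values only: shrink them until the loss stops blowing up to NaN
params.alpha = 0.01;       % learning rate, reduced from whatever diverged
params.momentum = 0.9;     % large momentum can also amplify divergence

% weight-initialization spread on a (hypothetical) convolutional layer;
% try an order of magnitude up or down if NaNs persist
layers{2}.initstd = 0.01;
```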


PhD candidate,
Computing and Information Systems,
The University of Melbourne.

Sergey Demyanov http://www.demyanov.net/

@ghost
Copy link
Author

ghost commented Jul 15, 2015

Hi Sergey,

Thanks for your answer; it seems the CNN indeed wasn't converging. I have some decent weights now (I think) after tuning down the alpha and momentum parameters, but I've just run into another problem...

My CNN is a simple classifier (output of size 1). My training and testing sets are half positive and half negative examples. The problem is that after training, the cnntest function returns a prediction vector of all ones, and it is accepted as a valid answer for every sample. I get a 0% error even though my testing set contains negative examples. I can't understand why the CNN only ever answers 'positive', given that my sets are evenly distributed.

Sorry for bothering you with this; I'm still not used to neural networks.

Thanks again,

Quentin

@sdemyanov
Owner

Hi Quentin,

I guess your problem now is that you use the softmax output layer, which is
supposed to output probabilities. You either need to change the type of the
nonlinear function on the last layer and set the loss function to 'squared',
or, preferably, change the format of your labels: use an output of size 2,
where each column corresponds to a particular class and all values are zero
except a single '1' for the correct class.

Regards,
Sergey.
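The label reformatting described above can be sketched in plain MATLAB. Variable names are hypothetical; the only assumption is that the original labels are a column vector of 0/1 class indices:

```matlab
labels = [0; 1; 1; 0];                 % hypothetical binary labels, N x 1
N = numel(labels);
onehot = zeros(N, 2);                  % one column per class
onehot(sub2ind(size(onehot), (1:N)', labels + 1)) = 1;
% onehot is now [1 0; 0 1; 0 1; 1 0]: a single 1 per row, in the
% column of the correct class
```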


@ghost
Author

ghost commented Jul 15, 2015

Hi Sergey,

I'm ashamed I didn't realize this. I really need to work more on neural networks.

Thanks a lot for your help, it was extremely useful.

Good luck with your future plans,

Best regards,

Quentin

@sdemyanov
Owner

No problem, thank you)


@MosTec1991

Hi,

RGB and grayscale images have a pixel range of 0-255. To fix this problem, each pixel must be scaled to the range 0-1. Try this line of code when you import your data:

MyPicData = double(MyPicData) / 255; % cast to double first, so uint8 integer division doesn't round everything to 0 or 1
