
Use of hyperbolic tangent for activation function #11

Open
younghwanoh opened this issue Oct 18, 2018 · 1 comment

Comments

@younghwanoh

Hi,

Thank you for open-sourcing this great idea!
I'm exploring your code and running some experiments with variants.
The first thing I looked at is the activation function.
As written in BNN_cifar10.py, you use the HardTanh function as the activation.
I don't see any description of this in your paper, and though I'm not 100% confident, I found that it significantly affects accuracy.
Keeping ReLU in the BNN, as in its full-precision counterpart, drops top-1 accuracy by about 10%.

[screenshot of results attached]

Do you have any insight about this?
I ask because I've heard that the hyperbolic tangent can be a bad idea as an activation when stacking very deep networks.
I'm a bit concerned about vanishing-gradient problems, etc.
If you could share some of your experience about this (why you chose this particular hyperbolic tangent function, and so on), that would be very nice.

Thanks in advance
OYH

@itayhubara
Owner

HardTanh simply clips the values to be between -1 and 1: everything above 1 is set to 1 and everything below -1 is set to -1. This helps the initial training phase. Since I used BN to normalize the input, I know that most of the input data falls in that range. After that I used the sign function, which actually binarizes the input. If you use the ReLU function, you simply assign everything above 0 to 1 and the rest to zero. If you were to clamp the ReLU values above 1 (the same idea as ReLU6, only with 1 instead of 6) and use a round function instead of sign, you would probably get good results.
All the best, Itay
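
For anyone skimming this thread, here is a minimal sketch of the two binarization schemes Itay describes, assuming a PyTorch setup along the lines of BNN_cifar10.py. The helper names are illustrative rather than taken from the repository, and the straight-through-estimator backward pass used during training is omitted; this only shows the forward mapping.

```python
import torch
import torch.nn as nn

hardtanh = nn.Hardtanh()  # clips activations to [-1, 1]

def binarize_tanh_style(x):
    # After BN most pre-activations already lie in [-1, 1]; HardTanh clips the
    # rest and sign() maps them to {-1, +1}. (torch.sign maps exact zeros to 0;
    # a real implementation would break that tie.)
    return torch.sign(hardtanh(x))

def binarize_relu_style(x):
    # Illustrative variant of Itay's suggestion: clamp a ReLU to [0, 1]
    # (the ReLU6 idea with 1 instead of 6), then round, so activations land
    # in {0, 1} instead of {-1, +1}.
    return torch.round(torch.clamp(x, min=0.0, max=1.0))

x = torch.randn(4, 8)          # stand-in for batch-normalized pre-activations
print(binarize_tanh_style(x))  # values in {-1, +1}
print(binarize_relu_style(x))  # values in {0, 1}
```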
