Hidden factors paper demo #13
base: master
Conversation
Looks good in general! Some comments (the diff is too long for github to display, so I'll comment here):
Why don't you cast the data before turning it into a shared variable? I used
I wouldn't officially encourage leaving out the
You set the batch size to 100, but in the end you compile
Those pesky names again :) But it's your choice, I don't mind too much.
Cool, I like that! Before … And just to make it a little prettier, you could alter the numbers so … Very nice work!
This carries over from the Theano deep learning tutorials, where they also store the labels as floats and use T.cast to convert them to integers at runtime (http://deeplearning.net/tutorial/logreg.html#logreg). This is because supposedly you can only store floats on the GPU (the labels would be stored in host memory otherwise). Are you saying this is not the case? (Or no longer the case, maybe?)
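The pattern from the linked tutorial looks roughly like this (a minimal sketch; the dummy labels and variable names are just for illustration, not taken from the notebook):

```python
import numpy as np
import theano
import theano.tensor as T

# Dummy integer labels, just for illustration.
labels = np.random.randint(0, 10, size=500)

# Store the labels as floatX so the shared variable can live on the GPU
# (GPU storage was limited to float32 back then).
y_shared = theano.shared(np.asarray(labels, dtype=theano.config.floatX),
                         borrow=True)

# Cast back to int32 symbolically; the result is a TensorVariable, so the
# conversion happens on the fly inside the compiled graph.
y = T.cast(y_shared, 'int32')
```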
I don't think stuff in Recipes counts as 'official' in that sense though :) But fair enough, I can add them explicitly. In that case we should probably consider updating all Recipes content to match; I'm pretty sure there are already a few other examples that rely on rectify as the default.
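Adding them explicitly would look something like this (a minimal sketch with placeholder layer sizes, not the ones from the notebook):

```python
from lasagne.layers import InputLayer, DenseLayer
from lasagne.nonlinearities import rectify, softmax

# Spell out the nonlinearity instead of relying on rectify being the default.
l_in = InputLayer(shape=(None, 784))
l_hidden = DenseLayer(l_in, num_units=500, nonlinearity=rectify)
l_out = DenseLayer(l_hidden, num_units=10, nonlinearity=softmax)
```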
It's all dense layers, so I figured it wouldn't be a problem. But you're right in that it doesn't really set a good example. I'll change it.
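For illustration, a sketch of what the change could look like, assuming the point is to leave the batch dimension unspecified (the layer sizes here are made up):

```python
import theano
import lasagne

# Input layer with an unspecified batch dimension (None rather than 100),
# which is fine here since dense layers don't depend on the batch size.
l_in = lasagne.layers.InputLayer(shape=(None, 784))
l_out = lasagne.layers.DenseLayer(l_in, num_units=10,
                                  nonlinearity=lasagne.nonlinearities.softmax)

# The compiled function then accepts batches of any size.
predictions = lasagne.layers.get_output(l_out, deterministic=True)
predict_fn = theano.function([l_in.input_var], predictions)
```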
What's wrong with these names? I think they are there because I was mimicking the old MNIST example code. What should I call them instead?
Good call.
Will take care of that when everything else has been addressed.
I thought so. But you're right:

In [3]: type(theano.shared(np.zeros(1)))
Out[3]: theano.tensor.sharedvar.TensorSharedVariable

In [4]: type(theano.shared(np.zeros(1, dtype='float32')))
Out[4]: theano.sandbox.cuda.var.CudaNdarraySharedVariable

In [5]: type(theano.shared(np.zeros(1, dtype='uint8')))
Out[5]: theano.tensor.sharedvar.TensorSharedVariable

In [6]: type(theano.tensor.cast(theano.shared(np.zeros(1, dtype='float32')), 'int32'))
Out[6]: theano.tensor.var.TensorVariable

In the MNIST example it was different; there I just needed something I can hand to a compiled function, and
Hmm, you're right, the spatial transformer network has it explicitly specified for the classification network, but not for the localization network. The caffe/CIFAR10 example doesn't specify it. Well... maybe just leave it as it is.
Maybe it's just me, but they remind me of Python's
Exactly! It doesn't die! :)
For me,
I will try to finish this soon; I'm just a bit occupied with other things at the moment. I haven't abandoned this PR!
Let me know if there's anything I can help with to get this PR merged :)
Just f0k's comments that need to be addressed. I should have some time to sort this out soon, but feel free to do it if you want this to be merged sooner :)
This is an updated version of the demo for the paper "Discovering hidden factors of variation in deep networks" by @briancheung et al. (http://arxiv.org/abs/1412.6583); it no longer relies on any external code. The previous version relied on an old version of the MNIST example.
The other notebook I did (Highway Networks) is coming soon as well :)