I'm trying to build a custom RNN architecture, and after banging my head against the Keras source code for a while I ended up here. recurrentshop is a super neat and helpful project, and I think it will help me do what I want, but I'm stuck.
I'm trying to build a network with the following architecture -
X - input
H - hidden state
Y - output
t - time step
Xt --> Ht is defined by a weight matrix Wxh - this is 'kernel' in the SimpleRNNCell
H_tm1 --> Ht is defined by a weight matrix Whh - this is 'recurrent_kernel' in the SimpleRNNCell
Ht --> Yt would be defined by a second layer with a matrix Why, since I want a secondary transformation that converts the hidden-state dimension down to a 1-dimensional output at each time step
Y_tm1 --> Ht is the hard part, defined by a matrix Wyh.
If I understand correctly, the architecture is somewhat similar to your readout example. However, I'd like to incorporate Y_tm1 into the state by treating it as a 'first-class' input, like so:
Ht = K.dot(Xt, Wxh) + K.dot(H_tm1, Whh) + K.dot(Y_tm1, Wyh)
Ht = tanh(Ht)
The readout example showed how to add or multiply X by the previous output Y, but I'd also like to learn the Wyh matrix. I think that means I need to include a new Dense() layer somewhere, but I'm having a hard time figuring out where. I'm using this document as a starting point, and I'd appreciate any help you can give! For reference, I tried rewriting the SimpleRNNCell class to include a two-part state (one for Ht and one for the 'hidden' state inside the cell), and ended up with a cryptic Keras error that I didn't understand.
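To make the architecture concrete, here is a minimal sketch of the cell I'm describing, written as a plain tf.keras custom cell rather than with recurrentshop. All the names here (ReadoutRNNCell, units, output_dim) are my own; the idea is just to carry a two-part state [Ht, Yt] so that Y_tm1 is available at the next step and Wyh is a learned weight:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

class ReadoutRNNCell(keras.layers.Layer):
    """Hypothetical RNN cell whose state carries both H_t and Y_t,
    so the previous output Y_tm1 can feed back into the hidden state
    through a learned matrix Wyh."""

    def __init__(self, units, output_dim=1, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.output_dim = output_dim
        # Two-part state: hidden state H (units) and previous output Y (output_dim)
        self.state_size = [units, output_dim]
        self.output_size = output_dim

    def build(self, input_shape):
        input_dim = input_shape[-1]
        self.Wxh = self.add_weight(shape=(input_dim, self.units), name="Wxh")
        self.Whh = self.add_weight(shape=(self.units, self.units), name="Whh")
        self.Wyh = self.add_weight(shape=(self.output_dim, self.units), name="Wyh")
        self.Why = self.add_weight(shape=(self.units, self.output_dim), name="Why")
        self.bh = self.add_weight(shape=(self.units,), initializer="zeros", name="bh")

    def call(self, inputs, states):
        h_tm1, y_tm1 = states
        # Ht = tanh(Xt.Wxh + H_tm1.Whh + Y_tm1.Wyh + bh)
        h = tf.tanh(
            tf.matmul(inputs, self.Wxh)
            + tf.matmul(h_tm1, self.Whh)
            + tf.matmul(y_tm1, self.Wyh)
            + self.bh
        )
        # Yt = Ht.Why -- the per-step readout to a 1-dimensional output
        y = tf.matmul(h, self.Why)
        return y, [h, y]

# Wrap the cell in keras.layers.RNN to unroll it over the time dimension
layer = keras.layers.RNN(ReadoutRNNCell(units=16), return_sequences=True)
x = np.random.rand(2, 5, 3).astype("float32")  # (batch, time, features)
out = layer(x)
print(out.shape)  # (2, 5, 1)
```

Since keras.layers.RNN initializes every entry of state_size to zeros by default, Y_tm1 starts at zero for the first time step, which seems like a reasonable choice here.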