RuntimeError: output with shape [256] doesn't match the broadcast shape [256, 256] #234
Comments
Hi Ajay, you may be running with the data shaped differently than we expect; see benchmark.py:125. Also, there is the LSTM example for a different sequence task here, which may be helpful.
Hi Jason,
The shape looks reasonable to me. Can you check whether your code matches the code block from the previous issue #225? That code works with the latest neurobench package (1.0.6) and with any arbitrary batch size. If there are still issues, please post your code block so we can inspect the error.
Ohh, I see. I don't have the latest version. How do I get it? Do I run .bumpversion.toml?
Or, if you are using Poetry and a locally cloned repo, then simply:
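The exact commands were in code blocks that did not survive the export; they were presumably along these lines (the package name on PyPI and the repo layout are assumptions):

```shell
# If installed from PyPI, upgrade in place:
pip install --upgrade neurobench

# Or, with a locally cloned repo managed by Poetry, pull and reinstall:
git pull
poetry install
```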
Still getting the same issue. Can you tell me which code has been modified? I'll check whether the changes have been updated.
Yes, the minimal example code runs. Here's my model definition:
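(The original code block was lost in export. Based on the discussion that follows, it was an LSTMCell plus MLP layers whose forward call did not pass (h, c); a minimal sketch along those lines, with all layer sizes and names being assumptions:)

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    """Hypothetical reconstruction: LSTMCell followed by an MLP head."""

    def __init__(self, input_dim=96, hidden_dim=256, output_dim=2):
        super().__init__()
        self.lstm = nn.LSTMCell(input_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Note: (h, c) is not passed into the LSTMCell here, so the cell
        # falls back to zero-initialized state on every call.
        h, c = self.lstm(x)
        return self.fc(h)
```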
This is the benchmark code:
Hi Ajay, I noticed that your LSTMCell forward call does not include (h, c) in the inputs. Based on the documentation, if these are not included, I believe the recurrent state of the LSTM is not tracked at all, and the LSTM block is essentially just an MLP-type transform. I may be wrong on this, though. Regardless, note that all of our other LSTM examples use the (h, c) forward convention for the LSTMCell. With additions to your model definition as shown in the code block below, there is no longer a harness runtime error:
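(The corrected code block was also lost in export. A sketch of the kind of change described, keeping (h, c) as state on the module and passing it into the LSTMCell on each call; sizes and attribute names are assumptions:)

```python
import torch
import torch.nn as nn

class LSTMModel(nn.Module):
    """Hypothetical reconstruction with explicit (h, c) tracking."""

    def __init__(self, input_dim=96, hidden_dim=256, output_dim=2):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.lstm = nn.LSTMCell(input_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, output_dim)
        self.h, self.c = None, None

    def reset_state(self):
        # Clear recurrent state between sequences.
        self.h, self.c = None, None

    def forward(self, x):
        if self.h is None:
            # Zero-initialize state, matching LSTMCell's documented default.
            self.h = torch.zeros(x.shape[0], self.hidden_dim, device=x.device)
            self.c = torch.zeros(x.shape[0], self.hidden_dim, device=x.device)
        # Pass (h, c) in so the recurrent state is carried across timesteps.
        self.h, self.c = self.lstm(x, (self.h, self.c))
        return self.fc(self.h)
```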
The harness should be able to support the case where (h, c) is not passed into the LSTMCell, so this is still an issue, but I recommend that you include (h, c) in the inputs.
Aah, I see. I read somewhere in the documentation that LSTMs by default initialize their hidden and cell states to tensors of zeros; that's why I didn't explicitly add them. Thanks a lot!!
Also, will I have to retrain my models with these changes incorporated? I just changed the model but loaded the same weights I had before the explicit h and c definition, and the neurobench benchmarks are running fine.
My guess is that you will need to retrain the model, as it is now tracking recurrent state when it wasn't before. I suggest that you take out all of the metrics except the R2 workload metric and first verify that you are getting the expected accuracy before considering the compute complexity.
Alright, thanks a lot!
TODO: support synops for RNNCells which do not use recurrent input
I have trained a recurrent network using an LSTMCell and MLP layers, but when I load the model and its weights to run the benchmark, I get "RuntimeError: output with shape [256] doesn't match the broadcast shape [256, 256]". Tracing it backwards, it originates from line 291 of utils.py (out += biases). Printing the shapes of out and biases gives [256] and [256, 1] respectively. Squeezing the second dimension out of biases resolves the issue, but I am unsure whether the mistake is in the benchmark code or in how my model is defined. I hit a similar issue when using a GRUCell. Can I please get some help?
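For reference, the failure mode described above can be reproduced in isolation. The shapes are taken from the report; the tensors themselves are stand-ins for the harness's internal `out` and `biases`:

```python
import torch

out = torch.zeros(256)        # shape [256], as printed in the report
biases = torch.zeros(256, 1)  # shape [256, 1]

# The in-place add broadcasts [256] against [256, 1] to [256, 256],
# which cannot be written back into a [256] buffer, so PyTorch raises
# RuntimeError: output with shape [256] doesn't match the broadcast
# shape [256, 256].
try:
    out += biases
except RuntimeError as e:
    print(e)

# Squeezing the trailing dimension makes the shapes line up:
out += biases.squeeze(-1)
```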