I am trying to use this repository to train a language model with an additional input.
My data looks like this:
The labels look like this:
Since my objective is quite different from the original training script, I implemented the training from scratch. However, I noticed that it takes much longer than a simple LSTM model to become somewhat decent, and the outputs are still not fully coherent language even after 15 epochs on 2 million sentences. I am getting outputs that look like this:
Gold label:
In most cases , accurate results can only be achieved after a laborious and expensive trial and error process .
Output:
only most accurate cases can be achieved after a laborious error and process results In trial and expensive suit.
Currently I am using a small model with 4 layers, each with 2 attention heads.
I randomly initialized the position encodings and multiplied them by 0.1 to match the variance of my word embeddings.
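To make the setup concrete, here is a minimal sketch of that initialization in PyTorch. The sizes (`vocab_size`, `max_len`, `d_model`) are hypothetical placeholders, since the issue does not state them:

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration; not taken from the actual setup.
vocab_size, max_len, d_model = 10000, 256, 512

# Word embeddings with PyTorch's default initialization.
tok_emb = nn.Embedding(vocab_size, d_model)

# Randomly initialized learned position encodings, scaled by 0.1 so
# their scale is closer to that of the word embeddings.
pos_emb = nn.Parameter(torch.randn(max_len, d_model) * 0.1)

# Input to the first transformer layer: token + position embeddings.
tokens = torch.randint(0, vocab_size, (2, max_len))  # (batch, seq_len)
x = tok_emb(tokens) + pos_emb  # pos_emb broadcasts over the batch dim
```

This is just how I understand the description; the actual code may differ.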
Any ideas what I could have missed?
Here is some of my code: