You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am trying to reproduce the Mistral-7B-SPPO Iter1 model. However, after my first iteration, the model I trained diverged significantly from the published Mistral-7B-SPPO Iter1 model when comparing the results on a benchmark dataset.
To help with diagnosing the issue and improving my training, could you kindly provide the training dataset and the loss log so that I can compare my run to?
I was using this prompt dataset, but it doesn't contains the columns chosen, rejected, chosen_probs, chosen_probs_win, chosen_probs_lose. While running the generate.sh script can produce these columns, it would be incredibly helpful if the actual training dataset could be released. This would make it much easier for me to debug and identify where things might have gone wrong in my training process.
The text was updated successfully, but these errors were encountered:
I am trying to reproduce the Mistral-7B-SPPO Iter1 model. However, after my first iteration, the model I trained diverged significantly from the published Mistral-7B-SPPO Iter1 model when comparing the results on a benchmark dataset.
To help with diagnosing the issue and improving my training, could you kindly provide the training dataset and the loss log so that I can compare my run to?
I was using this prompt dataset, but it doesn't contains the columns
chosen, rejected, chosen_probs, chosen_probs_win, chosen_probs_lose
. While running the generate.sh script can produce these columns, it would be incredibly helpful if the actual training dataset could be released. This would make it much easier for me to debug and identify where things might have gone wrong in my training process.The text was updated successfully, but these errors were encountered: