There has been a significant refactoring of the loss functions in transformers 4.46 that will render the cross entropy patching ineffective. We need a different `ModelPatcherRule` for the new transformers version. CC: @anhuong. See huggingface/transformers#34191.

So now there are 3 possibilities:
1. A `custom_loss_function` is passed into `Trainer`.
2. The model has migrated to the `custom_loss_function` API.
3. The model has not migrated (like Granite now).

For 3: this is the easy one, because it means no code changes.

For 1: I'm thinking we do not patch anything, because if a user wants to do this, we can't control what loss function they use.

For 2: in this case we want to patch `fixed_cross_entropy`, but this should be done on a per-model basis. So we need to somehow have the model instantiate the loss function, e.g., `ForCausalLMLoss`, patch `fixed_cross_entropy` only during this instantiation process, and restore the original after it is done.
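The per-model patching described for case 2 could be sketched roughly as below. This is only an illustration under stated assumptions: `loss_utils` is a stand-in namespace, not the real transformers module, and `temporarily_patched`, `build_causal_lm_loss`, `reference_cross_entropy`, and `fused_cross_entropy` are hypothetical names. It also assumes the loss is built by a factory that captures `fixed_cross_entropy` at build time; a loss function that looks the name up at call time would see the restored original instead.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporarily_patched(owner, name, replacement):
    """Swap owner.<name> for replacement and restore the original on exit."""
    original = getattr(owner, name)
    setattr(owner, name, replacement)
    try:
        yield
    finally:
        # Always put the original back, even if building the loss fails.
        setattr(owner, name, original)


def reference_cross_entropy(logits, labels):
    # Placeholder for the stock implementation (hypothetical).
    return "reference"


def fused_cross_entropy(logits, labels):
    # Placeholder for the accelerated replacement (hypothetical).
    return "fused"


# Stand-in for the module holding the loss helpers (assumption, not the real one).
loss_utils = SimpleNamespace(fixed_cross_entropy=reference_cross_entropy)


def build_causal_lm_loss():
    # Capture whichever implementation is installed at build time.
    cross_entropy = loss_utils.fixed_cross_entropy

    def causal_lm_loss(logits, labels):
        return cross_entropy(logits, labels)

    return causal_lm_loss


# Patch only while this particular model's loss is being built.
with temporarily_patched(loss_utils, "fixed_cross_entropy", fused_cross_entropy):
    patched_loss = build_causal_lm_loss()

# Any loss built afterwards sees the original function again.
unpatched_loss = build_causal_lm_loss()
```

The `try`/`finally` keeps the patch from leaking to other models if the build step raises, which is the main reason to prefer a context manager over a bare assignment here.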
fabianlim changed the title from "FOAK Cross Entropy Loss Will Not Work with New Loss Functions" to "FOAK Cross Entropy Loss Will Not Work with New Loss Functions After Transformers 4.46" on Oct 29, 2024