Since LLaMA models use left padding, the supervised-training DialogueDataCollator ends up padding label_mask in the opposite direction from tokenizer.pad (which left-pads input_ids and attention_mask), because the label_mask tensors are right-padded before torch.stack(label_mask).
Printing out the dataloader output in trainer_sft.py also confirms the issue; a minimal sketch of the misalignment (with made-up tensors instead of a real tokenizer) is shown below.
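```python
# Minimal sketch of the alignment bug (hypothetical tensors, no real tokenizer).
# With padding_side="left" the real tokens sit at the right end of input_ids,
# but a collator that right-pads label_mask before torch.stack leaves the mask
# anchored at the left end, so mask positions no longer line up with tokens.
import torch
import torch.nn.functional as F

pad_id, max_len = 0, 6

# Two examples of different lengths: token ids and a per-token label mask.
input_ids = [torch.tensor([11, 12, 13, 14]), torch.tensor([21, 22])]
label_mask = [torch.tensor([0, 0, 1, 1]), torch.tensor([0, 1])]

# tokenizer.pad with padding_side="left" (LLaMA default) -> pad on the LEFT.
padded_ids = torch.stack(
    [F.pad(t, (max_len - len(t), 0), value=pad_id) for t in input_ids]
)

# The collator right-pads the mask before stacking -> pad on the RIGHT.
padded_mask = torch.stack(
    [F.pad(m, (0, max_len - len(m)), value=0) for m in label_mask]
)

print(padded_ids)
# tensor([[ 0,  0, 11, 12, 13, 14],
#         [ 0,  0,  0,  0, 21, 22]])
print(padded_mask)
# tensor([[0, 0, 1, 1, 0, 0],   <- marks padding positions, not tokens 13/14
#         [0, 1, 0, 0, 0, 0]])  <- marks a padding position, not token 22
```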
I think padding_side is never set to "right" anywhere in the trainer_sft.py pipeline, so by default the LLaMA models we have trained so far are likely slightly faulty.
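One possible workaround, assuming the pipeline loads a Hugging Face tokenizer (the checkpoint name below is a placeholder, not the repo's actual config), is to force right padding so the tokenizer-padded tensors and the stacked label_mask agree on where padding goes:

```python
from transformers import AutoTokenizer

# Hypothetical checkpoint path; substitute whatever the training config uses.
tokenizer = AutoTokenizer.from_pretrained("path/to/llama-checkpoint")
tokenizer.padding_side = "right"
```

Alternatively, the collator could left-pad label_mask to match the tokenizer's padding side instead of changing the tokenizer.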