Train dtype sampling during training and clarifications. #360
Unanswered
Thomas2419
asked this question in
Q&A
Replies: 0 comments
Hello! I noticed that sampling during training uses the train_data dtype. Is there any particular reason for using that instead of the model dtype? Does the train data dtype mainly affect the dtype of the input data and the gradients, as opposed to directly affecting the dtype of the model weights? That is what I had gathered, but I wanted to verify.
Specifically, I saw it stated: "Internally, this sets the mixed precision data type when doing the forward pass through the model. This setting trades precision for speed during training."
So, as I understand it, the train dtype is useful for keeping the model weights in their stored precision while making less accurate updates: in effect, the forward pass runs like inference at the specified dtype while the weights stay in the other dtype.
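To make sure I'm describing the same thing, here is a minimal sketch of what I think is happening. It assumes the train dtype ends up as the dtype passed to PyTorch's `torch.autocast`. That is my guess from the quoted sentence, not something I have confirmed in the code. The names `weight_dtype` and `train_dtype` are placeholders I made up for illustration, and the tiny `nn.Linear` stands in for the real model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the trainer's settings (not the real option names).
weight_dtype = torch.float32   # dtype the model weights are stored in
train_dtype = torch.bfloat16   # dtype assumed to be used for the forward pass

# A toy model in place of the real network.
model = nn.Linear(8, 4).to(weight_dtype)
x = torch.randn(2, 8)

# Forward pass under autocast: eligible ops such as linear run in train_dtype,
# but the stored parameters are not converted.
with torch.autocast(device_type="cpu", dtype=train_dtype):
    out = model(x)

# Compute the loss in float32 and backpropagate.
loss = out.float().pow(2).mean()
loss.backward()

print("weights:     ", model.weight.dtype)
print("activations: ", out.dtype)
print("weight grads:", model.weight.grad.dtype)
```

If my understanding is right, the weights and their gradients should report the weight dtype while the activations report the train dtype. Sampling under the same autocast context would then just be a forward pass at the train dtype, which is why I'm asking whether that is the intent.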