Does it work on FP16 with the latest NVIDIA amp? #36
Comments
I will need more information to help you debug. Could you please include the command you're running?
Yes. I used https://dl.fbaipublicfiles.com/fairseq/models/spanbert_hf.tar.gz to run Hugging Face's run_squad, and it failed.
We haven't tested this with the new HF code. ICYMI, there's a run_squad.py in this repo. I'd recommend using that.
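(For context, loading the HF-format checkpoint from that tarball with the transformers library would look roughly like the sketch below. The extraction path is hypothetical, and the tokenizer choice is an assumption: the SpanBERT checkpoints are cased, so a cased BERT tokenizer is used here since SpanBERT ships no tokenizer of its own.)

```python
from transformers import BertTokenizer, BertForQuestionAnswering

# Hypothetical path: wherever spanbert_hf.tar.gz was extracted.
model_dir = "./spanbert_hf"

# Assumption: SpanBERT reuses BERT's cased WordPiece vocabulary.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForQuestionAnswering.from_pretrained(model_dir)
```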
When I tried to fine-tune SpanBERT-large on my own task using fp16 with the "amp" module from Apex, I hit a similar error. I am using the code from Hugging Face and tried both "SpanBERT/spanbert-large-cased" and the model binaries provided in this repo. Both give the identical result: the trainer keeps reporting gradient overflow and rescaling the loss. Interestingly, when I replaced SpanBERT-large with BERT base/large or the SpanBERT-base model, those models worked perfectly and achieved the expected results. SpanBERT-large also works very well when I turn off fp16 training. I also found a report of someone who cannot run fp16 with SpanBERT on another task. In conclusion, I suspect that the SpanBERT-large model may not work with the new NVIDIA amp.
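(A minimal sketch of an Apex amp fp16 training step may help clarify where the repeated overflow messages come from. The tiny linear model and random data below are placeholders, not the actual SpanBERT setup; only the amp calls reflect the real API.)

```python
import torch
from apex import amp  # NVIDIA Apex; assumes it is installed with a CUDA build

# Placeholder model and optimizer; the real case is SpanBERT-large.
model = torch.nn.Linear(16, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# O1 patches ops to run in mixed precision; dynamic loss scaling is on by default.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

for step in range(10):
    inputs = torch.randn(8, 16).cuda()
    targets = torch.randint(0, 2, (8,)).cuda()
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    # amp multiplies the loss by the current scale before backward. If any
    # fp16 gradient overflows to inf/NaN, amp prints "Gradient overflow.
    # Skipping step, loss scaler ... reducing loss scale ..." and skips the
    # update instead of applying it.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
```

With dynamic loss scaling, an occasional skipped step is normal; the pathological case reported here is that every step overflows, so the loss scale keeps halving and no parameter update ever lands, which is why the model never trains.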
I've met the same issue here. Is there any solution yet?