The wiki recommends a batch size of 128 for 'stable training'.
It would be helpful to have the option to accumulate gradients so that bicleaner-ai training with a larger "effective batch size" is possible on GPUs with a relatively small amount of RAM.
Fairseq calls this option "--update-freq"
Sockeye calls this option "--update-interval"
Hi @radinplaid, I agree, and I've been thinking about it since I built the tool. Unfortunately, TensorFlow does not support gradient accumulation natively, so it would require replacing TensorFlow's built-in training loop with a custom one. Maybe at some point I'll have time to implement it. I'm glad to accept PRs if someone wants to write it.
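For anyone considering a PR, the core idea is simple and framework-agnostic: sum the gradients from several micro-batches and apply one averaged optimizer update, which emulates a batch `accum_steps` times larger. Below is a minimal sketch using a hand-derived gradient for a toy linear model `y = w*x`; the function names (`grad`, `train`) and hyperparameters are hypothetical illustrations, not bicleaner-ai or TensorFlow code.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def grad(w, x, y):
    # d/dw of mean((w*x_i - y_i)^2) over one micro-batch
    return mean([2 * (w * xi - yi) * xi for xi, yi in zip(x, y)])

def train(w, micro_batches, accum_steps, lr):
    accum = 0.0
    for step, (x, y) in enumerate(micro_batches, start=1):
        accum += grad(w, x, y)             # accumulate instead of applying
        if step % accum_steps == 0:
            w -= lr * accum / accum_steps  # one averaged update, then reset
            accum = 0.0
    return w

# Toy data: 40 micro-batches of 32 samples drawn from y = 3x,
# updated every 4 micro-batches (effective batch size 128).
random.seed(0)
micro_batches = []
for _ in range(40):
    x = [random.gauss(0, 1) for _ in range(32)]
    micro_batches.append((x, [3.0 * xi for xi in x]))

w = train(0.0, micro_batches, accum_steps=4, lr=0.3)
```

In a real TensorFlow implementation the same pattern would apply: collect per-variable gradients across micro-batches and call the optimizer once per `accum_steps` steps, which is exactly why the built-in training loop has to be replaced with a custom one.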