Hi @cinjon, did you figure it out? It's confusing. Also, the actual batch size seems to be 256 (2 * 8 * 16), so there should be about 232 steps for 1 epoch.
Hi, I saw your training curve for Gemma 9b SimPO here: https://wandb.ai/yumeng0818/simpo/runs/4w25j650?nw=nwuseryumeng0818.
How is it that there are only 92 steps? At a batch size of 128, that would be only about 11.8k examples seen in total (92 * 128), but there are ~60k in the dataset.
Thanks.
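
For anyone checking the numbers in this thread, here is a rough sketch of the step arithmetic. It assumes a dataset of exactly 60,000 examples (the post only says "~60k") and it assumes the factors in 2 * 8 * 16 are per-device batch size, number of devices, and gradient accumulation steps, which the thread does not confirm. The helper names are made up for illustration.

```python
import math


def steps_per_epoch(num_examples: int, effective_batch_size: int) -> int:
    """Optimizer steps needed to see every example once (a final partial batch counts)."""
    return math.ceil(num_examples / effective_batch_size)


def examples_seen(num_steps: int, effective_batch_size: int) -> int:
    """Upper bound on examples consumed after a given number of optimizer steps."""
    return num_steps * effective_batch_size


# Assumption: "~60k" rounded to 60,000; the exact dataset size is not given in the thread.
approx_dataset_size = 60_000

# Assumption: interpretation of the 2 * 8 * 16 factors mentioned in the reply.
per_device_batch = 2
num_devices = 8
grad_accum_steps = 16
effective_256 = per_device_batch * num_devices * grad_accum_steps

print(examples_seen(92, 128))                       # examples covered by 92 steps at batch size 128
print(steps_per_epoch(approx_dataset_size, 128))    # steps for one epoch at batch size 128
print(steps_per_epoch(approx_dataset_size, effective_256))  # steps for one epoch at batch size 256
print(examples_seen(92, effective_256))             # examples covered by 92 steps at batch size 256
```

Under these assumptions, 92 steps fall well short of one epoch at either batch size. The result for 256 comes out slightly above the ~232 quoted in the reply only because 60,000 is a rounded dataset size.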