Finetune #83
Comments
Same problem here. On my custom dataset I also get "tokenization mismatch:", and the loss is 0. QAQ
By "model vision", do you mean --version v1?
yes
Thanks! After changing it I still get the same problem.
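After I changed this version to gemma, it worked. Your error may be caused by image paths in your JSON file that do not exist. I filtered the instruction file and kept only the entries whose image path is under coco/train2017, and then it no longer raised errors.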
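The filtering step described in the comment above could look roughly like the sketch below. It is not the commenter's actual script. It assumes the instruction file is a JSON list of records with an optional "image" field holding a path relative to the image folder; the function names and the `required_prefix` parameter are hypothetical.

```python
import json
import os


def keep_record(record, image_root, required_prefix=None):
    """Decide whether one instruction record should be kept.

    Assumption: the image path lives under the "image" key and is
    relative to image_root.
    """
    image = record.get("image")
    if image is None:
        # Text-only record: keep it only when no path prefix is required.
        return required_prefix is None
    if required_prefix is not None and not image.startswith(required_prefix):
        return False
    # Drop records whose image file is missing on disk.
    return os.path.isfile(os.path.join(image_root, image))


def filter_instruction_file(src_json, dst_json, image_root, required_prefix=None):
    """Write a filtered copy of src_json and return (total, kept) counts."""
    with open(src_json, encoding="utf-8") as f:
        records = json.load(f)
    kept = [r for r in records if keep_record(r, image_root, required_prefix)]
    with open(dst_json, "w", encoding="utf-8") as f:
        json.dump(kept, f, ensure_ascii=False)
    return len(records), len(kept)
```

Calling `filter_instruction_file(src, dst, image_root, "coco/train2017/")` would keep only records whose image is under coco/train2017 and actually exists, which matches the filtering described here; the output file name and image root are up to you.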
OK, thank you. QAQ
Hi, I am fine-tuning MGM-2B on COCO, but I get the following warnings:
{'loss': 6.9221, 'grad_norm': tensor(18.7422, device='cuda:0', dtype=torch.float64), 'learning_rate': 9.203084832904885e-06, 'epoch': 0.01}
1%|██▌ | 179/12941 [19:58<23:33:39, 6.65s/it]
WARNING: tokenization mismatch: 330 vs. 333. (ignored)
WARNING: tokenization mismatch: 309 vs. 312. (ignored)
WARNING: tokenization mismatch: 438 vs. 446. (ignored)
WARNING: tokenization mismatch: 80 vs. 82. (ignored)
WARNING: tokenization mismatch: 388 vs. 395. (ignored)
WARNING: tokenization mismatch: 84 vs. 86. (ignored)
WARNING: tokenization mismatch: 222 vs. 226. (ignored)
WARNING: tokenization mismatch: 207 vs. 211. (ignored)
WARNING: tokenization mismatch: 86 vs. 88. (ignored)
WARNING: tokenization mismatch: 140 vs. 147. (ignored)
WARNING: tokenization mismatch: 155 vs. 163. (ignored)
WARNING: tokenization mismatch: 283 vs. 288. (ignored)
WARNING: tokenization mismatch: 545 vs. 549. (ignored)
WARNING: tokenization mismatch: 543 vs. 546. (ignored)
Is there any problem with the tokenizer_config or minigemini_instruction.json?
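One way to check whether these mismatches are systematic is to parse the warnings and look at the gap between the two numbers. The sketch below does only that; the helper name `mismatch_gaps` is hypothetical, and the input is the set of warnings copied from the log in this issue.

```python
import re

# Warnings copied from the training log in this issue.
LOG = """
WARNING: tokenization mismatch: 330 vs. 333. (ignored)
WARNING: tokenization mismatch: 309 vs. 312. (ignored)
WARNING: tokenization mismatch: 438 vs. 446. (ignored)
WARNING: tokenization mismatch: 80 vs. 82. (ignored)
WARNING: tokenization mismatch: 388 vs. 395. (ignored)
WARNING: tokenization mismatch: 84 vs. 86. (ignored)
WARNING: tokenization mismatch: 222 vs. 226. (ignored)
WARNING: tokenization mismatch: 207 vs. 211. (ignored)
WARNING: tokenization mismatch: 86 vs. 88. (ignored)
WARNING: tokenization mismatch: 140 vs. 147. (ignored)
WARNING: tokenization mismatch: 155 vs. 163. (ignored)
WARNING: tokenization mismatch: 283 vs. 288. (ignored)
WARNING: tokenization mismatch: 545 vs. 549. (ignored)
WARNING: tokenization mismatch: 543 vs. 546. (ignored)
"""

PATTERN = re.compile(r"tokenization mismatch: (\d+) vs\. (\d+)")


def mismatch_gaps(log_text):
    """Return (second - first) for every mismatch warning in log_text."""
    return [int(b) - int(a) for a, b in PATTERN.findall(log_text)]


gaps = mismatch_gaps(LOG)
print(gaps)                  # [3, 3, 8, 2, 7, 2, 4, 4, 2, 7, 8, 5, 4, 3]
print(min(gaps), max(gaps))  # 2 8
```

In this sample the second number is always larger, by 2 to 8 tokens. A consistent, small offset like this may point to a conversation-template setting (such as --version) that does not match the tokenizer rather than to corrupt data, but that is an interpretation and is not confirmed by the log itself.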