
Finetune #83

Open
ZhangScream opened this issue Apr 22, 2024 · 7 comments

Comments

@ZhangScream

Hi, I fine-tuned MGM-2B on COCO, but I got these warnings:
{'loss': 6.9221, 'grad_norm': tensor(18.7422, device='cuda:0', dtype=torch.float64), 'learning_rate': 9.203084832904885e-06, 'epoch': 0.01}
1%|██▌ | 179/12941 [19:58<23:33:39, 6.65s/it]
WARNING: tokenization mismatch: 330 vs. 333. (ignored)
WARNING: tokenization mismatch: 309 vs. 312. (ignored)
WARNING: tokenization mismatch: 438 vs. 446. (ignored)
WARNING: tokenization mismatch: 80 vs. 82. (ignored)
WARNING: tokenization mismatch: 388 vs. 395. (ignored)
WARNING: tokenization mismatch: 84 vs. 86. (ignored)
WARNING: tokenization mismatch: 222 vs. 226. (ignored)
WARNING: tokenization mismatch: 207 vs. 211. (ignored)
WARNING: tokenization mismatch: 86 vs. 88. (ignored)
WARNING: tokenization mismatch: 140 vs. 147. (ignored)
WARNING: tokenization mismatch: 155 vs. 163. (ignored)
WARNING: tokenization mismatch: 283 vs. 288. (ignored)
WARNING: tokenization mismatch: 545 vs. 549. (ignored)
WARNING: tokenization mismatch: 543 vs. 546. (ignored)
Is there any problem with the tokenizer_config or minigemini_instruction.json?

@HongLouyemeng

HongLouyemeng commented Apr 23, 2024

> Hi, I fine-tuned MGM-2B on COCO, but I got these warnings: WARNING: tokenization mismatch: 330 vs. 333. (ignored) [...] Is there any problem with the tokenizer_config or minigemini_instruction.json?

Same problem here: I get the tokenization mismatch on a custom dataset, and the loss is 0. QAQ
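
For context: in LLaVA-style preprocessing (which MGM follows), this warning is raised when the token count accumulated while masking the conversation rounds does not match the real tokenized length; the whole sample's labels are then masked out, which is why the loss can collapse to 0 when every sample mismatches. A minimal sketch of that check, with illustrative names rather than the exact MGM code:

```python
# Minimal sketch (illustrative names, not the exact MGM code) of the check
# behind "WARNING: tokenization mismatch".
IGNORE_INDEX = -100

def build_targets(input_ids, round_lens, instruction_lens, pad_id):
    """round_lens[i] / instruction_lens[i]: token counts per conversation round."""
    target = list(input_ids)
    total_len = sum(1 for t in input_ids if t != pad_id)  # real tokenized length

    cur_len = 1  # skip BOS
    for r_len, i_len in zip(round_lens, instruction_lens):
        # mask the instruction/prompt part so only the answer tokens are supervised
        for j in range(cur_len, min(cur_len + i_len, len(target))):
            target[j] = IGNORE_INDEX
        cur_len += r_len

    if cur_len != total_len:
        # template and tokenizer disagree about lengths: drop the whole sample
        # from the loss, so with many mismatches the loss can go to ~0
        target = [IGNORE_INDEX] * len(target)
        print(f"WARNING: tokenization mismatch: {cur_len} vs. {total_len}. (ignored)")
    return target
```

The counts usually diverge because the conversation template selected by --version does not match the model's tokenizer, which is why switching the template (gemma for MGM-2B, as discussed below) makes the warning disappear.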

@HongLouyemeng

HongLouyemeng commented Apr 23, 2024

Changing the model version fixed it for me, but during training I now see: Error in loading 1833, retrying...
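
For what it's worth, the "Error in loading 1833, retrying..." line typically comes from a dataset loader that catches exceptions while building a sample (for example, a missing or unreadable image file) and retries with a different index instead of crashing the run. A minimal sketch under that assumption; the class name and loader function here are illustrative, not the actual MGM dataset code:

```python
import random

class RetryingDataset:
    """Illustrative wrapper: on a bad sample (e.g. a missing image), log and
    retry with another index instead of stopping the whole training run."""

    def __init__(self, samples, load_fn, max_retries=10):
        self.samples = samples        # e.g. entries from the instruction json
        self.load_fn = load_fn        # loads the image + tokenizes text for one entry
        self.max_retries = max_retries

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        for _ in range(self.max_retries):
            try:
                return self.load_fn(self.samples[i])
            except Exception:
                print(f"Error in loading {i}, retrying...")
                i = random.randint(0, len(self.samples) - 1)
        raise RuntimeError(f"no valid sample after {self.max_retries} retries")
```

If the message keeps repeating, the entries behind those indices most likely reference image files that are not present locally, which matches the suggestion later in this thread to filter the instruction json.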

@ZhangScream
Author

By "model version" do you mean --version v1?

@HongLouyemeng

> By "model version" do you mean --version v1?

yes

@ZhangScream
Author

> By "model version" do you mean --version v1?
>
> yes

Thanks! After changing it, I still get the same problem.

@ZhangScream
Author

> Changing the model version fixed it for me, but during training I now see: Error in loading 1833, retrying...

After I changed the version to gemma it worked. Your error is probably caused by image paths in your json file that don't exist; I filtered their instruction file to keep only the entries under coco/train2017, and it no longer errors.
{'loss': 17.621, 'grad_norm': 1097.9221859404172, 'learning_rate': 5.141388174807198e-08, 'epoch': 0.0}
{'loss': 19.8897, 'grad_norm': 1191.090334852173, 'learning_rate': 1.0282776349614396e-07, 'epoch': 0.0}
{'loss': 17.617, 'grad_norm': 1020.9409156811793, 'learning_rate': 1.5424164524421595e-07, 'epoch': 0.0}
{'loss': 18.9953, 'grad_norm': 1134.2098670413347, 'learning_rate': 2.0565552699228793e-07, 'epoch': 0.0}
{'loss': 18.5601, 'grad_norm': 1084.7206032578688, 'learning_rate': 2.5706940874035993e-07, 'epoch': 0.0}
{'loss': 17.2726, 'grad_norm': 1040.5268230990873, 'learning_rate': 3.084832904884319e-07, 'epoch': 0.0}
{'loss': 18.7021, 'grad_norm': 1164.2017706625636, 'learning_rate': 3.598971722365039e-07, 'epoch': 0.0}
{'loss': 18.1496, 'grad_norm': 1044.5027659595357, 'learning_rate': 4.1131105398457585e-07, 'epoch': 0.0}
{'loss': 15.6641, 'grad_norm': 1037.2660428641016, 'learning_rate': 4.6272493573264783e-07, 'epoch': 0.0}
{'loss': 18.7966, 'grad_norm': 1129.7304014927308, 'learning_rate': 5.141388174807199e-07, 'epoch': 0.0}
{'loss': 17.0625, 'grad_norm': 1024.579069798291, 'learning_rate': 5.655526992287918e-07, 'epoch': 0.0}
{'loss': 14.5877, 'grad_norm': 830.1190842498937, 'learning_rate': 6.169665809768638e-07, 'epoch': 0.0}
{'loss': 15.3645, 'grad_norm': 843.60380558662, 'learning_rate': 6.683804627249357e-07, 'epoch': 0.0}
{'loss': 12.0782, 'grad_norm': 325.8663246509485, 'learning_rate': 7.197943444730078e-07, 'epoch': 0.0}
{'loss': 11.7363, 'grad_norm': 385.4385170199879, 'learning_rate': 7.712082262210797e-07, 'epoch': 0.0}
{'loss': 11.5988, 'grad_norm': 380.7032156615642, 'learning_rate': 8.226221079691517e-07, 'epoch': 0.0}
{'loss': 10.91, 'grad_norm': 260.234503595527, 'learning_rate': 8.740359897172238e-07, 'epoch': 0.0}
{'loss': 11.4197, 'grad_norm': 353.49476985450883, 'learning_rate': 9.254498714652957e-07, 'epoch': 0.0}
{'loss': 11.053, 'grad_norm': 326.06150885488125, 'learning_rate': 9.768637532133676e-07, 'epoch': 0.0}
{'loss': 9.7236, 'grad_norm': 144.73662083460025, 'learning_rate': 1.0282776349614397e-06, 'epoch': 0.0}
{'loss': 10.4556, 'grad_norm': 275.81074270511215, 'learning_rate': 1.0796915167095116e-06, 'epoch': 0.0}
{'loss': 10.1932, 'grad_norm': 258.5231596765796, 'learning_rate': 1.1311053984575837e-06, 'epoch': 0.0}
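
For reference, the filtering described above can be done with a short script that keeps only entries whose image path is under coco/train2017 and actually exists on disk. The data root and the "image" field name follow the usual LLaVA/MGM instruction-json layout and are assumptions; adjust them to your setup (note this sketch also drops text-only entries that have no image field):

```python
import json
import os

DATA_ROOT = "data"                               # assumed image data root; adjust
SRC = "minigemini_instruction.json"              # original instruction file
DST = "minigemini_instruction_coco_only.json"    # filtered output

with open(SRC) as f:
    entries = json.load(f)

kept = []
for e in entries:
    img = e.get("image", "")
    # keep only COCO train2017 entries whose image file is really present
    if img.startswith("coco/train2017") and os.path.exists(os.path.join(DATA_ROOT, img)):
        kept.append(e)

print(f"kept {len(kept)} / {len(entries)} entries")
with open(DST, "w") as f:
    json.dump(kept, f)
```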

@HongLouyemeng

> After I changed the version to gemma it worked. Your error is probably caused by image paths in your json file that don't exist; I filtered their instruction file to keep only the entries under coco/train2017, and it no longer errors. [...]

Got it, thanks. QAQ
