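Hello. I am fine-tuning qwen with the original code on two A100 80G GPUs. GPU memory usage is only 919M on each card, but during what appears to be the data-loading stage, host memory usage keeps growing until it passes 180 GB, at which point memory runs out and the program terminates. How can I fix this?
Training log:
Memory usage: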
Which size of qwen are you using?
qwen-vl, 7b
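Could you try adjusting bsz? Its vocabulary has around 100k entries, so the final activation is very large. Try bsz=1 first and see whether it runs. As I recall, an 80G card can handle per_device_batch_size=4; you can then adjust gradient_accumulation_step to keep the same global_batch_size.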
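To make the arithmetic behind this suggestion concrete, here is a rough sketch in Python. The function names are hypothetical, and the target global batch size of 32, the sequence length of 2048, and the 4 bytes per logit are all assumptions chosen for illustration; only the roughly 100k vocabulary, the two GPUs, and the per-device batch sizes of 1 and 4 come from the thread.

```python
def grad_accum_steps(global_batch_size: int,
                     per_device_batch_size: int,
                     num_devices: int) -> int:
    """Accumulation steps needed so that
    per_device_batch_size * num_devices * steps == global_batch_size."""
    samples_per_step = per_device_batch_size * num_devices
    if global_batch_size % samples_per_step != 0:
        raise ValueError(
            "global_batch_size must be a multiple of "
            "per_device_batch_size * num_devices"
        )
    return global_batch_size // samples_per_step


def logits_bytes(batch_size: int,
                 seq_len: int,
                 vocab_size: int,
                 bytes_per_element: int = 4) -> int:
    """Size of one (batch, seq_len, vocab) logits tensor in bytes.
    Gradients and softmax intermediates add more on top of this."""
    return batch_size * seq_len * vocab_size * bytes_per_element


if __name__ == "__main__":
    # Assumed target: global batch size 32 on the 2 GPUs from the thread.
    for bsz in (1, 4):
        steps = grad_accum_steps(32, bsz, num_devices=2)
        # Assumed sequence length 2048; vocabulary ~100k as stated above.
        size_gib = logits_bytes(bsz, 2048, 100_000) / 2**30
        print(f"bsz={bsz}: accumulation steps={steps}, "
              f"logits tensor ~{size_gib:.2f} GiB")
```

This only estimates the final-layer activation on the GPU. It says nothing about the host memory growth described in the original report, which would need to be checked separately.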