Added DeepSeek V3 support. #688
Conversation
I will test it immediately after dinner 😊 Thank you!
I still get this error when I try it on https://huggingface.co/opensourcerelease/DeepSeek-V3-bf16
My script:
Seems like the error is from Transformers, not AutoAWQ. 🤔 Can you try updating Transformers to a development version by installing from source? When I check the errored line, it's in Transformers' code. Also, in your script:
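Installing from source usually means pulling the development branch straight from GitHub; a minimal sketch, assuming `pip` and `git` are available:

```shell
# Install the development (main-branch) version of Transformers from source.
pip install --upgrade "git+https://github.com/huggingface/transformers.git"
```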
I had to make this change to modeling.py: https://huggingface.co/deepseek-ai/DeepSeek-V3/discussions/23/files
It seems to be working. 22 hours remaining. I'll update here when I get it published and tested.
Are you sure all the uninitialized weights in the logs above are OK?
No, I'm not sure. What do you think?
I have an extra node; if you want, we could do a call and look at it together.
Can you try loading it in just Transformers with CPU off-load and run a few tokens, to see if it's actually a working model? Also check whether you get the same uninitialized-weights warning in plain Transformers. I can't do a voice call, but you can add my Discord @v2ray if you want, so we can text to get updates quicker.
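The plain-Transformers sanity check above can be sketched as follows. `device_map="auto"` and `offload_folder` are standard Transformers big-model loading options; the model path is the one from this thread, but the off-load directory is an assumption you should point at a disk with enough space.

```python
# Hedged sketch: load the checkpoint with Transformers alone, off-loading
# layers that don't fit on GPU to CPU/disk, then generate a few tokens to
# check the model is coherent.
def offload_kwargs(offload_dir="offload"):
    """Keyword arguments for CPU/disk off-loaded loading (sketch)."""
    return {
        "device_map": "auto",           # spread layers over GPU, then CPU, then disk
        "offload_folder": offload_dir,  # spill-over location for weights
        "torch_dtype": "auto",
        "trust_remote_code": True,      # DeepSeek-V3 ships custom modeling code
    }

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    path = "opensourcerelease/DeepSeek-V3-bf16"
    tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(path, **offload_kwargs())
    inputs = tokenizer("Hello", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=8)
    print(tokenizer.decode(out[0]))
```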
My machine crashed today, so I started over. I'll try what you suggested.
Tested on vLLM: success, coherent output. 5 TPS with a single query, 80-100 TPS with 100 simultaneous queries.
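For reference, serving the AWQ checkpoint with vLLM's offline API looks roughly like this. `quantization="awq"` and the `LLM`/`SamplingParams` interface are vLLM's documented API; the tensor-parallel size and model path here are assumptions to adapt to your node.

```python
# Hedged sketch: load an AWQ-quantized checkpoint in vLLM and generate once.
def engine_args(model_path, tp_size=8):
    """Arguments for vLLM's LLM engine when loading an AWQ checkpoint (sketch)."""
    return {
        "model": model_path,
        "quantization": "awq",       # load AWQ-quantized weights
        "trust_remote_code": True,   # DeepSeek-V3 custom modeling code
        "tensor_parallel_size": tp_size,
    }

if __name__ == "__main__":
    from vllm import LLM, SamplingParams

    llm = LLM(**engine_args("v2ray/DeepSeek-V3-1B-Test-AWQ", tp_size=1))
    outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
    print(outputs[0].outputs[0].text)
```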
@casper-hansen can this be merged? |
#686
I only tested using randomly initialized weights on a 1B version of the model, so this needs further testing on the big 671B model.
Also, due to the group-size limitation in the GEMM CUDA kernel, the group size can only be set to <= 64, or grouping must be disabled entirely.
The testing models are at https://huggingface.co/v2ray/DeepSeek-V3-1B-Test and https://huggingface.co/v2ray/DeepSeek-V3-1B-Test-AWQ.
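A minimal sketch of a quant config that respects the GEMM-kernel limit above (group size <= 64). The `quantize`/`save_quantized` calls follow AutoAWQ's usual API, but the 4-bit choice is an assumption, and treating `-1` as "no grouping" is an assumed convention:

```python
# Hedged sketch: build an AWQ quant config honoring the GEMM group-size limit,
# then quantize one of the 1B test checkpoints from this PR.
def make_quant_config(group_size=64, w_bit=4):
    """Return an AWQ quant config; reject group sizes the kernel can't handle."""
    if group_size != -1 and group_size > 64:
        raise ValueError("GEMM kernel for this model supports q_group_size <= 64")
    return {"zero_point": True, "q_group_size": group_size,
            "w_bit": w_bit, "version": "GEMM"}

if __name__ == "__main__":
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model_path = "v2ray/DeepSeek-V3-1B-Test"
    model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model.quantize(tokenizer, quant_config=make_quant_config())
    model.save_quantized(model_path + "-AWQ")
```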
If anyone can test on the big 671B model, thank you so much!!!!!🥺
@casper-hansen @ehartford