model: Add granite GPTQ model #95
Signed-off-by: Will Johnson <[email protected]>
We should enable the last bench and update the benchmarks: https://github.com/foundation-model-stack/fms-acceleration/blob/main/scripts/benchmarks/scenarios-granite.yaml#L97
Update: I ran a small bench on the internal checkpoint that @willmj provided me. The numbers looked OK, though the throughput was about 300 tokens slower than the previous bench on PowerLM3B.
Update: decided not to update the bench, as the GPTQ checkpoint is an internal checkpoint, as confirmed by @tharapalanivel.
raw_summary.csv
But we can't commit this as a new official bench because the checkpoint is not readily available, unless @willmj can provide the commands used to generate this GPTQ checkpoint.
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
LGTM, but there is an outstanding question if we should have this as a bench
Update: also interestingly, for
Update: OK, the reason is that the loss function was not refactored for the granite models in
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
The GPTQ checkpoint was produced by @tharapalanivel, requested and tracked in this issue. The model was quantized using AutoGPTQ; for more info, check out the documentation.
@willmj @tharapalanivel merging this PR. Decided not to commit the benches, as this is an internal checkpoint. But note that it clocks in slower than the BNB version.
While trying to train a quantized version of PowerLM GPTQ, I encountered the following error:
I looked at the layers of the model I was trying to train (located at /fmaas-integration-tests/models/powerlm-3b-r240924a-gptq/):
I added these layers in plugins/accelerated-peft/src/fms_acceleration_peft/gptqmodel/models/granite.py in a sleep pod and tried running tuning again: