I'd like to experiment with QAT. I see `tune ls` shows there is a QAT recipe available for the Llama3 model, but only a distributed one and only for full fine-tuning. Any chance of adding recipes for Llama 3.1 or 3.2 on a single GPU?

Replies: 1 comment

-
Making QAT work with other models should be pretty straightforward: you just swap out the model and tokenizer parts of the config to match the config options for those other models. We may offer a single-device recipe for QAT in the future, but for now we haven't been prioritizing it. You can always run a distributed recipe on a single device, though, by setting `--nproc_per_node 1`.
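For illustration, a single-GPU QAT run with the model and tokenizer swapped to another Llama variant might look like the sketch below. The recipe name (`qat_distributed`), config name (`llama3/8B_qat_full`), and paths are assumptions about what your torchtune version ships; verify the exact names with `tune ls` before running.

```sh
# Sketch: run the distributed QAT recipe on one GPU, swapping the model and
# tokenizer components via command-line overrides. Recipe name, config name,
# component paths, and checkpoint paths are illustrative; check `tune ls`.
tune run --nproc_per_node 1 qat_distributed --config llama3/8B_qat_full \
  model._component_=torchtune.models.llama3_1.llama3_1_8b \
  tokenizer._component_=torchtune.models.llama3.llama3_tokenizer \
  tokenizer.path=/tmp/Llama-3.1-8B/original/tokenizer.model \
  checkpointer.checkpoint_dir=/tmp/Llama-3.1-8B/
```

Equivalently, you can copy the config with `tune cp` and edit the `model:` and `tokenizer:` sections directly instead of overriding them on the command line.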