Availability of AMD specific quantization tool for Llama2? #103

AshimaBisla · 2024-07-08T08:11:10Z

Hello,
When quantizing the Llama model, we first download the weights from Meta, convert them with the Hugging Face converter, and then apply Hugging Face-compatible AWQ quantization.

Is there an AMD-specific quantization tool that removes the dependency on Hugging Face?

Thanks,
Ashima

uday610 · 2024-07-10T17:01:00Z

Are you trying to convert a PyTorch model to an ONNX model? If so, then yes, today we use the Hugging Face converter. I will check whether other options are possible in the future and post an update.
