diff --git a/docs/source/integrations.mdx b/docs/source/integrations.mdx
index a131ad105..55a685779 100644
--- a/docs/source/integrations.mdx
+++ b/docs/source/integrations.mdx
@@ -21,7 +21,7 @@ Few references:
 
 # Blog posts
 
 - [Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes)
-
+- [A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using Hugging Face Transformers, Accelerate and bitsandbytes](https://huggingface.co/blog/hf-bitsandbytes-integration)
 
 ### For instructions how to use LLM.int8() inference layers in your own code, see the TL;DR above or for extended instruction see [this blog post](https://huggingface.co/blog/hf-bitsandbytes-integration).
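
The patched section points readers to the LLM.int8() integration in Hugging Face Transformers. As a quick illustration of what that looks like in practice, here is a minimal sketch of loading a model with 8-bit inference layers through that integration; it assumes `transformers`, `accelerate`, and `bitsandbytes` are installed, and the model id and prompt are illustrative, not taken from the patch:

```python
# Minimal sketch: loading a causal LM with LLM.int8() inference layers via the
# Hugging Face Transformers integration. The checkpoint below is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-350m"  # any decoder-style checkpoint works here

# load_in_8bit=True replaces nn.Linear layers with bitsandbytes 8-bit layers
quantization_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quantization_config,
    device_map="auto",  # dispatch weights across devices with Accelerate
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The linked blog posts cover the details of the quantization scheme itself; this sketch only shows the entry point the doc refers to.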