From 1f36bd4cf24d221e61cf2609b7c6170e955222bf Mon Sep 17 00:00:00 2001
From: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
Date: Mon, 26 Feb 2024 16:12:46 +0100
Subject: [PATCH 1/2] docs: fix link text

---
 docs/source/integrations.mdx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/integrations.mdx b/docs/source/integrations.mdx
index 0e37765c5..bcba6e5e5 100644
--- a/docs/source/integrations.mdx
+++ b/docs/source/integrations.mdx
@@ -2,7 +2,7 @@

 With Transformers it's very easy to load any model in 4 or 8-bit, quantizing them on the fly with bitsandbytes primitives.

-Please review the [bitsandbytes section in the Accelerate docs](https://huggingface.co/docs/transformers/v4.37.2/en/quantization#bitsandbytes).
+Please review the [bitsandbytes section in the Transformers docs](https://huggingface.co/docs/transformers/v4.37.2/en/quantization#bitsandbytes).

 Details about the BitsAndBytesConfig can be found [here](https://huggingface.co/docs/transformers/v4.37.2/en/main_classes/quantization#transformers.BitsAndBytesConfig).

@@ -21,7 +21,7 @@ quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dty

 # PEFT
 With `PEFT`, you can use QLoRA out of the box with `LoraConfig` and a 4-bit base model.
-Please review the [bitsandbytes section in the Accelerate docs](https://huggingface.co/docs/peft/developer_guides/quantization#quantize-a-model).
+Please review the [bitsandbytes section in the PEFT docs](https://huggingface.co/docs/peft/developer_guides/quantization#quantize-a-model).


 # Accelerate

From a03df4325dfa8e25f9780d1b854870d85a972898 Mon Sep 17 00:00:00 2001
From: Sebastian Raschka
Date: Mon, 26 Feb 2024 13:42:23 -0600
Subject: [PATCH 2/2] Lit-GPT integration docs (#1089)

* lit-gpt integration

* mention PT lightning
---
 docs/source/integrations.mdx | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/docs/source/integrations.mdx b/docs/source/integrations.mdx
index bcba6e5e5..67d50d6a0 100644
--- a/docs/source/integrations.mdx
+++ b/docs/source/integrations.mdx
@@ -29,6 +29,25 @@ Bitsandbytes is also easily usable from within Accelerate.

 Please review the [bitsandbytes section in the Accelerate docs](https://huggingface.co/docs/accelerate/en/usage_guides/quantization).

+
+
+# PyTorch Lightning and Lightning Fabric
+
+Bitsandbytes is available from within both
+- [PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/), a deep learning framework for professional AI researchers and machine learning engineers who need maximal flexibility without sacrificing performance at scale;
+- and [Lightning Fabric](https://lightning.ai/docs/fabric/stable/), a fast and lightweight way to scale PyTorch models without boilerplate.
+
+Please review the [bitsandbytes section in the PyTorch Lightning docs](https://lightning.ai/docs/pytorch/stable/common/precision_intermediate.html#quantization-via-bitsandbytes).
+
+
+# Lit-GPT
+
+Bitsandbytes is integrated into [Lit-GPT](https://github.com/Lightning-AI/lit-gpt), a hackable implementation of state-of-the-art open-source large language models based on Lightning Fabric, where it can be used for quantization during training, finetuning, and inference.
+
+Please review the [bitsandbytes section in the Lit-GPT quantization docs](https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md).
+
+
+
 # Trainer for the optimizers

 You can use any of the 8-bit and/or paged optimizers by simply passing them to the `transformers.Trainer` class on initialization. All bnb optimizers are supported by passing the correct string in `TrainingArguments`'s `optim` attribute - e.g. (`paged_adamw_32bit`).
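
For context on the usage these docs changes point to, here is a minimal, self-contained sketch (not part of either patch) of loading a model in 4-bit with a `BitsAndBytesConfig` and selecting a bitsandbytes paged optimizer through `TrainingArguments`. The checkpoint name, compute dtype, and output directory are illustrative assumptions, not values taken from the patches.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)

# 4-bit quantization config, as described in the Transformers quantization docs
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype chosen for illustration
)

# Any causal LM checkpoint works here; "facebook/opt-350m" is only an example
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quantization_config,  # weights are quantized on the fly
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

# Select a bitsandbytes paged optimizer via the `optim` string, then pass
# these arguments (plus datasets, etc.) to `transformers.Trainer`
training_args = TrainingArguments(
    output_dir="outputs",
    optim="paged_adamw_32bit",
)
```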