(docs) integrations: fix omission in bf16 related warning (#1183)
* (docs) integrations: fix omission in bf16 related warning

* (docs) integrations: further clarifications to prior fix

* (docs) integrations: fix punctuation

Co-authored-by: Steven Liu <[email protected]>

* (docs) integrations: fix omitted code formatting

---------

Co-authored-by: Steven Liu <[email protected]>
Titus-von-Koeller and stevhliu authored Apr 17, 2024
1 parent 6cecb65 commit ffd7d0d
Showing 1 changed file with 1 addition and 1 deletion.
docs/source/integrations.mdx (1 addition, 1 deletion)
@@ -12,7 +12,7 @@ With Transformers, it's very easy to load any model in 4 or 8-bit and quantize t
 For example, to load and quantize a model to 4-bits and use the bfloat16 data type for compute:
 
 > [!WARNING]
-> bfloat16 is the optimal compute data type if your hardware supports it. The default is float32 for backward compatibility and numerical stability, but it can often lead to numerical instabilities. bfloat16 provides the best of both worlds, numerical stability equivalent to float32, but combined with the memory footprint and significant computation speedup of a 16-bit data type. Make sure to check if your hardware supports bfloat16 and if it does, configure it using the `bnb_4bit_compute_dtype` parameter in [`~transformers.BitsAndBytesConfig`]!
+> bfloat16 is the ideal `compute_dtype` if your hardware supports it. While the default `compute_dtype`, float32, ensures backward compatibility (due to wide-ranging hardware support) and numerical stability, it is large and slows down computations. In contrast, float16 is smaller and faster but can lead to numerical instabilities. bfloat16 combines the best aspects of both; it offers the numerical stability of float32 and the reduced memory footprint and speed of a 16-bit data type. Check if your hardware supports bfloat16 and configure it using the `bnb_4bit_compute_dtype` parameter in [`~transformers.BitsAndBytesConfig`]!
 ```py
 from transformers import AutoModelForCausalLM, BitsAndBytesConfig
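The diff context cuts the example off right after the import. As a rough sketch of where that snippet is heading (the checkpoint name below is a placeholder, not taken from the commit), the warning's advice translates to something like:

```py
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Optional sanity check: bfloat16 needs hardware support (assumes a CUDA device).
if not torch.cuda.is_bf16_supported():
    raise RuntimeError("This GPU lacks bfloat16 support; fall back to float16 or float32.")

# Quantize weights to 4-bit and run compute in bfloat16, as the warning recommends.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# "facebook/opt-350m" is a placeholder checkpoint; substitute any causal LM.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quantization_config,
)
```

On hardware without bfloat16 support, `bnb_4bit_compute_dtype=torch.float16` is the usual fallback, at the cost of the numerical-stability caveats the warning describes.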
