
Commit

run pre-commit hooks
Titus-von-Koeller committed Feb 3, 2024
1 parent 543a7b1 commit 2d73f4d
Showing 3 changed files with 6 additions and 6 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -1,6 +1,6 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
- rev: v0.1.15
+ rev: v0.2.0
hooks:
- id: ruff
args:
8 changes: 4 additions & 4 deletions bitsandbytes/nn/modules.py
@@ -226,11 +226,11 @@ def to(self, *args, **kwargs):

class Linear4bit(nn.Linear):
"""
- This class is the base module for the 4-bit quantization algorithm presented in [QLoRA](https://arxiv.org/abs/2305.14314).
+ This class is the base module for the 4-bit quantization algorithm presented in [QLoRA](https://arxiv.org/abs/2305.14314).
QLoRA 4-bit linear layers uses blockwise k-bit quantization under the hood, with the possibility of selecting various
compute datatypes such as FP4 and NF4.
- In order to quantize a linear layer one should first load the original fp16 / bf16 weights into
+ In order to quantize a linear layer one should first load the original fp16 / bf16 weights into
the Linear8bitLt module, then call `quantized_module.to("cuda")` to quantize the fp16 / bf16 weights.
Example:
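(Not part of this commit's diff.) For context, a minimal sketch of the workflow this docstring describes: load fp16/bf16 weights into the 4-bit module, then move it to the GPU to quantize. The layer sizes and the NF4 quant type below are illustrative assumptions, not taken from the repository's own example.

```python
import torch
import torch.nn as nn

import bitsandbytes as bnb

# An ordinary fp16 model to start from (layer sizes are illustrative).
fp16_model = nn.Sequential(
    nn.Linear(64, 64),
    nn.Linear(64, 64),
).to(torch.float16)

# Mirror the architecture with 4-bit layers and load the original weights.
quantized_model = nn.Sequential(
    bnb.nn.Linear4bit(64, 64, quant_type="nf4"),  # NF4 is an assumption here
    bnb.nn.Linear4bit(64, 64, quant_type="nf4"),
)
quantized_model.load_state_dict(fp16_model.state_dict())

# Moving the module to the GPU is what triggers the 4-bit quantization.
quantized_model = quantized_model.to("cuda")
```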
@@ -442,10 +442,10 @@ def maybe_rearrange_weight(state_dict, prefix, local_metadata, strict, missing_k

class Linear8bitLt(nn.Linear):
"""
- This class is the base module for the [LLM.int8()](https://arxiv.org/abs/2208.07339) algorithm.
+ This class is the base module for the [LLM.int8()](https://arxiv.org/abs/2208.07339) algorithm.
To read more about it, have a look at the paper.
- In order to quantize a linear layer one should first load the original fp16 / bf16 weights into
+ In order to quantize a linear layer one should first load the original fp16 / bf16 weights into
the Linear8bitLt module, then call `int8_module.to("cuda")` to quantize the fp16 weights.
Example:
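(Not part of this commit's diff.) A minimal sketch of the analogous int8 workflow, under the same assumptions as above; `has_fp16_weights=False` is an assumed setting for inference-style quantization rather than something shown in this diff.

```python
import torch
import torch.nn as nn

import bitsandbytes as bnb

# An ordinary fp16 model to start from (layer sizes are illustrative).
fp16_model = nn.Sequential(
    nn.Linear(64, 64),
    nn.Linear(64, 64),
).to(torch.float16)

# Mirror the architecture with int8 layers; has_fp16_weights=False keeps
# quantized weights for inference instead of mixed int8/fp16 training.
int8_model = nn.Sequential(
    bnb.nn.Linear8bitLt(64, 64, has_fp16_weights=False),
    bnb.nn.Linear8bitLt(64, 64, has_fp16_weights=False),
)
int8_model.load_state_dict(fp16_model.state_dict())

# Calling .to("cuda") performs the actual int8 quantization of the weights.
int8_model = int8_model.to("cuda")
```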
2 changes: 1 addition & 1 deletion docs/source/quantization.mdx
@@ -14,4 +14,4 @@ Below you will find the docstring of the quantization primitives exposed in bits

## StableEmbedding

- [[autodoc]] bitsandbytes.nn.StableEmbedding
+ [[autodoc]] bitsandbytes.nn.StableEmbedding
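(Not part of this commit's diff.) For context, a minimal, illustrative sketch of how the documented `StableEmbedding` class might be used as a drop-in replacement for `torch.nn.Embedding`; the vocabulary size and embedding dimension are made-up values.

```python
import torch

import bitsandbytes as bnb

# StableEmbedding is used like torch.nn.Embedding (sizes are illustrative).
emb = bnb.nn.StableEmbedding(num_embeddings=10_000, embedding_dim=128)

token_ids = torch.randint(0, 10_000, (2, 16))  # batch of 2 sequences, length 16
hidden = emb(token_ids)                        # -> shape (2, 16, 128)
```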

