[Misc] Improve BNB loader to handle mixture of sharded and merged weights with same suffix (#11566)

Signed-off-by: Isotr0py <[email protected]>
Isotr0py authored Dec 27, 2024
1 parent 0240402 commit dde1fa1
Showing 1 changed file with 5 additions and 2 deletions.
7 changes: 5 additions & 2 deletions vllm/model_executor/model_loader/loader.py
@@ -1001,8 +1001,11 @@ def _get_bnb_target_modules(self, model: nn.Module) -> None:
                     for sub_name in sub_modules:
                         self.target_modules.append(
                             name.replace(last_name, sub_name))
-                else:
-                    self.target_modules.append(name)
+                # Add original module name even if the module has stacked map,
+                # in case model has a mixture of disk-merged and disk-splitted
+                # weights with same last name.
+                self.target_modules.append(name)
+
         assert (self.target_modules
                 ), "vllm currently does not support BNB quantization for"
         f" {type(model).__name__}"
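The effect of the change can be sketched outside vLLM as follows. This is a minimal illustration, not the loader's code: `collect_target_modules`, `PACKED_MAPPING`, and the module names are hypothetical, and `last_name` is assumed to be the final dotted component of `name`.

```python
# Hypothetical stand-in for the loader's stacked/packed mapping:
# merged module name -> names of the split modules it may be stored as on disk.
PACKED_MAPPING = {"qkv_proj": ["q_proj", "k_proj", "v_proj"]}


def collect_target_modules(module_names, packed_mapping):
    """Mimic the post-commit behaviour of the target-module collection loop."""
    target_modules = []
    for name in module_names:
        # Assumption: the last dotted component identifies the module type.
        last_name = name.split(".")[-1]
        # Add the split (disk-sharded) names when a mapping exists.
        for sub_name in packed_mapping.get(last_name, []):
            target_modules.append(name.replace(last_name, sub_name))
        # Always add the original name as well, so a weight stored
        # already merged on disk under the same last name is still matched.
        target_modules.append(name)
    return target_modules


# Made-up module names: two modules share the last name "qkv_proj".
names = [
    "layers.0.attn.qkv_proj",
    "vision.attn.qkv_proj",
    "layers.0.mlp.down_proj",
]
targets = collect_target_modules(names, PACKED_MAPPING)
```

With the previous `else` branch, the original name of any mapped module (here `vision.attn.qkv_proj`) would be missing from the list, so a checkpoint that stores that weight already merged would not be matched.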
