feat: add liger kernel with fused cross entropy loss (#93)
* initial implementation of fused-linear-loss on llama
* syntax fixes and remove unused code
* add new num_logits_to_keep arg for llama.forward()
* add mixtral model patch
* add mistral and granite model patch
* add benchmark
* add new liger benchmarks
* some fixes
* revise benches
* refactor to fused_ops
* fix fmt + lint
* update full benches and readme
* fix fast foak configs
* docs: update foak readme benchmarks

Signed-off-by: 1000850000 user <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
Co-authored-by: 1000850000 user <[email protected]>
Co-authored-by: Yu Chin Fabian Lim <[email protected]>
1 parent c70ffe0 · commit 733992a · 25 changed files with 1,326 additions and 18 deletions.
25 additions & 0 deletions — plugins/fused-ops-and-kernels/configs/fast_kernels_liger.yaml
```yaml
training:

  fused_ops_and_kernels:

    # if under training stanza, then putting
    # base_layer and fused_lora will be a misnomer
    # - this should be in peft.quantized
    # However, if it is specified, it will still
    # be read. This is useful in use cases where
    # the yaml is system generated and not shown
    # to a user.

    # activate various unsloth optimizations
    # there are two versions of the plugin
    # - the FastKernel version supports individual kernels
    # - the FastQuantized version is all-or-nothing

    # fast loss triton kernels
    fast_loss: fused_ce_liger

    # fast rms norm triton kernels
    fast_rms_layernorm: True

    # fast RoPE embedding triton kernels
    fast_rope_embeddings: True
```
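As a rough sketch of how a consumer might read these flags (this is an illustrative, hypothetical loader using PyYAML, not the plugin's actual configuration code):

```python
import yaml

# Inline copy of the fast_kernels_liger.yaml content above, so the
# example is self-contained; a real framework would read the file from disk.
CONFIG_TEXT = """
training:
  fused_ops_and_kernels:
    fast_loss: fused_ce_liger
    fast_rms_layernorm: true
    fast_rope_embeddings: true
"""

config = yaml.safe_load(CONFIG_TEXT)
kernels = config["training"]["fused_ops_and_kernels"]

# The string value "fused_ce_liger" selects the Liger fused
# cross-entropy kernel; a plain boolean would instead toggle
# the plugin's default fast-loss implementation.
use_liger_ce = kernels["fast_loss"] == "fused_ce_liger"
```

Note that `fast_loss` is the one key here carrying a string rather than a boolean, which is why a loader has to compare it against the kernel name instead of truth-testing it.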
30 additions & 0 deletions — plugins/fused-ops-and-kernels/configs/fast_quantized_peft_liger.yaml
```yaml
# PEFT-related acceleration
peft:

  # quantization-related acceleration
  # e.g., kernels for quantized base weights
  quantization:

    fused_ops_and_kernels:

      # load unsloth optimizations for these 4bit base layer weights.
      # currently only support "auto_gptq" and "bitsandbytes"
      base_layer: auto_gptq

      # activate various unsloth optimizations
      # there are two versions of the plugin
      # - the FastKernel version supports individual kernels
      # - the FastQuantized version is all-or-nothing

      # fused kernels for lora linear layers
      fused_lora: True

      # fast loss triton kernels
      fast_loss: fused_ce_liger

      # fast rms norm triton kernels
      fast_rms_layernorm: True

      # fast RoPE embedding triton kernels
      fast_rope_embeddings: True
```
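The commit message lists per-model patches (llama, mixtral, mistral, granite). A minimal sketch of how such per-model patching could be organized as a registry — all names here are hypothetical illustrations, not the plugin's real API:

```python
# Registry mapping a model type to the function that patches it.
PATCHES = {}

def register_patch(model_type):
    """Decorator that files a patch function under a model type."""
    def wrap(fn):
        PATCHES[model_type] = fn
        return fn
    return wrap

@register_patch("llama")
def patch_llama(model):
    # Swap the model's loss path for the fused Liger cross-entropy
    # kernel (represented here by a plain dict update for illustration).
    model["loss"] = "fused_ce_liger"
    return model

# Applying the registered patch to a toy "model" record:
model = PATCHES["llama"]({"type": "llama", "loss": "default"})
```

The registry pattern keeps each architecture's patch isolated, so adding mixtral or granite support is a matter of registering one more function rather than editing a central dispatch.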
27 additions & 0 deletions — plugins/fused-ops-and-kernels/src/fms_acceleration_foak/fused_ops/liger_ce/__init__.py
```python
# Copyright 2024 Byron Hsu & Linkedin team. All rights reserved.
#
# BSD 2-CLAUSE LICENSE
# Copyright 2024 LinkedIn Corporation
# All Rights Reserved.
# Redistribution and use in source and binary forms, with or
# without modification, are permitted provided that the following
# conditions are met:
# 1. Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# 2. Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following
# disclaimer in the documentation and/or other materials provided
# with the distribution.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

from .fused_linear_cross_entropy_loss import lce_forward
```
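The exported `lce_forward` wraps Liger's fused linear + cross-entropy Triton kernel. As a rough conceptual illustration only (not the actual implementation, which runs on GPU and also fuses the backward pass), here is a numpy sketch of the core idea: multiply hidden states by the LM head and reduce to the loss chunk by chunk, so the full (tokens x vocab) logits matrix is never materialized at once:

```python
import numpy as np

def fused_linear_cross_entropy(hidden, weight, labels, chunk=2):
    """Mean cross-entropy over tokens, computed chunk-wise.

    hidden: (n_tokens, hidden_dim), weight: (vocab, hidden_dim),
    labels: (n_tokens,) int class indices. Only a (chunk, vocab)
    slice of logits exists at any time.
    """
    n = hidden.shape[0]
    total = 0.0
    for start in range(0, n, chunk):
        h = hidden[start:start + chunk]          # (chunk, hidden_dim)
        logits = h @ weight.T                    # (chunk, vocab)
        # numerically stable log-softmax via max subtraction
        logits = logits - logits.max(axis=1, keepdims=True)
        logsumexp = np.log(np.exp(logits).sum(axis=1))
        rows = np.arange(h.shape[0])
        target = labels[start:start + chunk]
        total += (logsumexp - logits[rows, target]).sum()
    return total / n

# Toy shapes for illustration: 4 tokens, hidden_dim=8, vocab=16.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((4, 8))
weight = rng.standard_normal((16, 8))
labels = rng.integers(0, 16, size=4)
loss = fused_linear_cross_entropy(hidden, weight, labels)
```

For real vocabularies (tens of thousands of entries) and long sequences, skipping the full logits tensor is what makes this fusion save memory; the chunked loop above yields exactly the same value as the unfused computation.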