- **LoRALib Approach**: This approach calculates the computations $xW_0^T$ and $x(BA)^T$ separately, followed by their summation. It is particularly suitable for linear layers and offers accurate computation of LoRA-enhanced layers.
- **LoRATorch Approach**: In this approach, the pre-trained weight $W_0$ is merged with its LoRA weight $BA$, resulting in the combined weight matrix $(W_0 + \frac{\alpha}{r} BA)$. This allows for the straightforward extension of LoRA to more complex and non-linear layers within the PyTorch ecosystem.
- **LoRALib Approach**:
  The computation is defined as:

  $$h = xW_0^T + \frac{\alpha}{r} x(BA)^T$$

  where:
  - $x$ is the input matrix of dimensions $k \times n$,
  - $W_0$ is a pre-trained weight matrix of dimensions $m \times n$,
  - $r$ is a predefined LoRA rank,
  - $B$ and $A$ are LoRA matrices of dimensions $m \times r$ and $r \times n$ respectively,
  - $\alpha$ is a hyper-parameter.
- **LoRATorch Approach**:
  The computation is defined as:

  $$h = x(W_0 + \frac{\alpha}{r} BA)^T$$

  where:
  - $x$ is the input matrix of dimensions $k \times n$,
  - $W_0$ is a pre-trained weight matrix of dimensions $m \times n$,
  - $r$ is a predefined LoRA rank,
  - $B$ and $A$ are LoRA matrices of dimensions $m \times r$ and $r \times n$ respectively,
  - $\alpha$ is a hyper-parameter.
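To make the difference concrete, here is a minimal numerical sketch of both formulations in plain PyTorch. The dimensions, `alpha`, and `r` below are arbitrary illustrative values, and the random tensors stand in for a real layer's weights; this is not the library's internal code.

```python
import torch

# Illustrative dimensions and hyper-parameters (assumptions, not library defaults)
k, n, m, r, alpha = 8, 32, 16, 4, 1.0

x  = torch.randn(k, n)   # input, k x n
W0 = torch.randn(m, n)   # frozen pre-trained weight, m x n
B  = torch.randn(m, r)   # LoRA matrix, m x r
A  = torch.randn(r, n)   # LoRA matrix, r x n

# LoRALib approach: compute the two terms separately, then sum
h_lib = x @ W0.T + (alpha / r) * (x @ (B @ A).T)        # k x m

# LoRATorch approach: merge the LoRA update into W0, then do a single matmul
h_torch = x @ (W0 + (alpha / r) * (B @ A)).T            # k x m

# Both formulations are mathematically equivalent (up to floating-point error)
assert torch.allclose(h_lib, h_torch, atol=1e-4)
```

The merged form is what lets the LoRATorch approach wrap layers whose forward pass is not a plain matrix multiplication, as noted above.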
- **AdapterLoRa Class**: The `AdapterLoRa` class provides a versatile interface for applying LoRA adaptation to neural networks. It supports both the `loralib` and `loratorch` approaches, offering the ability to reconstruct and implement LoRA-adapted models.
- **Adapting Layers**: The `add_layer_and_Instance_Layer` method lets you specify the layers you want to adapt using the `layertyep` and `layer` parameters, tailoring the LoRA application to specific layers in your model (see the sketch after this list).
- **Freezing Weights**: The `freeze_weights` method enables the option to freeze model weights, enhancing stability and allowing for safer adaptations.
- **Reconstructing and Implementing LoRA**: The `reconstruct_model` method applies LoRA adaptation to the model, while the `implement_lora` method further implements LoRA and manages trainable parameters.
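For illustration only, a call to `add_layer_and_Instance_Layer` might look like the sketch below. The argument values (`"nn.Linear"` as the layer type and `"self_attn"` as the layer name) are assumptions based on the parameter descriptions above, not values verified against the library source.

```python
import torch.nn as nn
from core.Quantized import AdapterLoRa

model = nn.TransformerEncoderLayer(d_model=512, nhead=8)
adapter_model = AdapterLoRa(model, method="LoRa", Rank=4)

# Hypothetical call: 'layertyep' names the module type and 'layer' names the
# target sub-module; both values here are illustrative guesses.
adapter_model.add_layer_and_Instance_Layer(layertyep="nn.Linear", layer="self_attn")
```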
| Layer | loralib | loratorch | Example |
|---|---|---|---|
| `nn.Linear` | ✓ | ✓ | linear.ipynb |
| `nn.Embedding` | ✓ | ✓ | embedding.ipynb |
| `nn.Conv1d` | ✓ | ✓ | |
| `nn.Conv2d` | ✓ | ✓ | |
| `nn.Conv3d` | ✓ | ✓ | |
| `nn.MultiheadAttention` | ✘ | ✓ | |
| `MergedLinear` | ✓ (Error) | ✓ | mergedlinear.ipynb |
| | hard to extend | easy to extend | |
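As a quick reference for how the layers in the table are typically adapted, here is a minimal `loralib` sketch replacing a plain `nn.Linear` with its LoRA counterpart. The rank and alpha values are arbitrary illustrative choices, not recommendations.

```python
import torch
import loralib as lora

# LoRA-adapted drop-in replacement for nn.Linear(128, 64);
# r and lora_alpha are illustrative values.
layer = lora.Linear(128, 64, r=4, lora_alpha=1)

# Freeze everything except the LoRA matrices
lora.mark_only_lora_as_trainable(layer)

x = torch.randn(2, 128)
out = layer(x)   # shape: (2, 64)
```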
Usage of AdapterLoRa

- Install `AdapterLoRa` (the LoRA-Torch dependency is installed from GitHub):
pip install git+https://github.com/Baijiong-Lin/LoRA-Torch
pip install AdapterLoRa
import torch.nn as nn
import torch
from core.Quantized import AdapterLoRa
model = nn.TransformerEncoderLayer(d_model=512, nhead=8)

adapter_model = AdapterLoRa(model, method="LoRa", Rank=4)

"""
Add the Linear layers built into the self-attention block.
Replace the layers where you would like to use AdapterLoRa by using the add_layer function.
"""
adapter_model.add_layer("self_attn")
adapter_model.add_layer("linear1")
adapter_model.add_layer("linear2")

# Reconstruct the model with the quantized (LoRA-adapted) layers
adapter_model.reconstruct_model()

# Implement the LoRA method
model = adapter_model.implement_lora(verbose=True)
# Total trainable parameters before LoRA: 3176960
# Total trainable parameters after LoRA: 24576
# This sets requires_grad to False for all parameters without the string "lora_" in their names
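# Optional sanity check (standard PyTorch, not part of the AdapterLoRa API):
# count the trainable parameters to confirm that only the LoRA weights remain trainable.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters after LoRA: {trainable}")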
# Training loop: iterate over your dataloader as usual
# (a fuller, hedged sketch of one training step follows below)
model.train()
for batch in dataloader:
    ...
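For completeness, here is one way a training step could look. The `dataloader` yielding `(src, tgt)` pairs and the mean-squared-error loss are hypothetical placeholders for your own task; only the still-trainable (LoRA) parameters are passed to the optimizer, since everything else has been frozen.

```python
import torch

# Hypothetical task setup: 'dataloader' yields (src, tgt) pairs shaped for the
# encoder layer, and MSELoss is a stand-in for your task-specific loss.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad),  # only LoRA weights are trainable
    lr=1e-4,
)
loss_fn = torch.nn.MSELoss()

model.train()
for src, tgt in dataloader:
    optimizer.zero_grad()
    out = model(src)          # forward pass through the LoRA-adapted encoder layer
    loss = loss_fn(out, tgt)
    loss.backward()           # gradients flow only into the LoRA matrices
    optimizer.step()
```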
- Save the LoRA model (only the LoRA matrices will be saved).
import loralib as lora
# ===== Before =====
# torch.save(model.state_dict(), checkpoint_path)
# ===== After =====
torch.save(lora.lora_state_dict(model), checkpoint_path)
- Load LoRA model (need to load the pre-trained model first).
import loralib as lora
# Load the pre-trained checkpoint first
model.load_state_dict(torch.load('ckpt_pretrained.pt'), strict=False)
# Then load the LoRA checkpoint
model.load_state_dict(torch.load('ckpt_lora.pt'), strict=False)
For each of the above four pillars, we are sharing our codebase and insights to:

- Assist you in leveraging Transformer-based models for your machine learning needs and challenges
- Boost reproducibility efforts, which are becoming increasingly difficult with Transformers

I am providing ready-to-use tools for quantizing the model:

- Fine-tuning Transformer-based models on your proprietary dataset via PEFT methodologies such as LoRA and QLoRA
- Performing hyperparameter optimization to get the maximum performance out of these models

Go to the directory of the Transformer-based model that you are interested in and open its README.md. We have included details about the LLMs, followed by performance results on open-source datasets!
Supported methods for quantizing Transformer-based models:

- LoRA
- LoRATorch
- QLoRA
Our plan is to perform these experiments on all the Transformer-based models below. To that end, this is a tentative roadmap of the LLMs that we aim to cover:

- TransformerEncoder
- TransformerDecoder
- Vision Transformer
- minGPT
- OpenAI GPT-2
- Inflection Pi (in progress)
AdapterLoRa is developed and maintained by Youness ELbrag (Email | LinkedIn).