Demos, Examples, and Comparisons

Expanding LoRA to Convolution Layers

LoRA for the transformer only vs. LoRA for the whole model

Yog-Sothoth LoRA/LoCon: https://civitai.com/models/14878/loconlora-yog-sothoth-depersonalization

LoRA rank=1: (image: 00159); LoCon rank=1: (image: 00164)

XY grid: (image)

Some community experiments suggest that finetuning the whole model can learn "more". In some cases, though, this means it will overfit heavily.
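As a rough illustration of what a low-rank update on a convolution layer can look like, here is a minimal NumPy sketch. It assumes one common factorization (a k x k "down" convolution followed by a 1 x 1 "up" convolution); whether this matches the exact implementation used for the models above is an assumption, and the layer shape is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer shape, chosen only for illustration.
out_ch, in_ch, k, rank = 64, 32, 3, 4

# Down projection: a k x k convolution from in_ch to `rank` channels.
down = rng.standard_normal((rank, in_ch, k, k))
# Up projection: a 1 x 1 convolution from `rank` back to out_ch channels.
up = rng.standard_normal((out_ch, rank, 1, 1))

# Applying the two convolutions in sequence (no bias, no nonlinearity) is
# equivalent to one convolution whose kernel is this merged low-rank update.
delta_w = (up.reshape(out_ch, rank) @ down.reshape(rank, -1)).reshape(out_ch, in_ch, k, k)

full_params = out_ch * in_ch * k * k      # parameters in the full kernel
low_rank_params = down.size + up.size     # parameters actually trained
print(delta_w.shape, full_params, low_rank_params)
```

The merged update has the same shape as the original kernel, so it can be added to the base weights, while the number of trained parameters is much smaller.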


Hadamard product vs Conventional

We introduced LoRA finetuning with the Hadamard product representation from FedPara. Based on these experiments, LoHa at the same file size (and with dim>2, or rank>4) can give better results in some situations.
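For reference, the two parameterizations can be written as follows. This is a sketch in notation chosen here (the symbols $B$, $A$, $r$, $m$, $n$ are not from the original text): the conventional update is a single low-rank product, while the Hadamard form multiplies two low-rank products element-wise.

```latex
% Conventional low-rank update (LoRA / LoCon), with dim r:
\Delta W = B A, \qquad B \in \mathbb{R}^{m \times r},\; A \in \mathbb{R}^{r \times n},
\qquad \operatorname{rank}(\Delta W) \le r

% Hadamard product update (LoHa), with dim r for each factor pair:
\Delta W = (B_1 A_1) \odot (B_2 A_2), \qquad
B_1, B_2 \in \mathbb{R}^{m \times r},\; A_1, A_2 \in \mathbb{R}^{r \times n},
\qquad \operatorname{rank}(\Delta W) \le r^2
```

The rank bound follows from the general fact that $\operatorname{rank}(X \odot Y) \le \operatorname{rank}(X)\,\operatorname{rank}(Y)$.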

Image

Here are some comments from the experiments:

Why LoHa?

The following comparison should be quite convincing.

(Left four: LoHa with different numbers of training steps; right four: the same for LoCon; same seed and same training parameters for both runs.)


In addition to the five characters, I also trained a bunch of styles into the model (which is what I have always done, actually).

However, LoRA and LoCon do not combine styles with characters that well (the characters were trained only on anime images), and this ability is greatly improved in LoHa.

Note that the two files have almost the same size (around 30 MB). To achieve this, I set (linear_dim, conv_dim) to (16, 8) for LoCon and (8, 4) for LoHa. With the Hadamard product, however, the resulting matrix can have rank up to 8x8=64 for linear layers and 4x4=16 for convolutional layers.
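The size and rank arithmetic above can be checked numerically. The NumPy sketch below is only an illustration: the 320 x 320 layer shape and the random factors are hypothetical, and the helper names are made up for this example. It compares a conventional low-rank update at dim 16 with a Hadamard-product update at dim 8.

```python
import numpy as np

def conventional_delta(out_dim, in_dim, dim, rng):
    """Conventional update: delta_W = B @ A, rank at most `dim`."""
    B = rng.standard_normal((out_dim, dim))
    A = rng.standard_normal((dim, in_dim))
    return B @ A, B.size + A.size

def hadamard_delta(out_dim, in_dim, dim, rng):
    """Hadamard update: delta_W = (B1 @ A1) * (B2 @ A2), rank at most dim**2."""
    B1 = rng.standard_normal((out_dim, dim))
    A1 = rng.standard_normal((dim, in_dim))
    B2 = rng.standard_normal((out_dim, dim))
    A2 = rng.standard_normal((dim, in_dim))
    n_params = B1.size + A1.size + B2.size + A2.size
    return (B1 @ A1) * (B2 @ A2), n_params

rng = np.random.default_rng(0)
out_dim, in_dim = 320, 320  # hypothetical linear layer shape

w_conv, n_conv = conventional_delta(out_dim, in_dim, 16, rng)
w_hada, n_hada = hadamard_delta(out_dim, in_dim, 8, rng)

rank_conv = np.linalg.matrix_rank(w_conv)
rank_hada = np.linalg.matrix_rank(w_hada)
print("conventional:", n_conv, "params, rank", rank_conv)
print("hadamard:    ", n_hada, "params, rank", rank_hada)
```

Because the Hadamard form stores two factor pairs, halving the dim keeps the parameter count the same, while the achievable rank of the update grows from 16 to as much as 64.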

There are more examples of LoHa vs. LoCon at the same file size (difference < 200 KB):

(images: four XYZ grids: xyz_grid-0330, xyz_grid-0324, xyz_grid-0300, xyz_grid-0296)