Split learning LLM example. #366

dixiyao · 2023-11-08T00:48:20Z

Description

Added three more API functions in plato/trainer/split_learning: forward_to_intermediate_feature, update_weights_before_cut, and test_model_split_leaffect. Revised the API for a cleaner design, for example to avoid possible memory leaks.
The revision does not affect the previous split learning examples.
Provided an example of fine-tuning Hugging Face models (GPT-2, OPT) with split learning. First, split_learning_llm_model.py contains the client model and the server model, which hold the layers that reside on the client and on the server, respectively. split_learning_trainer.py shows how to use the current split learning API to train an LLM with split learning. The Hugging Face trainer is used directly during the testing phase.
LoRA fine-tuning is supported through the LoRA model in split_learning_llm_model.py and the LoRA algorithm for split learning in split_learning_lora_algorithm.py.
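
For readers new to split learning, the exchange between the client-side and server-side parts of a model can be sketched with a toy example. This is not the Plato API and not the code in this PR: the class and function names below (ClientPart, ServerPart, split_step) are hypothetical, and each "model part" is reduced to one scalar weight so that the gradients can be checked by hand. The client computes the intermediate feature at the cut, the server computes the loss and returns the gradient with respect to that feature, and the client then updates the weights before the cut.

```python
"""Toy split learning step with scalar 'layers' (illustrative only)."""


class ClientPart:
    """Hypothetical client side: the layers before the cut (one scalar weight)."""

    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        # Intermediate feature at the cut layer, sent to the server.
        return self.weight * x

    def apply_gradient(self, x, grad_feature, lr):
        # Chain rule: dL/dw_client = dL/dfeature * dfeature/dw_client.
        self.weight -= lr * grad_feature * x


class ServerPart:
    """Hypothetical server side: the layers after the cut plus the loss."""

    def __init__(self, weight):
        self.weight = weight

    def train_step(self, feature, target, lr):
        prediction = self.weight * feature
        error = prediction - target  # dL/dprediction for L = 0.5 * error**2
        # Gradient w.r.t. the received feature, using the pre-update weight.
        grad_feature = error * self.weight
        # Update the server-side weight: dL/dw_server = error * feature.
        self.weight -= lr * error * feature
        return 0.5 * error**2, grad_feature


def split_step(client, server, x, target, lr=0.1):
    """One round: client forward, server update, client update."""
    feature = client.forward(x)
    loss, grad_feature = server.train_step(feature, target, lr)
    client.apply_gradient(x, grad_feature, lr)
    return loss


client, server = ClientPart(0.5), ServerPart(2.0)
losses = [split_step(client, server, x=1.0, target=3.0) for _ in range(50)]
```

Only the feature and its gradient cross the cut in this sketch. Neither side ever sees the other side's weights, which is the property that the client and server models in the example rely on.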

How has this been tested?

To test GPT-2:

python ./examples/split_learning/llm_split_learning/split_learning_main.py -c ./examples/split_learning/llm_split_learning/split_learning_wikitext2_gpt2.yml

To test OPT:

python ./examples/split_learning/llm_split_learning/split_learning_main.py -c ./examples/split_learning/llm_split_learning/split_learning_wikitext2_opt350m.yml

To test GPT-2 with LoRA:

python ./examples/split_learning/llm_split_learning/split_learning_main.py -c ./examples/split_learning/llm_split_learning/split_learning_wikitext2_gpt2_lora.yml

Types of changes

Bug fix (non-breaking change which fixes an issue) Fixes #
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

My code has been formatted using Black and checked using PyLint.
My change requires a change to the documentation.
I have updated the documentation accordingly.

netlify · 2023-11-08T00:48:25Z

✅ Deploy Preview for platodocs canceled.

Name	Link
🔨 Latest commit	`5f48c7f`
🔍 Latest deploy log	https://app.netlify.com/sites/platodocs/deploys/654d96286c6e90000958a005

dixiyao added 15 commits November 7, 2023 13:31

added LLM split learning using current APIs.

620b7b9

renamed names of functions and configuration variables.

dadd62f

cleaned up the trainer.

4988d0b

Added two more API functions in split learning trainer.

90158ae

fixed a bug related to config.

af2b55b

added self.training args. and fixed a bug in copy_weight.

4bb06d5

fixed a bug related to gradients.

4a9dc40

moved loading datasource to the init in the server split learning.

6f5704b

changed name of configuration file.

91461ba

updated the docs/examples.md about LLM split learning.

fe7c91c

added support for other LLMs in Huggingface.

9a7fd67

resolved issues raised by / in model name.

14c81fd

revised the name matching to require an exact match.

37ab27a

added support for Llama2 by passing use_auth_token.

c4a586d

added llama2 example.

9755c99

dixiyao added 4 commits November 7, 2023 21:26

added support for LoRA model.

da8e79b

fixed a bug in loading lora model.

72d9202

extract and load lora weights only.

fb7f49e

updated the examples.md about using LoRA to finetune.

9ef9253

dixiyao marked this pull request as draft November 8, 2023 16:02

dixiyao marked this pull request as ready for review November 8, 2023 17:09

dixiyao added 2 commits November 8, 2023 13:49

revised grammar and spelling in examples.md

ed988d2

fixed a bug in calculating the accuracy.

913ac63

baochunli requested review from HeyHao and silviafeiwang November 8, 2023 23:07

Simplified split learning main function.

b7821b3

HeyHao reviewed Nov 9, 2023

examples/split_learning/llm_split_learning/split_learning_trainer.py (outdated, resolved)

dixiyao added 2 commits November 8, 2023 21:00

removed useless save_metrics.

d3e79b9

used the checkpoint path as the temporary path for huggingface trainer.

b57ebc7

dixiyao and others added 3 commits November 8, 2023 21:14

use the checkpoint path as the temporary path for the Hugging Face trainer.

e5092df

Check spelling in comments and names

04d35d5

Add libraries needed into requirement.txt

5f48c7f

baochunli merged commit 3f55ba7 into main Nov 10, 2023
6 checks passed

dixiyao deleted the SplitLearningLLM branch November 10, 2023 23:16
