posts/2024/importing-yi9b-to-ollama/ #8
-
I'm a complete newcomer to LLMs and have never touched machine learning or deep learning before. Recently I started trying to learn about LLMs and how to fine-tune them. Along the way I learned that these large models are usually released as base models, and through some searching I found that they need SFT fine-tuning. I have finished that work, but I haven't uploaded the model to HuggingFace or any other public platform, because I found that when doing LoRA fine-tuning with LLaMA-Factory I had set the learning rate too high, so training never converged.

How are the results? Frankly, quite poor, but I can now talk to it the way you would with ChatGPT, although at times it still rambles on to itself like a base model. I noticed that your latest update can already hold a conversation. Did you do any further work after that? Looking forward to your guidance.

Here are my scripts:

# SFT fine-tuning, so the model can handle chat tasks
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--do_train True \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--finetuning_type lora \
--quantization_bit 4 \
--template yi \
--dataset_dir data \
--dataset belle_2m \
--cutoff_len 1024 \
--learning_rate 0.0002 \
--num_train_epochs 3.0 \
--max_samples 20000 \
--per_device_train_batch_size 6 \
--gradient_accumulation_steps 1 \
--lr_scheduler_type cosine \
--max_grad_norm 1.0 \
--logging_steps 5 \
--save_steps 100 \
--warmup_steps 50 \
--neftune_noise_alpha 5 \
--optim adamw_torch \
--packing True \
--report_to none \
--output_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora \
--fp16 True \
--lora_rank 8 \
--lora_alpha 16 \
--lora_dropout 0.1 \
--lora_target q_proj,v_proj \
--plot_loss True
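If I retrain, the first thing I plan to change is the learning rate, since that is what I suspect broke convergence; 5e-5 is a commonly suggested starting point for LoRA SFT, but the exact value is only my guess and I have not verified it on Yi-9B. Only this flag of the command above would change:
# Hypothetical adjustment; every other flag stays exactly as in the command above
--learning_rate 0.00005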
# Try the model from the command line, to check that it works properly
CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora
# Evaluate the model; this run failed: not enough VRAM on the A10
CUDA_VISIBLE_DEVICES=0 python src/evaluate.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora \
--task mmlu \
--split test \
--lang zh \
--n_shot 5 \
--batch_size 4
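If I retry the MMLU evaluation, the simplest workaround for the out-of-memory error is a smaller batch size; whether batch_size 1 actually fits on the A10 is only my assumption.
# Hypothetical retry of the evaluation with a smaller batch size,
# since batch_size 4 ran out of VRAM on the A10
CUDA_VISIBLE_DEVICES=0 python src/evaluate.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora \
--task mmlu \
--split test \
--lang zh \
--n_shot 5 \
--batch_size 1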
# Evaluate the model; this run succeeded, results below
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--finetuning_type lora \
--quantization_bit 4 \
--template yi \
--dataset_dir data \
--dataset alpaca_gpt4_zh \
--cutoff_len 1024 \
--max_samples 2000 \
--per_device_eval_batch_size 16 \
--predict_with_generate True \
--max_new_tokens 128 \
--top_p 0.7 \
--temperature 0.95 \
--output_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--do_predict True
***** predict metrics *****
predict_bleu-4 = 12.0712
predict_rouge-1 = 34.153
predict_rouge-2 = 12.641
predict_rouge-l = 23.7601
predict_runtime = 0:38:24.18
predict_samples_per_second = 0.868
predict_steps_per_second = 0.054
# Merge the model
# DO NOT use quantized model or quantization_bit when merging lora weights
CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--finetuning_type lora \
--export_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--export_size 4 \
--export_legacy_format False
# GPTQ 4-bit quantization of the merged model; this run failed: not enough VRAM
CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
--model_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--template yi \
--export_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora-int4/models \
--export_quantization_bit 4 \
--export_quantization_dataset data/c4_demo.json \
--export_size 1 \
--export_legacy_format False
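Since GPTQ quantization ran out of VRAM, an alternative I am considering (not something from the scripts above) is converting the merged model to GGUF and quantizing it on the CPU with llama.cpp, which needs no GPU memory and also produces the format Ollama imports. The llama.cpp checkout location and the output file names below are assumptions.
# Hypothetical CPU-side quantization with llama.cpp instead of GPTQ;
# assumes llama.cpp has been cloned and built next to LLaMA-Factory
python llama.cpp/convert-hf-to-gguf.py \
saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--outtype f16 \
--outfile yi-9b-200k-chat-f16.gguf
# q4_0 quantization runs on the CPU, so the A10's VRAM is not a limit here
./llama.cpp/quantize yi-9b-200k-chat-f16.gguf yi-9b-200k-chat-q4_0.gguf q4_0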
-
posts/2024/importing-yi9b-to-ollama/
The log of importing the Yi-9B LLM model into the Ollama library.
https://shinyzhu.com/posts/2024/importing-yi9b-to-ollama/
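For reference, the Ollama import described in the post comes down to a Modelfile that points at a GGUF file plus an ollama create call; the GGUF file name and the ChatML-style template below are assumptions based on Yi's chat format, not details taken from the post.
# Hypothetical Modelfile for the quantized GGUF (file name assumed)
FROM ./yi-9b-200k-chat-q4_0.gguf
TEMPLATE """<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop "<|im_end|>"

# Create the model in the local Ollama library and chat with it
ollama create yi-9b-200k-chat -f Modelfile
ollama run yi-9b-200k-chat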