xtuner 微调报错，求指点 #260

zhanbaohang · 2024-09-20T09:16:12Z

微调的时候出现这个问题，请问大佬们如何解决

(base) root@91bd5febc58b:/data/nlp_translate# CUDA_VISIBLE_DEVICES=2,3 NPROC_PER_NODE=2 xtuner train /data/nlp_translate/train_for_internlm/internlm2_5_chat_7b_qlora.py --deepspeed deepspeed_zero2
09/20 09:09:17 - mmengine - WARNING - Use random port: 29768
/opt/conda/lib/python3.8/site-packages/mmengine/optim/optimizer/zero_optimizer.py:11: DeprecationWarning: TorchScript support for functional optimizers is deprecated and will be removed in a future PyTorch release. Consider using the torch.compile optimizer instead.
from torch.distributed.optim import
[2024-09-20 09:09:22,245] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.4
[WARNING] using untested triton version (3.0.0), only 1.0.0 is known to be compatible
/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/zero/linear.py:49: FutureWarning: torch.cuda.amp.custom_fwd(args...) is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda') instead.
def forward(ctx, input, weight, bias=None):
/opt/conda/lib/python3.8/site-packages/deepspeed/runtime/zero/linear.py:67: FutureWarning: torch.cuda.amp.custom_bwd(args...) is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda') instead.
def backward(ctx, grad_output):
:219: RuntimeWarning: scipy._lib.messagestream.MessageStream size changed, may indicate binary incompatibility. Expected 56 from C header, got 64 from PyObject
09/20 09:09:26 - mmengine - WARNING - WARNING: command error: 'partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)'!
09/20 09:09:26 - mmengine - WARNING -
Arguments received: ['xtuner', 'train', '/data/nlp_translate/train_for_internlm/internlm2_5_chat_7b_qlora.py', '--deepspeed', 'deepspeed_zero2']. xtuner commands use the following syntax:

    xtuner MODE MODE_ARGS ARGS

    Where   MODE (required) is one of ('list-cfg', 'copy-cfg', 'log-dataset', 'check-custom-dataset', 'train', 'test', 'chat', 'convert', 'preprocess', 'mmbench', 'eval_refcoco')
            MODE_ARG (optional) is the argument for specific mode
            ARGS (optional) are the arguments for specific command

Some usages for xtuner commands: (See more by using -h for specific command!)

    1. List all predefined configs:
        xtuner list-cfg
    2. Copy a predefined config to a given path:
        xtuner copy-cfg $CONFIG $SAVE_FILE
    3-1. Fine-tune LLMs by a single GPU:
        xtuner train $CONFIG
    3-2. Fine-tune LLMs by multiple GPUs:
        NPROC_PER_NODE=$NGPUS NNODES=$NNODES NODE_RANK=$NODE_RANK PORT=$PORT ADDR=$ADDR xtuner dist_train $CONFIG $GPUS
    4-1. Convert the pth model to HuggingFace's model:
        xtuner convert pth_to_hf $CONFIG $PATH_TO_PTH_MODEL $SAVE_PATH_TO_HF_MODEL
    4-2. Merge the HuggingFace's adapter to the pretrained base model:
        xtuner convert merge $LLM $ADAPTER $SAVE_PATH
        xtuner convert merge $CLIP $ADAPTER $SAVE_PATH --is-clip
    4-3. Split HuggingFace's LLM to the smallest sharded one:
        xtuner convert split $LLM $SAVE_PATH
    5-1. Chat with LLMs with HuggingFace's model and adapter:
        xtuner chat $LLM --adapter $ADAPTER --prompt-template $PROMPT_TEMPLATE --system-template $SYSTEM_TEMPLATE
    5-2. Chat with VLMs with HuggingFace's model and LLaVA:
        xtuner chat $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --image $IMAGE --prompt-template $PROMPT_TEMPLATE --system-template $SYSTEM_TEMPLATE
    6-1. Preprocess arxiv dataset:
        xtuner preprocess arxiv $SRC_FILE $DST_FILE --start-date $START_DATE --categories $CATEGORIES
    6-2. Preprocess refcoco dataset:
        xtuner preprocess refcoco --ann-path $RefCOCO_ANN_PATH --image-path $COCO_IMAGE_PATH --save-path $SAVE_PATH
    7-1. Log processed dataset:
        xtuner log-dataset $CONFIG
    7-2. Verify the correctness of the config file for the custom dataset:
        xtuner check-custom-dataset $CONFIG
    8. MMBench evaluation:
        xtuner mmbench $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --prompt-template $PROMPT_TEMPLATE --data-path $MMBENCH_DATA_PATH
    9. Refcoco evaluation:
        xtuner eval_refcoco $LLM --llava $LLAVA --visual-encoder $VISUAL_ENCODER --prompt-template $PROMPT_TEMPLATE --data-path $REFCOCO_DATA_PATH
    10. List all dataset formats which are supported in XTuner

Run special commands:

    xtuner help
    xtuner version

GitHub: https://github.com/InternLM/xtuner

The text was updated successfully, but these errors were encountered:

KMnO4-zx · 2024-10-04T03:23:19Z

可以去xtuner的官方仓库提交issue哦，哪里有负责解答~

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

xtuner 微调报错，求指点 #260

xtuner 微调报错，求指点 #260

zhanbaohang commented Sep 20, 2024

KMnO4-zx commented Oct 4, 2024

xtuner 微调报错，求指点 #260

xtuner 微调报错，求指点 #260

Comments

zhanbaohang commented Sep 20, 2024

KMnO4-zx commented Oct 4, 2024