
Why is the training time so long? #7

Open
mid2doubao opened this issue Mar 5, 2024 · 1 comment

Comments

@mid2doubao commented Mar 5, 2024

I ran the command below on two NVIDIA TITAN RTX GPUs, and it takes 20+ hours to finish training.
python main.py --global_model 'chavinlo/alpaca-native' \
    --data_path "./data" \
    --output_dir './lora-shepherd-7b/' \
    --num_communication_rounds 10 \
    --num_clients 10 \
    --train_on_inputs \
    --group_by_length
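
One thing worth ruling out first is whether both GPUs are actually visible to the training process; a single-device run would roughly double the wall-clock time. A minimal sanity check, assuming the PyTorch stack this repo builds on:

import torch

# Confirm how many GPUs PyTorch actually sees. If this prints 1 on a
# two-GPU machine, training is likely running on a single TITAN RTX.
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"Visible GPUs:   {torch.cuda.device_count()}")
for i in range(torch.cuda.device_count()):
    print(f"  [{i}] {torch.cuda.get_device_name(i)}")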

@lxr-1204 commented Mar 9, 2024

I trained with the same code as you, on a single RTX 3090 24G. It took approximately 14 hours, and GPU memory usage was around 14G, not the 23G mentioned in the paper. May I ask what the GPU memory usage was on your system?
Can the settings the author provides in the paper be translated directly into runnable code? Were you able to reproduce the author's results?
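
For comparing memory numbers, here is a minimal sketch of how peak usage could be logged at the end of training, assuming the PyTorch-based training loop (where exactly to place this in main.py is a hypothetical choice):

import torch

# Report peak GPU memory after training. max_memory_allocated() counts
# tensors only; nvidia-smi also includes the CUDA context and allocator
# overhead, so the two numbers will differ.
if torch.cuda.is_available():
    alloc_gb = torch.cuda.max_memory_allocated() / 1024**3
    reserved_gb = torch.cuda.max_memory_reserved() / 1024**3
    print(f"Peak allocated: {alloc_gb:.1f} GiB")
    print(f"Peak reserved:  {reserved_gb:.1f} GiB")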
