I run the command below with two NVIDIA TITAN RTXs; it takes 20+ hours to train the model.
python main.py --global_model 'chavinlo/alpaca-native' \
    --data_path "./data" \
    --output_dir './lora-shepherd-7b/' \
    --num_communication_rounds 10 \
    --num_clients 10 \
    --train_on_inputs \
    --group_by_length
I trained with the same command as you, on a single RTX 3090 (24 GB). It took approximately 14 hours, and GPU memory usage was around 14 GB, not the 23 GB mentioned in the paper. May I ask what the GPU memory usage was on your system?
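To make these memory numbers comparable, here is a minimal sketch of one way to record peak GPU memory around a training run, assuming the repo uses PyTorch (the placement of the calls is illustrative, not the repo's actual code). Note that nvidia-smi typically reports a higher figure than this, since it also counts the CUDA context and PyTorch's caching-allocator reserve:

import torch

# Reset the peak-memory counter before training starts on GPU 0.
torch.cuda.reset_peak_memory_stats(device=0)

# ... run the training loop here ...

# Peak memory actually allocated by tensors on GPU 0, in GiB.
peak_gib = torch.cuda.max_memory_allocated(device=0) / 1024**3
print(f"Peak allocated GPU memory: {peak_gib:.1f} GiB")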
Can the settings the author provides in the paper be translated directly into runnable code? Were you able to reproduce the author's results?