Error when using DPOTrainer with Multiple GPUs and 8-bit Precision Models #659
Comments
Having the active and ref model on different GPUs is not supported, as far as I know. This would lead to all sorts of additional issues, as we would need to move tensors around. However, with #640 you should be able to just load one model and activate/deactivate the adapters to switch between the active and reference model. Hope this helps!
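(As a minimal illustration of that adapter-toggling idea, not the trainer's actual implementation: assuming `model` is a PEFT-wrapped causal LM and `input_ids`/`attention_mask` come from a tokenized batch, the reference log-probs can be computed from the very same weights with the adapter temporarily switched off.)

```python
import torch

def sequence_logps(model, input_ids, attention_mask):
    """Sum of per-token log-probabilities for each sequence (illustrative helper)."""
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    logps = torch.log_softmax(logits[:, :-1], dim=-1)
    token_logps = logps.gather(-1, input_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return (token_logps * attention_mask[:, 1:]).sum(-1)

# Policy log-probs: LoRA adapter active
policy_logps = sequence_logps(model, input_ids, attention_mask)

# Reference log-probs: same model, adapter temporarily disabled via peft's context manager
with torch.no_grad(), model.disable_adapter():
    reference_logps = sequence_logps(model, input_ids, attention_mask)
```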
Thank you for your prompt response! After installing it, I have now adjusted the model to load in 4-bit, and I've successfully managed to create the trainer:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig
from trl import DPOTrainer

# Load the base model in 4-bit NF4 with bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    local_model_path,
    quantization_config=bnb_config,
    trust_remote_code=True,
)

# LoRA adapter configuration
peft_config = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    target_modules=[
        "q_proj",
        "v_proj",
        "k_proj",
        "out_proj",
        "fc_in",
        "fc_out",
        "wte",
    ],
    bias="none",
    task_type="CAUSAL_LM",
)

# Single model passed in; the reference model is obtained by disabling the adapter
dpo_trainer = DPOTrainer(
    model,
    args=training_args,
    beta=beta,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=max_prompt_length,
    max_length=max_length,
)
```

However, upon executing the training step, I ran into an error, even after setting the environment variable.
Could be related to #480 @younesbelkada?
@munhouiani do you have gradient checkpointing activated? If yes, you can try disabling it to bypass the above error (at the expense of needing ~2x more VRAM).
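(A minimal sketch of that suggestion, assuming gradient checkpointing was switched on through `TrainingArguments`; the other field values below are placeholders, not the reporter's actual settings. If it was enabled on the model directly, `model.gradient_checkpointing_disable()` achieves the same thing.)

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dpo_output",       # placeholder path
    per_device_train_batch_size=1,
    gradient_checkpointing=False,  # disable to bypass the error, at the cost of ~2x more VRAM
    bf16=True,
)
```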
Description:
Hi there,
I've encountered an issue while attempting to utilize the DPO training process on an AWS g5 instance equipped with 4 A10 GPUs. My training setup closely follows the procedure outlined in the dpo_llama2.py script. However, I deviated from the script by employing the Llama-2-7B-Chat model rather than the SFT model with PEFT.
The models are loaded using the following code snippet:
I explicitly loaded the models onto two separate GPUs, as they are too large to fit within a single A10 GPU.
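A rough sketch of the kind of loading described (8-bit quantization, each model pinned to its own GPU via `device_map`); the model identifier and variable names here are assumptions rather than the original snippet:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(load_in_8bit=True)

# One model per GPU, since both copies were reported not to fit on a single A10
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",   # assumed checkpoint for Llama-2-7B-Chat
    quantization_config=bnb_config,
    device_map={"": 0},
)
model_ref = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map={"": 1},
)
```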
However, upon attempting to create a DPOTrainer instance, I encountered the following error message:

Query:
What would be the correct approach to implementing DPO with multiple GPUs?
Additionally, I've included a list of the installed packages.
Thank you for your assistance.