Is it possible to tweak the adamw_torch optimizer to change Beta1 and Beta2 values? #743

Closed
jackswl opened this issue Aug 30, 2024 · 7 comments

jackswl commented Aug 30, 2024

As per the title:

  optimizer: adamw_torch

How can I tweak the beta1 and beta2 values in my .yml file?

@abhishekkrthakur, I am wondering whether this is possible. Thanks!

abhishekkrthakur (Member) commented

which task?

jackswl commented Aug 30, 2024

Sorry, I forgot to specify the details. It is for LLM fine-tuning:

task: llm-sft
base_model: xxx
project_name: xxx
log: none
backend: local

data:
  path: xxx
  train_split: train
  valid_split: null
  chat_template: null
  column_mapping:
    text_column: text

params:
  block_size: 1024
  model_max_length: 1024
  epochs: 3
  batch_size: 4
  lr: 1e-4
  peft: true
  quantization: int4
  target_modules: "q_proj,v_proj,o_proj,k_proj"
  padding: right
  optimizer: adamw_torch
  scheduler: cosine
  gradient_accumulation: 4
  mixed_precision: bf16
  warmup_ratio: 0.1
  weight_decay: 0.1
  lora_r: 16
  lora_alpha: 16
  lora_dropout: 0
  merge_adapter: false
  use_flash_attention_2: true
  logging_steps: 1
  unsloth: false

hub:
  username: xx
  token: hf_XXX
  push_to_hub: false

@abhishekkrthakur could you kindly let me know whether there is a way to set beta1 and beta2 for adamw_torch to 0.9 and 0.95 respectively? I believe the defaults are 0.9 and 0.999.
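
For context, beta1 and beta2 are the decay rates of AdamW's two running averages (of the gradient and of the squared gradient). The sketch below is a plain-Python version of a single AdamW step for one scalar parameter, written only to show where the two values enter the update. `adamw_step` and its argument names are hypothetical and are not part of AutoTrain. In PyTorch the same pair is passed as the `betas` tuple of `torch.optim.AdamW`; whether AutoTrain forwards such a setting from the .yml file is a separate question that this sketch does not answer.

```python
import math


def adamw_step(param, grad, m, v, step, lr=1e-4, beta1=0.9, beta2=0.95,
               eps=1e-8, weight_decay=0.1):
    """One AdamW update for a single scalar parameter (illustration only).

    `step` is 1-based. Returns the new (param, m, v).
    """
    # Decoupled weight decay: shrink the parameter directly,
    # independently of the gradient statistics.
    param = param * (1.0 - lr * weight_decay)

    # beta1 controls the running average of the gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    # beta2 controls the running average of the squared gradient.
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1.0 - beta1 ** step)
    v_hat = v / (1.0 - beta2 ** step)

    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Run a few steps with the values asked about in this issue.
p, m, v = 1.0, 0.0, 0.0
for t, g in enumerate([0.5, 0.4, -0.2], start=1):
    p, m, v = adamw_step(p, g, m, v, t, beta1=0.9, beta2=0.95)
```

At the first step the bias-corrected update is roughly `lr * sign(grad)` whatever the betas are; the choice of beta2 only starts to matter once several gradients have been averaged.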

jackswl commented Aug 31, 2024

@abhishekkrthakur any clue on this? Thanks!

abhishekkrthakur (Member) commented

I can add the params. For now they are hidden. Is it required?

jackswl commented Sep 2, 2024

That would be good. I think the use case might be small, because I am using autotrain specifically for research. I would therefore prefer to follow the standard convention of 0.95 for beta2 instead of the default 0.999, in line with research community best practices. Would this be possible?

facebookresearch/mae#184
https://docs.mosaicml.com/projects/composer/en/latest/api_reference/generated/composer.optim.DecoupledAdamW.html
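
To make the difference between the two beta2 values concrete: an exponential moving average with decay beta weights roughly the last 1 / (1 - beta) samples, so beta2 = 0.999 averages squared gradients over about 1000 steps and beta2 = 0.95 over about 20. The sketch below, in plain Python with hypothetical helper names, counts how many steps the second-moment average needs to cover 90% of the gap after the squared gradient jumps from one level to another. It illustrates the averaging time scale only and makes no claim about which value trains better.

```python
def ema_horizon(beta):
    """Approximate number of recent steps an EMA with decay `beta` averages over."""
    return 1.0 / (1.0 - beta)


def steps_to_adapt(beta2, old_sq=1.0, new_sq=100.0, fraction=0.9):
    """Steps until an EMA starting at `old_sq` covers `fraction` of the gap
    to a new, constant squared-gradient level `new_sq`."""
    v = old_sq
    target = old_sq + fraction * (new_sq - old_sq)
    steps = 0
    while v < target:
        # Same recurrence AdamW uses for its second-moment estimate.
        v = beta2 * v + (1.0 - beta2) * new_sq
        steps += 1
    return steps


for beta2 in (0.95, 0.999):
    print(beta2, round(ema_horizon(beta2)), steps_to_adapt(beta2))
```

The smaller beta2 reacts to a change in gradient scale within tens of steps, while 0.999 takes thousands.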

github-actions bot commented Oct 2, 2024

This issue is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Oct 2, 2024

This issue was closed because it has been inactive for 20 days since being marked as stale.
