Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Remove deprecated push_to_hub_token to resolve warning #419

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

Luka-D
Copy link
Contributor

@Luka-D Luka-D commented Dec 13, 2024

Description of the change

Running tuning locally produces this warning message:

FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.

When I print out the transformer_kwargs dictionary, I can see that both hub_token and push_to_hub_token are in the dictionary. Here is a snippet of it.

.....'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': None, 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>',......

I made a temporary fix by simply popping push_to_hub_token out of the dictionary and now the warning is gone.

# Remove deprecated push_to_hub_token
transformer_kwargs.pop("push_to_hub_token", None)

I am wondering if anyone knows where push_to_hub_token is originating from or best way to remove it.

Related issue number

Issue #1205

How to verify the PR

Run tuning locally:

python3 tuning/sft_trainer.py \
--model_name_or_path Maykeye/TinyLLama-v0 \
--training_data_path tests/artifacts/testdata/jsonl/twitter_complaints_small.jsonl \
--output_dir outputs/lora-tuning \
--num_train_epochs 5 \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--learning_rate 1e-5 \
--response_template "\n### Label:" \
--dataset_text_field "output" \
--use_flash_attn false \
--torch_dtype "float32" \
--peft_method "lora" \
--r 8 \
--lora_dropout 0.05 \
--lora_alpha 16  \
--log_level "error"

Copy link

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

@github-actions github-actions bot added the fix label Dec 13, 2024
@Luka-D
Copy link
Contributor Author

Luka-D commented Jan 7, 2025

Summary of changes made with 9e567e2

It seems that using .to_dict() method on train_args was returning the hub_token and push_to_hub_token fields, which had a default value like <PUSH_TO_HUB_TOKEN>. Changing it to train_args.__dict__.items() has removed some of those default values.

Because of the change to train_args.__dict__.items(), I needed to change SFTConfig fields function to only return parameters that are accepted by SFTConfig. Previously it was also returning fields like _n_gpu, which is inherited from TrainingArguments class but is not accepted in SFTConfig and will raise an error. Using SFTConfig.__dict__["__match_args__"] will return the same parameters as those listed in the variables window on HF: https://huggingface.co/docs/trl/en/sft_trainer#trl.SFTConfig.

In summary, the changes to transformer_kwargs are:

  • hub_token and push_to_hub_token are now both None, instead of <hub_token> and <push_to_hub_token>
  • Some of the parameters have changed from strings into Object types. eg. 'lr_scheduler_type': 'linear' has become 'lr_scheduler_type': <SchedulerType.LINEAR: 'linear'>. 'save_strategy': 'epoch' has become 'save_strategy': <IntervalStrategy.EPOCH: 'epoch'> etc.
  • I do not know if this change to Object types is the desired result, would like some feedback on this.

All code has passed tox -e py tests.

Please let me know your thoughts, thank you.

@Luka-D Luka-D marked this pull request as ready for review January 7, 2025 19:12
@Luka-D Luka-D marked this pull request as draft January 7, 2025 19:13
@@ -346,10 +345,10 @@ def train(
# from our object directly. In the future, we should consider renaming this class and / or
# not adding things that are not directly used by the trainer instance to it.

transformer_train_arg_fields = [x.name for x in dataclasses.fields(SFTConfig)]
transformer_train_arg_fields = SFTConfig.__dict__["__match_args__"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of __match_args__ requires Python>=3.10 isn't it? Correct me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe it is Python 3.10 and greater. Although I'm a little confused on whether __match_args__ are auto generated by Python when creating a class or manually set by those writing the class. As it is now, it is a key in the dictionary, so I wonder if the key would still exist in Python 3.9 and lower.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should avoid usage of __match_args__ in that case, WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess so, which version of Python is the lowest supported by fms-hf-tuning repo?

Also, if we remove __match_args__ do you know another way to get all the input variables for SFTConfig as listed here? Using __dict__.keys() only returns the 12 parameters, but I need all of the variables that SFTConfig inherits from TrainingArguments class. For example, output_dir, do_train, do_eval etc.

Signed-off-by: Luka Dojcinovic <[email protected]>
@Luka-D
Copy link
Contributor Author

Luka-D commented Jan 22, 2025

I made small change with c7d540e. I changed __dict__ to vars(). The functionality is the same for both, vars() accesses the __dict__ method, however using vars() is more readable and more pythonic in my opinion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants