
[Questions about training] #2

Open
tomguluson92 opened this issue Nov 11, 2024 · 5 comments
Comments

@tomguluson92
Contributor

Dear authors,

Thanks for your brilliant work! A few questions came up during my tests:

  1. For --fixed_time_steps 1 2 5 10, which value should I use: 1, 2, 5, or 10? And what does this parameter mean?
  2. What are the irt and tgt datasets? How do I generate them?
@Hongcheng-Gao
Member

Hongcheng-Gao commented Nov 12, 2024

Thank you for your interest in our work.

  1. In our current paper, we did not fix the timestep sampling during meta-unlearning (fix_timesteps=False). However, we found that since the optimization of meta-unlearning randomly samples a timestep each time, this led to very long optimization times and considerable randomness in the performance. Therefore, to make optimization more stable and achieve faster convergence, we implemented a strategy of using fixed timesteps during meta training. The fixed_time_steps parameter is used to set these fixed timesteps in the meta component. We are currently exploring whether smaller or larger timesteps work better, and if using more timesteps would be beneficial. We will update the specific timestep selection strategy soon.

  2. The irt dataset contains concepts unrelated to the concept being unlearned, while the target (tgt) dataset contains concepts related to it. For example, if you want to unlearn nudity-related content, the tgt dataset would contain image-text pairs like women/skin, while the irt dataset would contain content completely unrelated to nudity, like dogs/cats. Both can be generated with gen_images.py.
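The two timestep-sampling strategies described in point 1 can be sketched as follows. This is an illustrative helper, not the repository's actual API: the function name, `FIXED_TIME_STEPS` default, and `NUM_TRAIN_TIMESTEPS` value are assumptions.

```python
import random

# Assumed defaults: the values passed via --fixed_time_steps, and a
# typical DDPM schedule length of 1000 (illustrative, not confirmed).
FIXED_TIME_STEPS = [1, 2, 5, 10]
NUM_TRAIN_TIMESTEPS = 1000


def sample_timestep(fixed_time_steps=None, rng=random):
    """Pick the diffusion timestep for one meta-unlearning step.

    With fixed_time_steps=None, the timestep is drawn uniformly at
    random (the paper's fix_timesteps=False setting); otherwise one of
    the fixed values is sampled, which the authors report makes the
    optimization more stable and faster to converge.
    """
    if fixed_time_steps is None:
        return rng.randrange(NUM_TRAIN_TIMESTEPS)
    return rng.choice(fixed_time_steps)
```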

We will be updating the README and code soon to make everything clearer. Thank you for your attention.
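As a rough sketch of the tgt/irt split described above, the datasets could be organized as two labeled prompt lists before image generation. The prompts, labels, and helper below are assumptions for illustration, not the repository's actual data format or the interface of gen_images.py.

```python
# Hypothetical prompt lists for unlearning the concept "nudity":
# tgt = related concepts, irt = irrelevant (unrelated) concepts.
tgt_prompts = [
    "a photo of a woman",
    "a close-up of human skin",
]
irt_prompts = [
    "a photo of a dog",
    "a photo of a cat",
]


def build_pairs(prompts, group):
    """Attach a group label so downstream code can tell tgt from irt."""
    return [{"prompt": p, "group": group} for p in prompts]


# Each entry would later be paired with a generated image to form the
# image-text pairs the answer describes.
dataset = build_pairs(tgt_prompts, "tgt") + build_pairs(irt_prompts, "irt")
```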

@tomguluson92
Contributor Author

Thanks for the feedback. Does this look like it is running correctly? It takes nearly 28 A100/H100 GPU hours to finish training.

(screenshot of the training run)

@Hongcheng-Gao
Member

Hi, congratulations: the program in your screenshot is running normally. However, we noticed from the screenshot that you have fixed_time_step enabled. When using fixed_time_step, there is no need to train for 1500 steps; the 1500-step limit was originally set as the maximum training steps for runs where fixed_time_step is not used. 100-200 steps should be enough for fixed_time_step=True.

Also, since we are still in the ablation-study phase for fixed_time_step=True, the current default fixed timesteps (1, 2, 5, 10) may not give good performance. We are currently exploring which specific timestep values work best.

@tomguluson92
Contributor Author

So should I set fixed_time_step=False and train for 200 steps?

@Hongcheng-Gao
Member

100-200 steps should be enough for fixed_time_step=True. If you use fixed_time_step=False, you should train for 1000-1500 steps. The better choice is fixed_time_step=False with 1000-1500 steps.
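The step budgets suggested in this thread can be summarized in a small helper. This function is purely illustrative (it is not part of the repository), and the exact numbers are the thread's rough guidance, not tuned values.

```python
def max_train_steps(fixed_time_step: bool) -> int:
    """Step budget suggested in this thread: roughly 100-200 steps
    suffice with fixed timesteps, while random timestep sampling
    (fixed_time_step=False) needs about 1000-1500 steps.

    Returns the upper end of each suggested range.
    """
    return 200 if fixed_time_step else 1500
```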
