a question about meta-training strategy #45
Comments
Hi, What do you mean by "training strategy"? Do you mean the "pre-training" phase that we introduce? Best,
I mean the meta-training phase. In MAML's outer loop, the loss used to update the model's parameters is the sum of the losses over all tasks (100 training tasks), so in each outer-loop epoch the model's parameters are updated only once. However, in your PyTorch version, the outer loop updates the model's parameters with each task's loss separately, so in each outer-loop epoch the parameters are updated 100 times (once per training task). This picture may explain it more clearly.
I think you misunderstand MAML. MAML doesn't use the sum of all tasks' losses to update the model in the outer loop. Our MTL uses a meta-training strategy similar to MAML's. Your figure doesn't show the strategy actually applied in MAML. MAML uses the "meta-batch" strategy, i.e., the average loss over 4 tasks drives one outer-loop update. In our method, we just set the meta-batch size to 1.
Oh, I see. Thank you very much. Could you tell me why you set the meta-batch size to 1? What is the meaning of "meta-batch"?
If the meta-batch size is 4, then in one outer-loop iteration the model is updated by the average loss of 4 different tasks.
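The meta-batch idea can be sketched in a few lines. This is only a toy illustration, not the repository's code: the names (`outer_loop_step`, `task_grad`, the scalar parameter `theta`) and the quadratic per-task loss are all hypothetical.

```python
def task_grad(theta, task):
    # Gradient of a toy per-task loss (theta - task)**2.
    return 2.0 * (theta - task)

def outer_loop_step(theta, tasks, meta_batch_size, lr=0.1):
    """One outer-loop iteration: average the gradients of
    `meta_batch_size` tasks, then apply a single update.
    With meta_batch_size=1 this reduces to one update per task,
    as in the setting discussed above."""
    batch = tasks[:meta_batch_size]
    avg_grad = sum(task_grad(theta, t) for t in batch) / len(batch)
    return theta - lr * avg_grad

# meta_batch_size=4: one update from the average loss of 4 tasks.
theta = outer_loop_step(0.0, [1.0, 2.0, 3.0, 4.0], meta_batch_size=4)
```

Setting `meta_batch_size=1` simply makes each outer-loop update depend on a single task's loss, which is the difference the thread is discussing.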
Well... thank you~
No problem. |
I think your figure is correct.
Oh, you are right. I misunderstood this figure.
Hello @Sword-keeper, I agree with you. I also thank the authors for their useful reply.
Hi @LavieLuo, Thanks for your interest in our work. If you have any further questions, please send me an email or add comments on this issue. Best, |
@yaoyao-liu Woo, thank you for this prompt reply. Now I completely understand the motivation of this strategy. That's cool! :) |
@LavieLuo In my experience, if the base-learner overfits the training samples of the target task, the performance won't drop. So I just update the FC layer as many times as I can to make it overfit.
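The point above, updating only the small classifier head while everything else stays fixed, can be sketched as follows. This is a hypothetical toy (a 1-D "FC layer" fit by gradient descent), not the repository's code; `fit_fc_only` and its names are illustrative only.

```python
def fc_forward(fc_weight, feature):
    # Toy 1-D "FC layer": the frozen backbone's output is `feature`.
    return fc_weight * feature

def fit_fc_only(fc_weight, data, steps=100, lr=0.01):
    """Repeatedly update only fc_weight on (feature, label) pairs.
    The backbone (here: the raw features) is never modified, so
    heavily fitting the few FC parameters is less harmful than
    fitting every parameter of the network would be."""
    for _ in range(steps):
        for feature, label in data:
            # Gradient of the squared error w.r.t. fc_weight only.
            grad = 2.0 * (fc_forward(fc_weight, feature) - label) * feature
            fc_weight -= lr * grad
    return fc_weight

# Many update steps on a tiny "task": the FC weight fits the data closely.
w = fit_fc_only(0.0, [(1.0, 2.0), (2.0, 4.0)], steps=200)
```

The design choice mirrors the comment: since only the classifier's few parameters move, driving its training loss very low does not disturb the meta-learned feature extractor.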
@yaoyao-liu Yes, I agree! I remember that some recent works show the overfitting of DNNs manifests probabilistically (as over-confidence), which somehow doesn't degrade the accuracy. Also, I forgot that MTL only trains a part of the parameters, and now I figure it out. Thanks again!
@LavieLuo |
Hi, when I read your code, I noticed that your meta-training strategy has some differences from MAML. Could you tell me which meta-learning paper designed this strategy? Or is it your own design? Besides, what is the reason you chose this strategy?