
a question about meta-training strategy #45

Open
Sword-keeper opened this issue Jan 14, 2021 · 15 comments

@Sword-keeper

Hi, when I read your code, I noticed that your meta-training strategy has some differences from MAML. Could you tell me which meta-learning paper designed this strategy, or is it your own design? Also, what is the reason you chose this strategy?

@yaoyao-liu
Owner

Hi,

What do you mean by “training strategy”? Do you mean that we introduce a “pre-training” phase?

Best,
Yaoyao

@Sword-keeper
Author

I mean the meta-training phase. In MAML's outer loop, the loss used to update the model's parameters is the sum of all tasks' losses (100 training tasks), so in each outer-loop epoch the model's parameters are updated only once. However, in your PyTorch version, the model's parameters are updated with each task's loss separately, so in each outer-loop epoch they are updated 100 times (the number of training tasks). The picture below may explain this more clearly.

[attached image: diagram of the outer-loop update]

@yaoyao-liu
Owner

I think you have misunderstood MAML.

MAML doesn't use the sum of all tasks' losses to update the model in the outer loop. Our MTL uses a meta-training strategy similar to MAML's. Your figure doesn't show the strategy that MAML actually applies.

In MAML, they use the "meta-batch" strategy, i.e., the average loss over 4 tasks is used for one outer-loop update. In our method, we just set the meta-batch size to 1.
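
For readers following along, here is a minimal first-order sketch of the meta-batch idea. It is not the code of this repository; the task sampler, toy model, and hyperparameters below are made up. With `meta_batch_size = 4` the query losses of four tasks are averaged into one outer-loop update; setting it to 1 gives one outer-loop update per sampled task, as in MTL.

```python
# Minimal first-order sketch of the "meta-batch" idea; NOT the repository's code.
# `sample_task`, the toy linear model, and all hyperparameters are hypothetical.
import copy
import torch
import torch.nn as nn

def sample_task():
    """Hypothetical task sampler: returns (support_x, support_y, query_x, query_y)."""
    return (torch.randn(5, 8), torch.randint(0, 2, (5,)),
            torch.randn(15, 8), torch.randint(0, 2, (15,)))

model = nn.Linear(8, 2)                        # stands in for the meta-learned model
meta_opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
meta_batch_size = 4                            # MAML-style; MTL sets this to 1
inner_lr, inner_steps = 1e-2, 5

for outer_iter in range(10):                   # outer loop
    meta_opt.zero_grad()
    for _ in range(meta_batch_size):           # tasks in one meta-batch
        x_s, y_s, x_q, y_q = sample_task()
        # Inner loop: adapt a copy of the model on the support set.
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            loss_fn(learner(x_s), y_s).backward()
            inner_opt.step()
        # Outer loss on the query set, averaged over the meta-batch.
        inner_opt.zero_grad()
        (loss_fn(learner(x_q), y_q) / meta_batch_size).backward()
        # First-order approximation: accumulate the adapted model's gradients
        # into the meta-model's gradients.
        for p, lp in zip(model.parameters(), learner.parameters()):
            p.grad = lp.grad.clone() if p.grad is None else p.grad + lp.grad
    meta_opt.step()                            # one outer-loop update per meta-batch
```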

@Sword-keeper
Author

Oh, I see. Thank you very much. Could you also tell me why you set the meta-batch size to 1? What is the meaning of the meta-batch?

@yaoyao-liu
Owner

If the meta-batch size is 4, then in one outer-loop iteration the model is updated with the average loss of 4 different tasks.
I set the meta-batch size to 1 because it is easier to implement...

@Sword-keeper
Author

well... thank you~

@yaoyao-liu
Owner

No problem.

@yaoyao-liu
Owner

I think your figure is correct, but n is not 100. In the settings used by MAML, it is 4.

Besides, n is not the total number of tasks. In MAML, we can sample, e.g., 10,000 tasks in total, and the four tasks in one meta-batch are drawn from those 10,000 tasks.
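
As a toy illustration of this sampling (hypothetical numbers and names, not the repository's code): the task pool can be very large, while each meta-batch only contains a few tasks drawn from it.

```python
# Toy illustration of meta-batch sampling; numbers and names are hypothetical.
import random

num_total_tasks = 10000            # e.g., the pool of tasks that can be sampled
meta_batch_size = 4                # tasks consumed by one outer-loop update
task_pool = list(range(num_total_tasks))

for outer_iter in range(3):        # a few outer-loop iterations
    meta_batch = random.sample(task_pool, meta_batch_size)
    print(f"iteration {outer_iter}: meta-batch tasks {meta_batch}")
```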

@Sword-keeper
Author

Oh, you are right. I misunderstood this figure.

@LavieLuo

@Sword-keeper Hello, I agree with you, and I also thank the authors for their helpful reply.
I guess the main difference between MTL and MAML w.r.t. the “training strategy” is the setting of meta_batch_size, which is 4 in MAML and 1 in MTL. Besides, I guess "update 100 times" refers to the parameter update_batch_size ($k$ in your figure) in the MAML code, which is set to 5 while in MTL it is 100? I'm actually still puzzled about this (e.g., line 101 in meta-transfer-learning/pytorch/trainer/pre.py):
for _ in range(1, self.update_step):

@yaoyao-liu
Owner

Hi @LavieLuo,

Thanks for your interest in our work.
In MAML, they update all of the network parameters 5 times during base-learning.
In our MTL, we update only the FC layer, and we update it 100 times during base-learning.
As we update far fewer parameters than MAML does, we can afford to update them more times.
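
A minimal sketch of this idea (toy backbone and data, not the actual code in pre.py): the backbone is frozen and only the FC head receives gradients, so running 100 inner updates per task is cheap.

```python
# Sketch only: a stand-in backbone and toy data, not the repository's model.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())   # stands in for the pre-trained feature extractor
fc = nn.Linear(16, 5)                                    # task-specific classifier head

support_x = torch.randn(25, 8)                           # toy 5-way 5-shot support set
support_y = torch.randint(0, 5, (25,))
loss_fn = nn.CrossEntropyLoss()

for p in backbone.parameters():                          # freeze the backbone
    p.requires_grad_(False)

with torch.no_grad():                                    # backbone is fixed, so features can be precomputed
    features = backbone(support_x)

inner_opt = torch.optim.SGD(fc.parameters(), lr=1e-2)
update_step = 100                                        # many cheap FC-only updates (MAML uses ~5 full updates)
for _ in range(update_step):
    inner_opt.zero_grad()
    loss_fn(fc(features), support_y).backward()
    inner_opt.step()
```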

If you have any further questions, please send me an email or add comments on this issue.

Best,
Yaoyao

@LavieLuo

@yaoyao-liu Woo, thank you for this prompt reply. Now I completely understand the motivation of this strategy. That's cool! :)

@yaoyao-liu
Owner

@LavieLuo In my experience, if the base-learner overfits the training samples of the target task, the performance won't drop. So I just update the FC layer as many times as I can, even letting it overfit.

@LavieLuo

@yaoyao-liu Yes, I agree! I remember some recent works showing that the overfitting of DNNs manifests as over-confidence in the predicted probabilities, which somehow doesn't degrade the accuracy. Also, I had forgotten that MTL only trains a part of the parameters; now I've figured it out. Thanks again!

@yaoyao-liu
Owner

@LavieLuo
No problem.
