-
@dalistarh I've trained a ResMLP from scratch and got close to the paper's results. My gMLP attempt failed, probably due to init issues; I've since improved the init but have yet to rerun. The hparams were based on the gMLP paper, which are basically the DeiT hparams minus 'repeated aug': https://github.com/facebookresearch/deit
-
@dalistarh https://gist.github.com/rwightman/d6c264a9001f9167e06c209f630b2cc6
-
I have a question about using ResMLP for machine translation. In CV, the dimension of the cross-patch FC layer depends on the fixed size of the input images, but in MT the sequence length is not fixed. How can this be solved, and what should be done during training and inference to handle it?
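To make the constraint concrete, here is a minimal pure-Python sketch of a cross-patch FC layer. The function names (`cross_patch_fc`, `pad_to_length`) and the identity weight matrix are hypothetical and only for illustration; they are not the timm or paper implementation. Padding to a fixed maximum length is shown as one possible workaround, an assumption on my part and not something confirmed in this thread.

```python
# Toy illustration only: names and weights are hypothetical, not timm's or the paper's API.

def cross_patch_fc(x, weight):
    """Mix information across positions: out[i] = sum_j weight[i][j] * x[j].

    x      : list of token vectors (one list of floats per position)
    weight : square matrix whose size is fixed when the layer is built
    """
    n = len(weight)
    if len(x) != n:
        # The weight matrix only knows how to mix exactly n positions.
        raise ValueError(f"expected {n} positions, got {len(x)}")
    dim = len(x[0])
    return [
        [sum(weight[i][j] * x[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


def pad_to_length(x, length, pad_value=0.0):
    """Assumed workaround: right-pad a shorter sequence with constant vectors."""
    if len(x) > length:
        raise ValueError("sequence longer than the layer's fixed length")
    dim = len(x[0])
    return x + [[pad_value] * dim for _ in range(length - len(x))]


MAX_LEN = 4  # chosen once, when the layer is created
# Identity weights, purely so the result is easy to inspect.
weight = [[1.0 if i == j else 0.0 for j in range(MAX_LEN)] for i in range(MAX_LEN)]

short = [[1.0, 2.0], [3.0, 4.0]]  # a sequence of length 2

# Feeding the short sequence directly fails because of the fixed weight shape.
try:
    cross_patch_fc(short, weight)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

# After padding to MAX_LEN, the same layer accepts the input.
padded_out = cross_patch_fc(pad_to_length(short, MAX_LEN), weight)
```

With learned (non-identity) weights, the padded positions would also contribute to every output position, so a real setup would likely need some form of masking as well; that part is not covered by this sketch.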
-
Hi,
Thanks for your useful work!
I would like to fine-tune the MLP-based models (gMLP and ResMLP), and I was wondering if you could please provide some additional details on training parameters, in particular:
I plan to do training on GPUs.
Many thanks,
Dan