The influence of VAE feature dim #53

Tom-zgt · 2024-09-27T05:48:59Z

I'm currently following your excellent work, MAR, and would like to know how the VAE feature dimension affects model performance. I saw that you experimented with 16- and 8-dimensional VAE features in the paper. Have you tried 32 dimensions or more? @LTH14

LTH14 · 2024-09-27T14:19:14Z

Thanks for your interest! Note that here KL-16 and KL-8 denote the downsampling stride of the tokenizer, not the feature dimension (KL-16 downsamples a 256x256x3 image into 16x16x16 tokens, and KL-8 downsamples it into 32x32x4 tokens).

We don't have an ablation on this feature dimension in the paper. A higher VAE dimension typically improves reconstruction performance. However, we also found that the higher the VAE feature dimension, the harder it is for the simple DiffLoss to model it, so it is a trade-off.
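As a rough illustration of the shapes quoted above, the token grid for each tokenizer can be worked out from the image size and stride. This is only a sketch of the arithmetic, not code from the MAR repository; the helper `latent_shape` and the `TOKENIZERS` table are hypothetical names introduced here, and the stride and channel values are the ones stated in the reply.

```python
def latent_shape(image_size: int, stride: int, channels: int) -> tuple[int, int, int]:
    """Return (height, width, channels) of the token grid for a square image.

    Hypothetical helper for illustration; not part of the MAR codebase.
    """
    if image_size % stride != 0:
        raise ValueError("image_size must be divisible by stride")
    side = image_size // stride
    return (side, side, channels)


# Configurations quoted in the reply: name -> (stride, channels per token)
TOKENIZERS = {"KL-16": (16, 16), "KL-8": (8, 4)}

for name, (stride, channels) in TOKENIZERS.items():
    h, w, c = latent_shape(256, stride, channels)
    print(f"{name}: {h}x{w}x{c} -> {h * w} tokens, {h * w * c} latent values")
```

For a 256x256 image, both configurations come out to the same total number of latent values (4096). They differ in how that total is split between the number of tokens and the dimension of each token.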
