
Why put a VAE to encode and decode outside the MAR module? #58

DeepDuke opened this issue Oct 11, 2024 · 6 comments

Comments

@DeepDuke

DeepDuke commented Oct 11, 2024

Sorry, I'm a newbie to this field. I was confused: since MAR already has an encoder-decoder module, why use a VAE encoder to encode the input images into latents, and then, after the MAR module, use the VAE decoder to decode MAR's output? Thanks for your kind explanation.

@LTH14
Owner

LTH14 commented Oct 11, 2024

Using a VAE (or, in general, an AE) to encode the image is standard practice for current image generative models. See these two papers: https://arxiv.org/abs/2012.09841 and https://arxiv.org/pdf/2112.10752
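A minimal sketch of that pipeline, assuming hypothetical `vae` and `generator` objects (the names, the `encode`/`decode`/`sample`/`loss` methods, and the shapes are illustrative, not this repo's actual API):

```python
import torch

# The VAE compresses each image into a small latent grid; the
# generative model (MAR here) is trained and sampled entirely in
# that latent space; the VAE decoder maps latents back to pixels.

def training_step(vae, generator, images: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        latents = vae.encode(images)    # frozen VAE: (N, 3, 256, 256) -> (N, c, 16, 16)
    return generator.loss(latents)      # only the generator is trained

def sample(vae, generator, num_images: int) -> torch.Tensor:
    latents = generator.sample(num_images)  # generate in latent space
    return vae.decode(latents)              # decode back to pixel space
```

Modeling the compressed latent grid instead of raw pixels is far cheaper computationally, which is the main motivation in the two papers above.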

@DeepDuke
Author

DeepDuke commented Oct 14, 2024

Hi, I have a new question: why are self.encoder_pos_embed_learned and self.decoder_pos_embed_learned set as learnable parameters? They seem to have no direct association with patch position values. @LTH14

@LTH14
Owner

LTH14 commented Oct 14, 2024

Using learnable parameters is one way to do positional embedding. We need these two parameters to indicate the location of each patch.
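Concretely, a learnable positional embedding is just an nn.Parameter added to the token sequence. A minimal sketch, with illustrative names and shapes rather than the repo's exact code:

```python
import torch
import torch.nn as nn

class WithLearnedPosEmbed(nn.Module):
    def __init__(self, seq_len: int, buffer_size: int, dim: int):
        super().__init__()
        # One trainable vector per position, optimized end to end
        # with the rest of the model.
        self.encoder_pos_embed_learned = nn.Parameter(
            torch.zeros(1, seq_len + buffer_size, dim))
        nn.init.trunc_normal_(self.encoder_pos_embed_learned, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len + buffer_size, dim). The addition
        # broadcasts over the batch; position i always receives
        # its own row of the parameter.
        return x + self.encoder_pos_embed_learned
```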

@DeepDuke
Author

DeepDuke commented Oct 14, 2024

@LTH14 Yeah, I know you are using them to do positional embedding. But the positional embedding seems to always be the same for every patch in a single image, because there is no patch-position-related operation on self.encoder_pos_embed_learned and self.decoder_pos_embed_learned.

@LTH14
Owner

LTH14 commented Oct 14, 2024

There are seq_len+buffer_size different position embeddings in them.
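That is, the parameter has one row per position, and adding it to the token sequence broadcasts over the batch, so each patch position receives a different learned vector even though no explicit indexing appears in the code. A quick self-contained check (the sizes are illustrative):

```python
import torch

seq_len, buffer_size, dim = 256, 64, 768        # illustrative sizes
pos_embed = torch.randn(1, seq_len + buffer_size, dim)

x = torch.zeros(2, seq_len + buffer_size, dim)  # batch of 2 token sequences
out = x + pos_embed                             # broadcasts over the batch

# Different positions in the same image get different embeddings...
assert not torch.allclose(out[0, 0], out[0, 1])
# ...while the same position in different images gets the same one.
assert torch.allclose(out[0, 5], out[1, 5])
```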

@DeepDuke
Author

DeepDuke commented Oct 14, 2024

> There are seq_len+buffer_size different position embeddings in them.

Oh, I see. Thanks!
