Vqgan training #52

isamu-isozaki · 2023-04-21T01:10:44Z

This is a draft PR that adds VQGAN training. It's still quite rough around the edges, but it might work reasonably well after some bug fixes.


isamu-isozaki · 2023-04-30T22:31:09Z

I tested it on random noise and it runs. Next I'll try adapting it to webdataset on some clusters and see how it does!

isamu-isozaki · 2023-05-05T19:08:39Z

Thanks to LAION (Ryu), I found https://arxiv.org/abs/2212.03185, which improves on MoVQ.
The main ideas are:

- Add a perceptual loss from lower layers (which we are already doing).
- Use entropy maximization so that codebook usage is 100%.
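To make the entropy idea concrete, here is a minimal PyTorch sketch of one common formulation of a codebook entropy loss. It is not necessarily the exact loss from the linked paper: the function name `codebook_entropy_loss`, the `temperature` parameter, and the distance-based soft assignment are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F


def codebook_entropy_loss(z, codebook, temperature=1.0):
    """Hypothetical helper. z: (N, D) encoder outputs; codebook: (K, D) codes."""
    # Squared Euclidean distance from every latent to every code: (N, K).
    dist = torch.cdist(z, codebook) ** 2
    # Soft assignment: closer codes get higher probability.
    log_probs = F.log_softmax(-dist / temperature, dim=-1)
    probs = log_probs.exp()
    # Mean per-sample entropy: low when each latent commits to a single code.
    sample_entropy = -(probs * log_probs).sum(dim=-1).mean()
    # Entropy of the batch-averaged assignment: high when all codes get used.
    avg_probs = probs.mean(dim=0)
    batch_entropy = -(avg_probs * torch.log(avg_probs + 1e-8)).sum()
    # Minimizing this favors confident assignments and uniform codebook usage.
    return sample_entropy - batch_entropy


torch.manual_seed(0)
z = torch.randn(256, 8, requires_grad=True)
codebook = torch.randn(16, 8, requires_grad=True)
loss = codebook_entropy_loss(z, codebook)
loss.backward()  # gradients reach both the latents and the codebook
```

In a training loop, a term like this would typically be added to the reconstruction and perceptual losses with a small weight. The weight is a tuning choice that this thread does not specify.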

isamu-isozaki · 2023-05-08T11:58:59Z

I'm starting to add the projected GAN technique from here. It still seems to be state of the art on quite a few datasets, even though it is from 2021. The main idea is that, instead of feeding images directly into the generator/discriminator, you feed in hierarchical features computed with timm, which makes training converge faster.
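As a rough illustration of the feature-projection idea, the sketch below runs a discriminator on multi-scale features from a frozen network instead of on raw pixels. `StandInBackbone` and `ProjectedDiscriminator` are hypothetical names, and the tiny backbone is an assumption that only stands in for a pretrained timm model (for example one built with `timm.create_model(..., pretrained=True, features_only=True)`). The random cross-channel and cross-scale mixing used in the projected GAN paper is omitted.

```python
import torch
import torch.nn as nn


class StandInBackbone(nn.Module):
    """Hypothetical stand-in for a pretrained feature network (e.g. from timm)."""

    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        return [f1, f2]  # hierarchical features, fine to coarse


class ProjectedDiscriminator(nn.Module):
    """One small trainable head per feature scale on top of a frozen backbone."""

    def __init__(self, backbone, channels):
        super().__init__()
        self.backbone = backbone.eval()
        for p in self.backbone.parameters():
            p.requires_grad_(False)  # the feature network is never updated
        self.heads = nn.ModuleList([nn.Conv2d(c, 1, kernel_size=1) for c in channels])

    def forward(self, images):
        # No torch.no_grad() here: gradients must still flow back to the images
        # so that the generator can be trained through the frozen features.
        feats = self.backbone(images)
        return [head(f) for head, f in zip(self.heads, feats)]


torch.manual_seed(0)
disc = ProjectedDiscriminator(StandInBackbone(), channels=[16, 32])
fake = torch.randn(2, 3, 64, 64, requires_grad=True)  # stands in for decoder output
logits = disc(fake)
g_loss = sum(-l.mean() for l in logits)  # hinge-style generator loss over scales
g_loss.backward()
```

Only the per-scale heads carry trainable parameters, so the discriminator optimizer would be built from `disc.heads.parameters()`.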

isamu-isozaki · 2023-05-08T12:00:02Z

In other news, I was finally able to add the ImageNet training dataset to the cluster, so I will soon be testing the f16 pre-trained model with MoVQ and spectral norm added.

isamu-isozaki · 2023-10-10T14:35:24Z

I'll add "Finite Scalar Quantization: VQ-VAE Made Simple", since that seems very interesting. It appears to lead to "Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation", which seems to achieve a better FID than diffusion models.
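For context, a minimal sketch of the finite scalar quantization idea: each latent channel is bounded, rounded to a small fixed number of levels, and trained with a straight-through gradient, so no learned codebook is needed. The function name `fsq_quantize` is hypothetical, and this version assumes every level count is odd, which avoids the half-step offset that even counts would require.

```python
import torch


def fsq_quantize(z, levels):
    """Hypothetical helper. z: (..., d) with d == len(levels); levels are odd ints."""
    levels_t = torch.tensor(levels, dtype=z.dtype, device=z.device)
    half = (levels_t - 1) / 2
    # Bound each channel to [-half, half] so rounding yields exactly L values.
    bounded = torch.tanh(z) * half
    rounded = torch.round(bounded)
    # Straight-through estimator: forward uses rounded, backward uses bounded.
    quantized = bounded + (rounded - bounded).detach()
    # Shift digits to 0..L-1 and combine them into one token id (mixed radix).
    digits = (rounded + half).long()
    basis = torch.cumprod(torch.tensor([1] + list(levels[:-1]), device=z.device), dim=0)
    indices = (digits * basis).sum(dim=-1)
    # Rescale to [-1, 1] before handing the latents to the decoder.
    return quantized / half, indices


torch.manual_seed(0)
levels = [7, 5, 5]  # implicit codebook of 7 * 5 * 5 = 175 entries
z = torch.randn(4, 10, 3, requires_grad=True)
q, idx = fsq_quantize(z, levels)
q.sum().backward()  # gradients pass through the rounding step
```

The token ids in `idx` are what a language-model-style generator would be trained on, while `q` is what the decoder would consume.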

isamu-isozaki added 10 commits April 20, 2023 21:00

Added basic idea

90aaf6d

Adding basic idea

59efb88

First idea for training loop

cc2bc90

Added preliminary generation

d6067a4

Merge branch 'main' of https://github.com/huggingface/muse into vqgan…

bbfd757

…_training

Added ema

508a00e

Making config and removed wandb

f6dd2a1

Removed folder

08b2f12

Fixing configs

fa4dc0d

Finished basic vqgan testing

cfd0c19

isamu-isozaki added 4 commits May 7, 2023 21:28

Removed folders

4615048

Removed config

026305a

Adding discriminator warmup

290f287

Starting adding projected gan tech

9997d82

isamu-isozaki added 12 commits May 8, 2023 18:29

Updated config

eb86603

Update docs

02e5c74

Adding slurm file

a597b1a

Updated config

105aac7

Moving tqdm to batch

6872bba

Tried updating webdb

38346ed

Fixed config

4f296fc

Fixing fmap

75f8631

Fixed zero grad

e29acf3

Fixed discriminator training

7994f4e

Increased batch size

c3e6139

Changed mixed precision

277edf1

isamu-isozaki added 16 commits May 15, 2023 17:41

Fixed oom

a7035ab

Fixed debugging

2717aa9

sanity check

d6cb053

sanity check

74eb612

Fixed training

e800cae

sanity check

ca9535e

Fixing logs

8e21e99

Fixing logs

3588307

Fixing logs

d88c402

Fixing logs

61d0d69

Added spectral norm f16 config

c71255a

Fixing logs

851f583

saving discriminator too

1bb0a65

Fixing logs

2f81055

Made training distributed

9d9b227

Properly running vqgan training

c23ae1c
