Training test case question #123
Hi @orpatashnik - Yes, your work is great! Here is a little background: in the past I have experimented with and created new GAN models (StyleGAN, StyleGAN2-ADA, StyleGAN3) for various use cases, such as medical imaging. Recently I became interested in CLIP and in how to train models with a "keyword", or, as in diffusion, "captions". So, simply put, I gathered a training set from FFHQ, labeled the images "male" and "female", and want to use your training script to train a StyleCLIP model. Thanks for reading this.
Hi @jimb2834 , StyleCLIP is a method that employs CLIP and StyleGAN for editing; we did not fine-tune StyleGAN or change its architecture. If you are only interested in controlling gender, a possible solution is to use one of StyleCLIP's methods (the latent mapper or global directions). You can then sample a random latent code and shift it toward the target gender, either with a global direction or a trained mapper. Neither method requires labeled data, since CLIP provides the guidance. If you are interested in training your own GAN, you might take a look here: https://github.com/JiauZhang/GigaGAN. This is a new GAN architecture that can be conditioned on text.
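For reference, the global-directions edit described above boils down to a single vector shift in StyleGAN's latent space. A minimal sketch with numpy, using random stand-ins for real latents (in practice `direction` would come from StyleCLIP's CLIP-guided global-directions computation, and the shapes shown are the usual W+ layout for a 1024px StyleGAN2, both assumptions here):

```python
import numpy as np

def apply_global_direction(w, direction, alpha):
    """Shift a StyleGAN latent code along a CLIP-derived edit direction.

    w:         latent code, shape (num_layers, 512) in W+ space
    direction: edit direction with the same shape (normalized below)
    alpha:     scalar edit strength; flipping its sign reverses the edit
               (e.g. toward "female" instead of "male")
    """
    direction = direction / np.linalg.norm(direction)
    return w + alpha * direction

# Toy usage with random stand-ins for a real latent and direction:
rng = np.random.default_rng(0)
w = rng.standard_normal((18, 512))
direction = rng.standard_normal((18, 512))
w_edited = apply_global_direction(w, direction, alpha=3.0)
```

The edited `w_edited` would then be fed back through the pretrained StyleGAN generator to render the modified image; no retraining or labels are involved.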
Hi @orpatashnik - Thank you again for the replies. I see, but what baffles me is this; perhaps you could advise where my logic is flawed.
In my eventual use case we cannot use someone else's terminology (i.e. "text") within our future model, so we need to train a new one matching our images with our own "text". That would then give us the future ability to use the text embedding to isolate a feature. My original "male/female" example may be confusing, since it seems trivial and already solved, but my goal is the contrary: I need to train specific images with NEW terminology ("text"). I cannot seem to find a working example where someone has trained a GAN model with keywords/text, other than in diffusion models using BLIP/WD14 etc. Does this make sense?
Two GANs I am aware of that use text as input are:
Both of them train a GAN on a dataset consisting of (text, image) pairs. Neither has an official implementation, but you can find some unofficial ones.
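To make the (text, image)-pair idea concrete: a text-conditional GAN commonly conditions the generator by concatenating a text embedding with the noise vector, and gives the discriminator the same embedding so it can penalize text/image mismatches. A toy numpy sketch of that conditioning step only, not any specific paper's architecture; `embed_text` is a hypothetical stand-in for a real text encoder such as CLIP's:

```python
import zlib
import numpy as np

def embed_text(caption, dim=64):
    # Stand-in for a real text encoder (e.g. CLIP's): maps a caption to a
    # deterministic pseudo-embedding, seeded from a CRC of the caption.
    # For illustration only; it carries no semantic meaning.
    rng = np.random.default_rng(zlib.crc32(caption.encode("utf-8")))
    return rng.standard_normal(dim)

def generator_input(z, caption):
    # Condition the generator by concatenating the caption embedding
    # with the noise vector z before the first layer.
    return np.concatenate([z, embed_text(caption)])

# Toy usage: a 128-dim noise vector plus a 64-dim text embedding.
z = np.random.default_rng(0).standard_normal(128)
x = generator_input(z, "male")
```

With this setup the "new terminology" lives entirely in the text encoder's embedding space, which is why such models need (text, image) pairs at training time rather than bare labels.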
Hello @orpatashnik - Great work!
I wanted to perform a test and create a new StyleGAN2 model with CLIP, simply adding "captions" like "male" and "female" to the images, to learn how it works.
Example:
What quantity of images do you suggest?
and
If I used only 10,000, is that enough to be able to navigate the latent space, or at least get some idea of how it works?
Thanks!