
What training setup did you use? #12

Open
rom1504 opened this issue Dec 17, 2021 · 1 comment
rom1504 commented Dec 17, 2021

This looks great!

Could you share some information on the setup you used to train the transformer model?

  • how many GPUs / for how long
  • how many steps
  • what batch size

It would be helpful to have this information to better understand the cost of training DALL-E models.
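For context on why those numbers matter, here is a minimal sketch of how they would combine into a cost estimate. All values below are hypothetical placeholders, not figures from this repo:

```python
# Hypothetical back-of-envelope cost estimate for training a DALL-E-style
# transformer. Every number here is a placeholder, not a figure from this repo.
num_gpus = 8              # hypothetical GPU count
wall_clock_hours = 240    # hypothetical training duration per GPU
steps = 500_000           # hypothetical optimizer steps
global_batch_size = 512   # hypothetical samples per step (across all GPUs)

samples_seen = steps * global_batch_size   # total text-image pairs processed
gpu_hours = num_gpus * wall_clock_hours    # rough compute-cost proxy

print(f"samples seen: {samples_seen:,}")   # 256,000,000
print(f"GPU-hours:    {gpu_hours:,}")      # 1,920
```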


afiaka87 commented Dec 17, 2021

There is some mild commentary on this in #6:

Hello @SeungyounShin, thanks for your test of zero-shot image-to-image translation.

As you mention, an autoregressive text-to-image generation model can perform unseen tasks in a zero-shot manner, even though the training dataset does not include exactly the same types of text-image pairs! However, zero-shot capability improves as the model size and dataset size increase together. Please note that the released minDALL-E is still a smaller-scale model (1.3B params, 14M text-image pairs) than OpenAI's original implementation (12B params, 250M text-image pairs).

This will improve when a larger-scale model is trained on a larger number of training samples, and we will also release larger-scale models.
