DOC: Adds Iñigo's w10 and w11 blog posts #63
Conversation
Also adds necessary images
Previously 100 dpi, now 600 dpi
🪓 PR closed, deleted preview at https://github.com/dipy/preview-html/tree/main/dipy.org/pull/63/
Hi @itellaetxe!
Good work and nice clear articles. LGTM
Thanks for this @itellaetxe.
A couple of inline comments.
Let's break down how the AAE works. For those not familiar with generative adversarial networks (GANs), the idea is to have two networks, a generator and a discriminator, that play a game. The generator tries to generate samples that look like the real data (e.g., pictures of animals), while the discriminator tries to distinguish between real and generated samples. The generator is trained to fool the discriminator, and the discriminator is trained not to be fooled. This way, the generator learns to generate samples that look like the real data. The adversarial loss (:math:`\mathcal{L}_{adv}`) is computed as shown in the lowest rectangle.
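To make the adversarial game concrete, here is a minimal NumPy sketch of the two binary cross-entropy terms involved (illustrative only, not the actual training code from the posts; the probabilities ``d_real`` and ``d_fake`` are made-up discriminator outputs):

.. code-block:: python

   import numpy as np

   def bce(p, target):
       # Binary cross-entropy between predicted probabilities p and a constant label
       eps = 1e-7
       p = np.clip(p, eps, 1.0 - eps)
       return -np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))

   d_real = np.array([0.9, 0.8, 0.95])  # discriminator outputs on real samples
   d_fake = np.array([0.2, 0.4, 0.1])   # discriminator outputs on generated samples

   # Discriminator loss: label real samples 1 and generated samples 0
   loss_disc = bce(d_real, 1.0) + bce(d_fake, 0.0)

   # Generator adversarial loss: try to get generated samples labeled as real (1)
   loss_adv = bce(d_fake, 1.0)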
In our case, the generator is the encoder :math:`\mathcal{G}`, which generates a latent representation of the input data, which the discriminator :math:`\mathcal{D}` tries to distinguish from "real" latent representations sampled from a given prior distribution. The trick to introduce the information of the kind of animal to which the photo belongs is to concatenate the latent representation with the one-hot encoded bundle and attribute vectors. This way, the decoder can generate samples conditioned on a categorical variable. The reconstruction loss (:math:`\mathcal{L}_{MSE}`) is computed as shown in the middle rectangle, and it ensures that the samples reconstructed from the latent representation are as close as possible to the original data.
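As a rough sketch of this conditioning trick (again illustrative NumPy, with made-up shapes and three hypothetical classes), the decoder input is simply the latent code with the one-hot and attribute vectors appended:

.. code-block:: python

   import numpy as np

   rng = np.random.default_rng(0)

   z = rng.normal(size=(4, 8))        # latent codes from the encoder (batch of 4)
   labels = np.array([0, 2, 1, 2])    # categorical class of each sample
   one_hot = np.eye(3)[labels]        # one-hot encoded class vectors, shape (4, 3)
   attr = rng.uniform(size=(4, 1))    # continuous attribute of each sample

   # Concatenate along the feature axis to condition the decoder
   decoder_input = np.concatenate([z, one_hot, attr], axis=1)  # shape (4, 12)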
The trick to introduce the information of the kind of animal to which the photo belongs (...) the one-hot encoded bundle (...) : talking about photos, then bundles, previously about pictures of animals. Can we be consistent when choosing the example domain?
Oops, true. Will stick to animals to keep it simple.
As for the attribute regularization (AR), we try to tie a continuous attribute of choice found in the data space (fur length, age, size, etc.) to a specific dimension of the latent space. To do this, we compute an attribute-distance matrix :math:`D_a` in the data space, and a distance matrix :math:`D_r` from the chosen dimension of the latent space. By minimizing the mean absolute error (MAE) between the two matrices, we force the latent space to be organized in such a way that the chosen dimension is related to the chosen attribute. This way, we can generate samples conditioned on the attribute of choice. The AR loss (:math:`\mathcal{L}_{AR}`) is computed as shown in the top rectangle.
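A simplified NumPy sketch of the AR term follows (illustrative only; formulations in the literature may add extra scaling or a tanh, so treat this as the bare idea, with made-up sample values):

.. code-block:: python

   import numpy as np

   def pairwise_dist(v):
       # Pairwise absolute differences of a 1-D vector -> (n, n) distance matrix
       v = np.asarray(v, dtype=float)
       return np.abs(v[:, None] - v[None, :])

   attribute = np.array([1.0, 3.5, 2.0, 0.5])   # e.g. fur length of each sample
   latent_dim = np.array([0.2, 0.9, 0.4, 0.1])  # values of the regularized latent dimension

   D_a = pairwise_dist(attribute)   # distance matrix in attribute (data) space
   D_r = pairwise_dist(latent_dim)  # distance matrix along the chosen latent dimension

   # AR loss: mean absolute error between the two distance matrices
   loss_AR = np.mean(np.abs(D_a - D_r))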
Related to the above: (fur length, age, size, etc.)
Got it, thanks :)
Thank you for addressing the comments. All good, merging.
DOC: Adds Iñigo's w10 and w11 blog posts 95ccd31
Adds Iñigo's week 10 and week 11 blog posts.
Also adds necessary images to render properly.
I tried the inline LaTeX capabilities of RST in the w11 blog post; it should render fine (I believe).