DOC: Adds Iñigo's w10 and w11 blog posts #63

Merged 4 commits on Aug 14, 2024
65 changes: 65 additions & 0 deletions posts/2024/2024_08_02_Inigo_week_10.rst
@@ -0,0 +1,65 @@
Week 10 into GSoC 2024: Validating the conditional VAE results
==============================================================

.. post:: August 02 2024
:author: Iñigo Tellaetxe
:tags: google
:category: gsoc

What I did this week
~~~~~~~~~~~~~~~~~~~~

During this week I focused on validating the results of the conditional VAE (cVAE) that I implemented and experimented with last week.

I further trained the model and obtained a result with a lower label loss, indicating that, in theory, the model was capturing the attribute variability (streamline length, in this case) in the training data. In other words, our model was able to "understand" the length of the streamlines from their morphology alone.

As I mentioned last week, plotting the predicted length against the true length is a good way to validate that the model is indeed doing this correctly. In addition, I plotted the 2-dimensional t-SNE projection of the latent space of the model. Knowing that each bundle of the FiberCup dataset has a characteristic and rather constant streamline length, the bundles should appear as separate clusters. See the figure below:

.. image:: /_static/images/gsoc/2024/inigo/vae_conditioning_validation.png
:alt: (Left) Predicted length vs True length of streamlines in the training data of the cVAE; (Right) Latent space t-SNE projection of the plausible fibers in the training dataset of the cVAE.
:width: 600

We can see that the streamlines are indeed grouped into "worm"-shaped clusters. There are 7 bundles and 7 worms, and each worm has a roughly constant length, indicating that each bundle is represented by its own worm.
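
For illustration, below is a minimal sketch of how these two validation plots can be produced. The array names and file paths are assumptions for the example, not the project's actual code:

.. code-block:: python

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    # Hypothetical inputs: latent codes from the cVAE encoder, plus the true
    # and predicted streamline lengths.
    latent = np.load("latent_codes.npy")         # (n_streamlines, latent_dim)
    true_len = np.load("true_lengths.npy")       # (n_streamlines,)
    pred_len = np.load("predicted_lengths.npy")  # (n_streamlines,)

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Left: predicted vs. true length; points should fall close to the diagonal.
    ax1.scatter(true_len, pred_len, s=4)
    ax1.set_xlabel("True length")
    ax1.set_ylabel("Predicted length")

    # Right: 2-D t-SNE projection of the latent codes, colored by length.
    embedding = TSNE(n_components=2, random_state=0).fit_transform(latent)
    sc = ax2.scatter(embedding[:, 0], embedding[:, 1], c=true_len, cmap="viridis", s=4)
    fig.colorbar(sc, ax=ax2, label="Streamline length")

    plt.show()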

Now that we have validated that the model is capturing the attribute variability, we can move on to the next step: generating fibers of specific lengths and checking if they belong to the desired bundle. This is a more intuitive way to validate the model, as we can visually check if the model is generating fibers of the desired length and bundle.

To do this, I generated streamlines spanning the whole length range of the dataset and checked whether they belonged to the correct bundle, but the results were not great: the morphology of the generated streamlines was always the same, regardless of the indicated length. To check the results more closely, I generated 300 streamlines at the minimum and maximum lengths found in the training dataset, but the morphology problem persisted. In the figure below you can see that the shorter streamlines (length = 30 units, left) are morphologically very similar to the ones on the right (length = 300 units).

.. image:: /_static/images/gsoc/2024/inigo/streamlines_short_long.png
:alt: Bundles of generated streamlines with length 30 (left) and 300 (right).
:width: 600

In our weekly meeting we discussed this problem and we argued that this could be due to the architecture of the model, which is trying to predict a whole latent vector ``z`` from a single number ``r``, the attribute to be conditioned on. Find below a diagram of the model architecture for clarification:

.. image:: /_static/images/gsoc/2024/inigo/conditional_vae_architecture_diagram.png
:alt: Architecture of the implemented conditional VAE.
:width: 600

As you can see in the diagram, the ``D3`` and ``D4`` blocks of the model try to predict the attribute prior, represented by the ``r`` variable. The label loss is defined as the mean squared error (MSE) between this ``r`` and the true attribute of the data (streamline length in this case), so once ``D3`` and ``D4`` pick up the attribute, the model should be able to generate streamlines of the desired length, which is what actually happens.
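
A minimal sketch of this label loss, assuming a TensorFlow implementation (the exact project code may differ):

.. code-block:: python

    import tensorflow as tf

    def label_loss(true_length, predicted_r):
        """MSE between the attribute prior ``r`` predicted by D3/D4 and the true length."""
        # Both tensors are expected to have shape (batch_size, 1).
        return tf.reduce_mean(tf.square(true_length - predicted_r))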

However, the generator (yellow block) tries to generate two parameters that are compared to the ones in the green block. This means that when we try to generate a streamline of a specific length by running a specific ``r`` value through the generator and then through the decoder, the model is only able to generate a fixed morphology, regardless of the length. This is because the generator is trying to generate a whole latent vector ``z`` from a single number ``r``.
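
To make the generation path concrete, here is a small hypothetical sketch (the sub-models below are stand-ins with assumed shapes, not the project's actual API): a target length ``r`` is mapped by the generator to the latent parameters, a latent vector ``z`` is sampled from them, and the decoder turns ``z`` into streamline coordinates.

.. code-block:: python

    import numpy as np
    import tensorflow as tf

    latent_dim = 32  # illustrative size

    # Stand-in sub-models with the same roles as in the diagram (the real ones
    # come from the trained cVAE): a generator mapping r -> (z_mean, z_log_var)
    # and a decoder mapping z -> flattened streamline coordinates.
    r_in = tf.keras.Input(shape=(1,))
    h = tf.keras.layers.Dense(64, activation="relu")(r_in)
    generator = tf.keras.Model(
        r_in, [tf.keras.layers.Dense(latent_dim)(h), tf.keras.layers.Dense(latent_dim)(h)]
    )

    z_in = tf.keras.Input(shape=(latent_dim,))
    h_dec = tf.keras.layers.Dense(64, activation="relu")(z_in)
    decoder = tf.keras.Model(z_in, tf.keras.layers.Dense(256 * 3)(h_dec))  # e.g. 256 points x 3 coords

    # Generate streamlines at target lengths spanning the training range.
    target_lengths = np.linspace(30.0, 300.0, num=10).reshape(-1, 1).astype("float32")
    z_mean, z_log_var = generator.predict(target_lengths)   # r -> latent parameters
    z = z_mean + np.exp(0.5 * z_log_var) * np.random.normal(size=z_mean.shape)
    generated_streamlines = decoder.predict(z)               # z -> streamline points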

After a thorough discussion, we decided to try a non-variational adversarial framework to solve this problem due to the following reasons:

- The adversarial nature of the architecture implicitly imposes a prior on the latent space, so regularization with variational methods is not necessary, which makes the architecture and the loss computation of the model simpler.

- It is easier to understand. The original authors of the implemented conditional VAE did not provide a clear derivation of the loss function, so my understanding of its underlying mechanisms is not as deep as it would need to be to tune its behavior effectively. All in all, the adversarial framework is far more intuitive (at least for me).

- It is widespread and there are many resources available to understand and implement it. What is more, I quickly found several implementations of adversarial AutoEncoders in TensorFlow with a basic Google search. I need to read through them and decide which one suits my needs best.

- There are established ways to condition the network on both categorical and continuous variables, which would be really convenient for conditioning on both the bundle and the attribute of the data; this is currently not possible with the cVAE implementation, as it only conditions on the attribute. This would provide greater control when sampling from the latent space.

What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Having said this, this week I will focus on learning about adversarial AutoEncoders and implementing one. I will also investigate how to introduce conditioning into the adversarial framework, and how to condition on both categorical and continuous variables.

For now, I have found that the continuous attribute can be conditioned on as in `this work <https://doi.org/10.1007/s00521-020-05270-2>`_ that I mentioned in my GSoC application, and the categorical one as in the `original adversarial AutoEncoders paper <http://arxiv.org/abs/1511.05644>`_.


Did I get stuck anywhere
~~~~~~~~~~~~~~~~~~~~~~~~

As I said in a past blog post, research work may not be as straightforward as one would like, but I also would not say that I got stuck. I am just facing some challenges, and this is a journey to complete slowly but surely, because science does not need "fast food" solutions, but "slow-cooked", well-thought-out, and well-tested ones.

Thank you for reading, until next week!
44 changes: 44 additions & 0 deletions posts/2024/2024_08_09_Inigo_week_11.rst
@@ -0,0 +1,44 @@
Week 11 into GSoC 2024: The Adversarial AutoEncoder
===================================================

.. post:: August 09 2024
:author: Iñigo Tellaetxe
:tags: google
:category: gsoc

What I did this week
~~~~~~~~~~~~~~~~~~~~

This week was all about learning about adversarial networks, `attribute-based latent space regularization in AutoEncoders <https://doi.org/10.1007/s00521-020-05270-2>`_, and fighting with Keras and TensorFlow to implement the adversarial framework. It was a bit (or two) challenging, but I managed to do it, thanks to a `very nice and clean implementation <https://github.com/elsanns/adversarial-autoencoder-tf2/tree/master>`_ I found, based on the original `adversarial AutoEncoders paper <http://arxiv.org/abs/1511.05644>`_.

I have not implemented the attribute-based regularization (AR) yet; I will do so once I train the adversarial AutoEncoder (AAE from now on), visualize its latent space, and check that I can generate samples from specific bundles. Hopefully, all this will go smoothly. For now, I succeeded in instantiating the model without any errors, and next week I will train it.

Anyway, in the figure below you can see the architecture I proposed for the AAE, which should allow conditioning the data generation process on both the bundle and the attribute (streamline length for now, age in the future):

.. image:: /_static/images/gsoc/2024/inigo/adversarial_ae_with_abr.png
    :alt: Diagram of the architecture proposed to allow conditioning on categorical and continuous variables.
:width: 600

Let's break down how the AAE works. For those not familiar with generative adversarial networks (GANs), the idea is to have two networks, a generator and a discriminator, that play a game. The generator tries to produce samples that look like the real data (e.g., pictures of animals), while the discriminator tries to distinguish between real and generated samples. The generator is trained to fool the discriminator, and the discriminator is trained not to be fooled. This way, the generator learns to generate samples that look like the real data. The adversarial loss (:math:`\mathcal{L}_{adv}`) is computed as shown in the lowest rectangle.
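
As a concrete example (an assumed formulation, not the project's exact code), the two sides of this game can be written in TensorFlow as follows, taking a discriminator that outputs the probability of a sample being "real":

.. code-block:: python

    import tensorflow as tf

    bce = tf.keras.losses.BinaryCrossentropy()

    def discriminator_loss(d_real, d_fake):
        # Samples drawn from the prior should be classified as real (1),
        # samples produced by the generator as fake (0).
        return bce(tf.ones_like(d_real), d_real) + bce(tf.zeros_like(d_fake), d_fake)

    def generator_loss(d_fake):
        # The generator is rewarded when the discriminator labels its samples as real.
        return bce(tf.ones_like(d_fake), d_fake)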

In our case, the generator is the encoder :math:`\mathcal{G}`, which produces a latent representation of the input data, and the discriminator :math:`\mathcal{D}` tries to distinguish it from "real" latent representations sampled from a given prior distribution. The trick to introduce the class information (the kind of animal in the picture, or the bundle in our case) is to concatenate the latent representation with the one-hot encoded class vector and the attribute vector. This way, the decoder can generate samples conditioned on a categorical variable. The reconstruction loss (:math:`\mathcal{L}_{MSE}`) is computed as shown in the middle rectangle, and it ensures that the samples reconstructed from the latent representation are as close as possible to the original data.
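
A minimal sketch of this conditioning trick with the Keras functional API (the layer sizes and names are assumptions, not the actual architecture):

.. code-block:: python

    import tensorflow as tf

    latent_dim, n_bundles = 32, 7  # illustrative sizes

    z = tf.keras.Input(shape=(latent_dim,), name="latent_code")
    bundle = tf.keras.Input(shape=(n_bundles,), name="one_hot_bundle")
    attribute = tf.keras.Input(shape=(1,), name="attribute")

    # The decoder sees the latent code together with the conditioning variables.
    decoder_in = tf.keras.layers.Concatenate()([z, bundle, attribute])
    h = tf.keras.layers.Dense(256, activation="relu")(decoder_in)
    x_hat = tf.keras.layers.Dense(256 * 3)(h)  # e.g. 256 points x 3 coordinates

    decoder = tf.keras.Model([z, bundle, attribute], x_hat, name="conditional_decoder")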

Reviewer comment: *"The trick to introduce the information of the kind of animal to which the photo belongs (...) the one-hot encoded bundle (...)"*: talking about photos, then bundles, previously about pictures of animals. Can we be consistent when choosing the example domain?

Author reply: Oops, true. Will stick to animals to keep it simple.


As for the AR, we try to tie a continuous attribute of choice found in the data space (fur length, age, size, etc.) to a specific dimension of the latent space. To do this, we compute an attribute-distance matrix in the data space (:math:`D_a`) and a distance matrix from the chosen dimension of the latent space (:math:`D_r`). By minimizing the mean absolute error (MAE) between the two matrices, we force the latent space to be organized in such a way that the chosen dimension is related to the chosen attribute. This way, we can generate samples conditioned on the attribute of choice. The AR loss (:math:`\mathcal{L}_{AR}`) is computed as shown in the top rectangle.
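
A minimal sketch of this AR loss (an assumption about the implementation; the referenced work additionally passes :math:`D_r` through a ``tanh`` and :math:`D_a` through a ``sign`` before comparing them):

.. code-block:: python

    import tensorflow as tf

    def ar_loss(attribute, latent_dim_values):
        # attribute: (batch, 1) continuous attribute in data space (e.g. length).
        # latent_dim_values: (batch, 1) values of the chosen latent dimension.
        d_a = attribute - tf.transpose(attribute)                   # data-space distance matrix
        d_r = latent_dim_values - tf.transpose(latent_dim_values)   # latent-space distance matrix
        # MAE between the two matrices ties the latent dimension to the attribute.
        return tf.reduce_mean(tf.abs(d_r - d_a))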

Reviewer comment: Related to the above: *"(fur length, age, size, etc.)"*

Author reply: Got it, thanks :)


Lastly, I also started writing my final post for GSoC 2024, which will be a summary of the project, the results, and the future work. I will also open a draft PR to continue my work outside of the coding period, because I want to keep working on this project, as it is a very interesting topic in line with my PhD research.

What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Next week I will:

- Train the AAE. I will probably need to do this in a computing cluster, as my mighty laptop is not powerful enough to train the model in a reasonable time.
- Continue writing the final GSoC 2024 post.
- Open the draft PR so I can include it in the final post and have a tangible place to publish my work.

Did I get stuck anywhere
~~~~~~~~~~~~~~~~~~~~~~~~

This week I fought a lot with Keras and TensorFlow, but since I had gained experience from previous "fights", I managed not to get really stuck, so I am happy to say that I won this time too!

Until next week!