
Commit

FIX: Applied suggested changes for domain consistency
itellaetxe committed Aug 14, 2024
1 parent 3c3bc32 commit dd1d891
Showing 2 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion posts/2024/2024_08_02_Inigo_week_10.rst
@@ -23,7 +23,7 @@ We can see that yes, the streamlines are grouped into "worm" clusters. Again, th

Now that we have validated that the model is capturing the attribute variability, we can move on to the next step: generating fibers of specific lengths and checking if they belong to the desired bundle. This is a more intuitive way to validate the model, as we can visually check if the model is generating fibers of the desired length and bundle.

To do this, I generated streamlines spanning the whole length range of the dataset and checked if they belonged to the correct bundle, but the results were not great, because the morphology of the generated streamlines was constantly the same, regardless of the indicated length. To check the results better, I generated 300 streamlines of the minimum and maximum lengths found in the training dataset, but the morphology problem persisted. In the figure below you can see that the shorter streamlines (length = 30 units) are morphologically very similar to the ones on the right (length = 300 units).
To do this, I generated streamlines spanning the whole length range of the dataset and checked if they belonged to the correct bundle, but the results were not great, because the morphology of the generated streamlines was the same every time, regardless of the indicated length. To check the results better, I generated 300 streamlines of the minimum and maximum lengths found in the training dataset, but the morphology problem persisted. In the figure below you can see that the shorter streamlines (length = 30 units) are morphologically very similar to the ones on the right (length = 300 units).

.. image:: /_static/images/gsoc/2024/inigo/streamlines_short_long.png
:alt: Bundles of generated streamlines with length 30 (left) and 300 (right).
12 changes: 6 additions & 6 deletions posts/2024/2024_08_09_Inigo_week_11.rst
@@ -11,21 +11,21 @@ What I did this week

This week was all about learning about adversarial networks, `attribute-based latent space regularization in AutoEncoders <https://doi.org/10.1007/s00521-020-05270-2>`_, and fighting with Keras and TensorFlow to implement the adversarial framework. It was a bit (or two) challenging, but I managed to do it, thanks to a `very nice and clean implementation <https://github.com/elsanns/adversarial-autoencoder-tf2/tree/master>`_ I found, based on the original `adversarial AutoEncoders paper <http://arxiv.org/abs/1511.05644>`_.

I still have not implemented the attribute-based regularization (AR), but once I train the adversarial AutoEncoder (AAE from now on), visualize its latent space, and check that I can generate samples from specific bundles, I will implement it. Hopefully, all this will go smoothly. For now, I succeeded in instantiating the model without any errors, and next week I will train.
I still have not implemented the attribute-based regularization (AR), but once I train the adversarial AutoEncoder (AAE from now on), visualize its latent space, and check that I can generate samples from a specific class (bundles in our case), I will implement it. Hopefully, all this will go smoothly. For now, I succeeded in instantiating the model without any errors, and next week I will train it.
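
To give an idea of what instantiating the framework involves, here is a minimal Keras sketch of the two adversarial sub-networks. The layer sizes, names, and streamline shape are made up for illustration and are not taken from the actual implementation:

.. code-block:: python

    import tensorflow as tf
    from tensorflow.keras import layers

    latent_dim, n_points = 32, 256  # hypothetical sizes

    # Encoder (the adversarial "generator"): streamline -> latent code.
    encoder = tf.keras.Sequential(
        [
            layers.Input(shape=(n_points, 3)),  # 3D points per streamline
            layers.Flatten(),
            layers.Dense(128, activation="relu"),
            layers.Dense(latent_dim),
        ],
        name="encoder",
    )

    # Discriminator: latent code -> real/fake logit, i.e. "does this
    # code look like a sample drawn from the prior distribution?"
    discriminator = tf.keras.Sequential(
        [
            layers.Input(shape=(latent_dim,)),
            layers.Dense(64, activation="relu"),
            layers.Dense(1),
        ],
        name="discriminator",
    )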

Anyway, in the figure below you can see the architecture I proposed for the AAE, which should allow conditioning the data generation process on the bundle and the attribute (streamline length for now, age in the future):
Anyway, in the figure below you can see the architecture I proposed for the AAE, which should allow conditioning the data generation process on a categorical variable and a continuous attribute:

.. image:: /_static/images/gsoc/2024/inigo/adversarial_ae_with_abr.png
   :alt: Diagram of the architecture proposed to allow conditioning on categorical and continuous variables.
:width: 600

Let's break down how the AAE works. For those not familiar with how generative adversarial networks (GANs) work, the idea is to have two networks, a generator and a discriminator, that play a game. The generator tries to generate samples that look like the real data (e.g.: pictures of animals), while the discriminator tries to distinguish between real and generated samples. The generator is trained to fool the discriminator, and the discriminator is trained not to be fooled. This way, the generator learns to generate samples that look like the real data. The adversarial loss (:math:`\mathcal{L}_{adv}`) is computed as shown in the lowest rectangle.
Let's break down how the AAE works. For those not familiar with how generative adversarial networks (GANs) work, the idea is to have two networks, a generator and a discriminator, that play a game. The generator tries to generate samples that look like the real data (e.g.: pictures of animals), while the discriminator tries to distinguish between real and generated samples. The generator is trained to fool the discriminator, and the discriminator is trained not to be fooled. This way, the generator learns to generate samples that look like the real data (e.g.: real pictures of animals). The adversarial loss (:math:`\mathcal{L}_{adv}`) is computed as shown in the lowest rectangle.
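
To make this concrete, here is one way the two adversarial losses could be written in TensorFlow. This is a sketch of the standard GAN objective with binary cross-entropy, not the exact code of this project; ``discriminator``, ``prior_latents``, and ``encoded_latents`` are illustrative names:

.. code-block:: python

    import tensorflow as tf

    # Standard GAN formulation: binary cross-entropy on raw logits.
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

    def adversarial_losses(discriminator, prior_latents, encoded_latents):
        real_logits = discriminator(prior_latents)    # z ~ prior ("real")
        fake_logits = discriminator(encoded_latents)  # z = G(x) ("fake")
        # The discriminator learns to tell the two apart...
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        # ...while the encoder/generator learns to fool it by making its
        # latent codes indistinguishable from prior samples.
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
        return d_loss, g_loss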

In our case, the generator is the encoder :math:`\mathcal{G}`, which generates a latent representation of the input data that the discriminator :math:`\mathcal{D}` tries to distinguish from "real" latent representations sampled from a given prior distribution. The trick to introduce the information of the kind of animal the photo belongs to is to concatenate the latent representation with the one-hot encoded bundle and attribute vectors. This way, the decoder can generate samples conditioned on a categorical variable. The reconstruction loss (:math:`\mathcal{L}_{MSE}`) is computed as shown in the middle rectangle, and it ensures that the samples reconstructed from the latent representation are as close as possible to the original data.
In our case, the generator is the encoder :math:`\mathcal{G}`, which generates a latent representation of the input data that the discriminator :math:`\mathcal{D}` tries to distinguish from "real" latent representations sampled from a given prior distribution. The trick to condition the model on a categorical variable (e.g.: the kind of animal the photo belongs to) is to concatenate the latent representation generated by the encoder :math:`\mathcal{G}` with the one-hot encoded animal-type class. This way, the decoder can generate samples conditioned on a categorical variable. The reconstruction loss (:math:`\mathcal{L}_{MSE}`) is computed as shown in the middle rectangle, and it ensures that the samples reconstructed from the latent representation are as close as possible to the original data.
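
As a small sketch of this concatenation trick (again with made-up sizes and names, and assuming plain dense layers rather than the project's actual architecture), a conditional decoder in Keras could look like:

.. code-block:: python

    import tensorflow as tf
    from tensorflow.keras import layers

    latent_dim, n_classes, data_dim = 32, 5, 768  # hypothetical sizes

    z_in = layers.Input(shape=(latent_dim,), name="latent_code")
    y_in = layers.Input(shape=(n_classes,), name="one_hot_class")
    # Appending the one-hot class to the latent code tells the decoder
    # both *where* in latent space we are and *which* category to generate.
    h = layers.Concatenate()([z_in, y_in])
    h = layers.Dense(128, activation="relu")(h)
    x_out = layers.Dense(data_dim)(h)
    decoder = tf.keras.Model([z_in, y_in], x_out, name="conditional_decoder")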

As for the AR, we try to tie a continuous attribute of choice found in the data space (fur length, age, size, etc.) to a specific dimension of the latent space. To do this, we compute an attribute-distance matrix in the data space (:math:`D_a`) and a distance matrix from the chosen dimension of the latent space (:math:`D_r`). By minimizing the mean absolute error (MAE) between the two matrices, we force the latent space to be organized in such a way that the chosen dimension is related to the chosen attribute. This way, we can generate samples conditioned on the attribute of choice. The AR loss (:math:`\mathcal{L}_{AR}`) is computed as shown in the top rectangle.
As for the AR, we try to tie a continuous attribute of choice found in the data space (e.g.: fur length) to a specific dimension of the latent space. To do this, we compute an attribute-distance matrix in the data space (:math:`D_a`) and a distance matrix from the chosen dimension of the latent space (:math:`D_r`). By minimizing the mean absolute error (MAE) between the two matrices, we force the latent space to be organized in such a way that the chosen dimension is related to the chosen attribute. This way, we can generate samples conditioned on the attribute of choice, e.g.: a specific category (cat) with a specific attribute (fur length). The AR loss (:math:`\mathcal{L}_{AR}`) is computed as shown in the top rectangle. Coming back to the real domain of our problem, the categorical variable would be the bundle to which the fiber belongs, and the continuous attribute would be the streamline length, which in the future will be replaced by the age of the tractogram.
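
The AR term itself could be sketched as follows. This follows the formulation of the attribute-based regularization paper linked above, where :math:`D_r` is squashed with a tanh and :math:`D_a` with a sign so that only the ordering of the two matrices has to match; tensor names are illustrative:

.. code-block:: python

    import tensorflow as tf

    def ar_loss(attr_values, latent_values):
        # D_a: pairwise differences of the attribute in data space
        # (e.g. streamline lengths); shape (batch, batch).
        d_a = attr_values[:, None] - attr_values[None, :]
        # D_r: pairwise differences along the regularized latent dimension.
        d_r = latent_values[:, None] - latent_values[None, :]
        # MAE between the (squashed) distance matrices forces the chosen
        # latent dimension to order samples the way the attribute does.
        return tf.reduce_mean(tf.abs(tf.tanh(d_r) - tf.sign(d_a)))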

Lastly, I also started writing my last post for the GSoC 2024, which will be a summary of the project, the results, and the future work. I will open a draft PR for continuing my work outside of the coding period because I want to keep working on this project as it is a very interesting topic in line with my PhD research.
Lastly, I also started writing my last post for GSoC 2024, which will be a summary of the project, the results, and the future work. I will open a draft PR for continuing my work outside of the coding period because I want to keep working on this project as it is a very interesting topic in line with my PhD research.

What is coming up next week
~~~~~~~~~~~~~~~~~~~~~~~~~~~
