All equations of the proposed metric

JMMC-OpenDev · Jun 11, 2024 · 4ae5628 · 4ae5628
1 parent 9bee16c
commit 4ae5628
Showing 4 changed files with 209 additions and 218 deletions.
diff --git a/docs/make.jl b/docs/make.jl
@@ -29,6 +29,7 @@ makedocs(
                                 raw"\Params" => raw"\boldsymbol{\xi}",
                                 raw"\Dist" => raw"\mathcal{D}",
                                 raw"\Score" => raw"\mathcal{S}",
+                                raw"\Sign" => raw"\mathrm{sgn}",
                                 raw"\RR" => raw"\mathbb{R}",
                                 raw"\MR" => raw"\mathbf{R}",
                                 raw"\One" => raw"\boldsymbol{1}",

diff --git a/docs/src/bc2016.md b/docs/src/bc2016.md
@@ -14,125 +14,141 @@ irrelevant differences due to:
   in `OI_FLUX` data-block;
 * **out-of-field values**: after all geometric transformations, the fields of view of the
   reference and reconstructed images may be different, it is assumed that missing pixel
-  values are equal to zero. The same rationale leads to impose that ``β = 0`` in the
-  general rules.
+  values are equal to zero. This rationale leads to imposing ``β = 0`` and ``η = 0``.
 
 In addition, the comparison must take into account that the measurements have limited
-angular resolution. Thus the reference image ``\Vy`` is the ground truth image ``\Vz``
-convolved with an effective _Point Spread Function_ (PSF) whose _Full Width at Half
-Maximum_ (FWHM) chosen to match the interferometric resolution. In the comparison, the
-restored images are also convolved with a PSF whose FWHM is tuned to best match the
-reference image ``\Vy``.
+angular resolution. Thus the reference image ``\Vy_λ`` in spectral channel ``λ`` is the
+ground truth image ``\Vz_λ`` convolved with an effective _Point Spread Function_ (PSF)
+whose _Full Width at Half Maximum_ (FWHM) is ``\omega_{\mathrm{ref}}`` chosen to match the
+interferometric resolution:
 
-Let ``\MR(\rho,\Vt,\omega)`` be the linear operator used to resample an image with a given
-magnification ``\rho``, translation ``\Vt``, and blur parameter ``\omega``. The
-magnification ``\rho`` is the ratio of the output image pixel size over the input image
-pixel size. The translation ``\Vt`` may be specified for the input or output pixel grids
-(as is the most convenient). The blur parameter ``\omega`` can be specified as the FWHM of
-the PSF introduced to control the effective resolution of the output image.
+```math
+\Vy_λ = \MR_{\Vtheta^{\mathrm{ref}}_λ}\cdot\Vz_λ
+```
 
-Using the resampling operator, the **reference image** is:
+with ``\MR_{\Vtheta^{\mathrm{ref}}_λ}`` the linear operator used to resample images for
+measuring image distances but with parameters:
 
-``` math
-\Vy = \MR(\rho_{\mathrm{ref}},\boldsymbol{0},\omega_{\mathrm{ref}}) \cdot \Vz,
+```math
+\Vtheta^{\mathrm{ref}}_λ = \{
+  \rho_{\mathrm{ref}} = 1,
+  \Vt_{\mathrm{ref}} = \Zero,
+  \omega_λ = λ/(2\,B_{\mathrm{max}})
+\}
 ```
 
-where ``\Vz`` is the ground truth image. Note that there is no translation between the
-ground truth image ``\Vz`` and the reference image ``\Vy``.
-
-``\delta_{\mathrm{ref}} = 3\,\mathrm{mas}/\mathrm{pixel}`` is the pixel size of the image
-``z`` and ``\omega_{\mathrm{ref}} \sim \lambda_{\mathrm{min}}/(2\,B_{\mathrm{max}})`` is
-the FWHM of the objective resolution.
+With ``\Vt_{\mathrm{ref}} = \Zero``, it is assumed that there is no translation between the
+ground truth images ``\Vz_λ`` and the reference images ``\Vy_λ``; with
+``\rho_{\mathrm{ref}} = 1``, the pixel size (3 mas/pixel) is kept the same; and
+``\omega_λ = λ/(2\,B_{\mathrm{max}})`` is the FWHM of the interferometric resolution
+at wavelength ``λ`` for the maximal (projected) baseline ``B_{\mathrm{max}}``.
 
-The score for a given image ``x`` is the sum of the squared differences between the
-``\Gamma``-corrected images:
+Since ``\Gamma(η) = \Gamma(0) = 0`` and ``\Gamma(\alpha\,x) = \Gamma(\alpha)\,\Gamma(x)``
+(whatever ``x`` and ``\alpha``), the distance between the restored and the reference
+images in a given spectral channel can be written as:
 
 ``` math
-\Score_{\Gamma,p}(\Vx) = \frac{
-  \sum\limits_{\lambda} \min\limits_{\alpha_\lambda,\Vt_\lambda,\omega_\lambda}
-  \sum\limits_{j} \left(
-      \Gamma\bigl(
-        \alpha_\lambda\,
-        [\MR(\rho,\Vt_\lambda,\omega_\lambda)\cdot\Vx_\lambda]_{j}
-      \bigr)
-      - \Gamma\bigl([\Vy_{\lambda}]_j\bigr)
-     \right)^p
-  }{
-    \sum\limits_{\lambda,j} \Gamma\bigl([\Vy_{\lambda}]_j\bigr)^p
-  }
+\Dist_{\Gamma,p}(\Vx_λ,\Vy_λ) = \min_{\tilde{α}_λ,\Vt_λ}
+  \sum\limits_{j \in Ω_λ}
+  \left|
+    \tilde{\alpha}_λ\,[\tilde{\Vx}_λ]_{j}
+    - [\tilde{\Vy}_λ]_j
+  \right|^p
 ```
 
-where ``\Vx_{\lambda}`` is the restored image in the spectral channel indexed by
-``\lambda``, ``j`` is the pixel index and ``\Gamma: \Reals\to\Reals`` is a brightness
-correction function to emphasizes the interesting parts of the images. We have chosen:
+where ``Ω_λ = |\MR_{\Vtheta_λ}\cdot\Vx_λ| \cup |\Vy_λ|`` is the union of the fields of
+view of ``\MR_{\Vtheta_λ}\cdot\Vx_λ`` and ``\Vy_λ``, and with ``\tilde{α}_λ =
+\Gamma(\alpha_λ)``,
 
 ``` math
-\gdef\Sign{\mathrm{sign}} % trick
-\Gamma(x) = \Sign(x)\,|x|^\gamma \, ,
+\left[\tilde{\Vy}_λ\right]_j = \left\{\begin{array}{ll}
+  \Gamma\bigl(\bigl[\Vy_λ\bigr]_j\bigr) & \text{if }j \in |\Vy_λ|\\
+  \Gamma(η) = 0 & \text{else}
+\end{array}\right.
 ```
 
-where ``\Sign(x)`` is the sign of ``x``:
+the _brightness corrected extrapolated reference image_, and:
 
 ``` math
-\Sign(x) = \begin{cases}
--1 & \text{if $x < 0$}\\
-+1 & \text{if $x > 0$}\\
-\phantom{+}0 & \text{if $x = 0$}\\
-\end{cases}
+\left[\tilde{\Vx}_λ\right]_j = \left\{\begin{array}{ll}
+  \Gamma\bigl(\bigl[\MR_{\Vtheta_λ}\cdot\Vx_λ\bigr]_j\bigr)
+  & \text{if }j \in |\MR_{\Vtheta_λ}\cdot\Vx_λ|\\
+  \Gamma(η) = 0 & \text{else}
+\end{array}\right.
 ```
 
-The sign function is introduced for generality even though restored images are usually
-nonnegative. The denominator is to normalize the score: 0 is the best possible value and 1
-is the score for a black image. The lower the score the better.
+where
 
-Because ``\Gamma(\alpha\,x) = \Gamma(\alpha)\,\Gamma(x)`` (whatever ``x`` and ``\alpha``),
-minimizing with respect to ``\alpha_\lambda`` when ``p = 2`` has a closed form solution
-and the score simplifies to:
+```math
+\Vtheta_λ = \{\rho_λ,\Vt_λ,\omega_λ\}
+```
 
-``` math
-\Score_{\Gamma,2}(x) = 1 - \frac{
-    \sum\limits_{\lambda} \max\limits_{\Vt_\lambda,\omega_\lambda}
-    c_\lambda(\Vt_\lambda,\omega_\lambda)
-  }{
-    \sum\limits_{\lambda,j} \Gamma\bigl([\Vy_{\lambda}]_j\bigr)^2
-  } ,
+are the settings for resampling the restored image in spectral channel ``λ``. The
+magnification ``\rho_λ`` is computed as the ratio of the known pixel sizes of ``\Vx_λ``
+and ``\Vy_λ`` and, usually, does not depend on ``λ``.
+
+For the _2016 Interferometric Imaging Beauty Contest_[^Sanchez2016], the metric parameters
+were ``γ = 0.7`` for the brightness correction function and ``p = 2`` for the exponent. When
+``p = 2``, minimizing the distance with respect to ``\tilde{α}_λ`` has the following
+closed-form solution:
+
+```math
+\tilde{α}_λ = \Gamma(α_λ) = \frac{
+  \sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_j\,[\tilde{\Vy}_λ]_j
+}{
+  \sum_{j \in Ω_λ}
+  \left([\tilde{\Vx}_λ]_j\right)^2
+}
 ```
 
-with:
+and the distance (for ``p = 2``) simplifies to:
 
-``` math
-  c_\lambda(\Vt,\omega) = \frac{
-    \Bigl[
-      \sum_{j}
-      \Gamma\bigl([\Vy_{\lambda}]_j\bigr)\,
-      \Gamma\bigl([\MR(\rho,\Vt,\omega)\cdot\Vx_\lambda]_{j}\bigr)
-    \Bigr]
+```math
+\Dist_{\Gamma,2}(\Vx_λ,\Vy_λ) =
+  \sum_{j \in |\tilde{\Vy}_λ|} \left([\tilde{\Vy}_λ]_j\right)^2
+  - \max_{\Vt_λ} \frac{
+    \left(
+        \sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
+    \right)^2
   }{
-    \sum_{j}
-    \Gamma\bigl([\MR(\rho,t,\omega)\cdot\Vx_\lambda]_{j}\bigr)^2
-  } \, ,
+    \sum_{j \in Ω_λ} \left([\tilde{\Vx}_λ]_{j}\right)^2
+  }
 ```
 
-which is a normalized correlation between the ``\Gamma``-corrected images.
-
+and the score is:
 
-To compensate for different pixel sizes, the image ``\Vx`` or ``\Vy`` which has the larger
-pixel size is interpolated so that both images have the same (smallest) pixel size.
-
-Separable linear interpolation with a triangle kernel is applied for magnifying and fine
-shifting the images.
-
-The criterion is minimized for a translation between each images (for each spectral
-channel).
-
-In every spectral channel, the brightness of the restored images is scaled so that the
-total flux per channel is the same as in the reference image.
+```math
+\begin{align}
+\Score_{\Gamma,2}(\Vx_λ)
+  = 1 - \frac{
+    \Dist_{\Gamma,2}(\Vx_λ,\Vy_λ)
+  }{
+    \sum_{j \in |\tilde{\Vy}_λ|} \left([\tilde{\Vy}_λ]_j\right)^2
+  }
+  &= \max_{\Vt_λ} \frac{
+    \left(
+        \sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
+    \right)^2
+  }{
+    \left(\sum_{j \in Ω_λ} \left([\tilde{\Vx}_λ]_{j}\right)^2\right)\,
+    \left(\sum_{j \in Ω_λ} \left([\tilde{\Vy}_λ]_j\right)^2\right)
+  }\notag\\
+  &= \frac{
+    \max\limits_{\Vt_λ} \left(
+        \sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
+    \right)^2
+  }{
+    \left(\sum_{j \in |\MR_{\Vtheta_λ}\cdot\Vx_λ|} \left([\tilde{\Vx}_λ]_{j}\right)^2\right)\,
+    \left(\sum_{j \in |\Vy_λ|} \left([\tilde{\Vy}_λ]_j\right)^2\right)
+  }\notag
+\end{align}
+```
 
-Because the resolution of the reference image may change (see above) the score is the
-ratio between the sum for all spectral channels of the scores between the restored images
-and the reference images divided by the sum for all spectral channels of the scores
-between a zero image and the reference images.
+The score for a multi-spectral image ``\Vx`` is the sum of the scores in all spectral channels:
 
+``` math
+\Score_{\Gamma,p}(\Vx) = \sum_λ \Score_{\Gamma,p}(\Vx_λ).
+```
 
 [^Sanchez2016]:
     > J. Sanchez-Bermudez, É. Thiébaut, K.-H. Hofmann, M. Heininger, D.

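The closed-form solution and the simplified distance stated for ``p = 2`` in `docs/src/bc2016.md` follow from an ordinary least-squares argument. The derivation below is only a sketch: it holds the translation ``\Vt_λ`` fixed, assumes ``\sum_j a_j^2 > 0``, and uses the shorthand ``a_j = [\tilde{\Vx}_λ]_j`` and ``b_j = [\tilde{\Vy}_λ]_j``, which is introduced here for brevity and is not part of the documentation:

```math
\begin{align*}
f(\tilde{α}_λ) &= \sum_{j \in Ω_λ} \left(\tilde{α}_λ\,a_j - b_j\right)^2, \\
\frac{\mathrm{d}f}{\mathrm{d}\tilde{α}_λ}
  &= 2 \sum_{j \in Ω_λ} a_j \left(\tilde{α}_λ\,a_j - b_j\right) = 0
  \quad\Longrightarrow\quad
  \tilde{α}_λ = \frac{\sum_{j \in Ω_λ} a_j\,b_j}{\sum_{j \in Ω_λ} a_j^2}, \\
f(\tilde{α}_λ)
  &= \sum_{j \in Ω_λ} b_j^2
  - \frac{\left(\sum_{j \in Ω_λ} a_j\,b_j\right)^2}{\sum_{j \in Ω_λ} a_j^2}
  \quad\text{at the minimizer.}
\end{align*}
```

Because ``b_j = 0`` outside ``|\Vy_λ|``, the sum of ``b_j^2`` over ``Ω_λ`` equals the sum over the field of view of the reference image. By the Cauchy-Schwarz inequality, the subtracted term cannot exceed that sum, so for ``p = 2`` the per-channel score lies between 0 and 1.
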
diff --git a/docs/src/general.md b/docs/src/general.md
@@ -20,16 +20,16 @@ reference image, and ``\Vx``, a reconstructed image, is given by:
 ```math
 \begin{align*}
 \Dist(\Vx,\Vy) = \min_{\Params} \Bigl\{
-  &\sum_{i \in |\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|}
-  d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_i + β, y_i\right)
+  &\sum_{j \in |\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|}
+  d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_j + β, y_j\right)
   \notag\\
-  &+ \sum_{i \in |\MR_{\Vtheta}\cdot\Vx| \backslash
+  &+ \sum_{j \in |\MR_{\Vtheta}\cdot\Vx| \backslash
   (|\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|)}
-  d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_i + β, \eta\right)
+  d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_j + β, \eta\right)
   \notag\\
-  &+ \sum_{i \in |\Vy| \backslash
+  &+ \sum_{j \in |\Vy| \backslash
   (|\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|)}
-  d\!\left(\eta,y_i\right)
+  d\!\left(\eta,y_j\right)
 \Bigr\}
 \end{align*}
 ```
@@ -48,22 +48,54 @@ settings while ``\alpha \in \mathbb{R}`` and the translation ``\Vt \in \mathbb{R
 be adjusted to reduce the mismatch between the images. Hence, ``\Params = \{\alpha,\Vt\}``
 in this context.
 
-The score may be defined by normalizing the distance:
+A score may be defined by normalizing the distance so that the higher the score, the
+better the restored image ``\Vx``:
 
 ```math
-\Score(\Vx) = \frac{\Dist(\Vx,\Vy)}{\Dist(\eta\,\One,\Vy)}
+\Score(\Vx) = 1 - \frac{\Dist(\Vx,\Vy)}{\Dist(\eta\,\One,\Vy)}
 ```
 
 where ``\One`` is an image of the same size as ``\Vy`` but filled with ones, hence
-``\eta\,\One`` is an image of the same size as ``\Vy`` but filled with ``\eta`` the
-assumed out-of-field pixel value.
+``\eta\,\One`` is an image of the same size as ``\Vy`` but filled with ``\eta``, the
+assumed out-of-field pixel value. The score may be negative but the maximal score is 1.
 
-The following properties are assumed for the pixel-wise distance:
+Denoting by ``\mathbb{K}`` the set of possible pixel values, the following properties must
+hold for the pixel-wise distance:
 
-1. ``d(x,x) = 0`` for any ``x \in \mathbb{R}``;
-2. ``d(x,y) > 0`` for any ``(x,y) \in \mathbb{R}^2`` such that ``x \not= y``;
-3. ``d(y,x) = d(y,x)`` for any ``(x,y) \in \mathbb{R}^2``.
+1. ``d(x,x) = 0`` for any ``x \in \mathbb{K}``;
+2. ``d(x,y) > 0`` for any ``(x,y) \in \mathbb{K}^2`` such that ``x \not= y``;
+3. ``d(x,y) = d(y,x)`` for any ``(x,y) \in \mathbb{K}^2``.
 
+To remain general, a possible pixel-wise distance for which the above properties hold is
+given by:
+
+```math
+d(x, y) = \left|\Gamma(x) - \Gamma(y)\right|^p
+```
+
+where the exponent ``p`` and the function ``\Gamma: \mathbb{K}\to\mathbb{R}`` are
+introduced to make the distance more flexible. ``\Gamma`` is a monotonic _brightness
+correction_ function that emphasizes the interesting parts of the images, and ``p > 0``
+ensures that the distance is non-decreasing with respect to the absolute value of the
+difference ``\Gamma(x) - \Gamma(y)``. For example:
+
+``` math
+\Gamma(x) = \Sign(x)\,|x|^\gamma,
+```
+
+where ``\Sign(x)`` is the sign of ``x``:
+
+``` math
+\Sign(x) = \begin{cases}
+-1 & \text{if $x < 0$}\\
++1 & \text{if $x > 0$}\\
+\phantom{+}0 & \text{if $x = 0$}\\
+\end{cases}
+```
+
+Metric parameters ``p`` and ``\gamma`` can be chosen depending on the context. According
+to a human panel[^Gomes2016], ``p = 1`` with ``\gamma = 1`` best reflect the human
+perception of image quality.
 
 [^Gomes2016]:
     > N. Gomes, P. J. V. Garcia & É. Thiébaut, *Assessing the quality of