All equations of the proposed metric
emmt committed Jun 11, 2024
1 parent 9bee16c commit 4ae5628
Showing 4 changed files with 209 additions and 218 deletions.
1 change: 1 addition & 0 deletions docs/make.jl
@@ -29,6 +29,7 @@ makedocs(
raw"\Params" => raw"\boldsymbol{\xi}",
raw"\Dist" => raw"\mathcal{D}",
raw"\Score" => raw"\mathcal{S}",
raw"\Sign" => raw"\mathrm{sgn}",
raw"\RR" => raw"\mathbb{R}",
raw"\MR" => raw"\mathbf{R}",
raw"\One" => raw"\boldsymbol{1}",
192 changes: 104 additions & 88 deletions docs/src/bc2016.md
@@ -14,125 +14,141 @@ irrelevant differences due to:
in `OI_FLUX` data-block;
* **out-of-field values**: after all geometric transformations, the fields of view of the
reference and reconstructed images may be different, it is assumed that missing pixel
values are equal to zero. The same rationale leads to impose that ``β = 0`` in the
general rules.
values are equal to zero. This rationale leads to imposing ``β = 0`` and ``η = 0``.

In addition, the comparison must take into account that the measurements have limited
angular resolution. Thus the reference image ``\Vy`` is the ground truth image ``\Vz``
convolved with an effective _Point Spread Function_ (PSF) whose _Full Width at Half
Maximum_ (FWHM) is chosen to match the interferometric resolution. In the comparison, the
restored images are also convolved with a PSF whose FWHM is tuned to best match the
reference image ``\Vy``.
angular resolution. Thus the reference image ``\Vy_λ`` in spectral channel ``λ`` is the
ground truth image ``\Vz_λ`` convolved with an effective _Point Spread Function_ (PSF)
whose _Full Width at Half Maximum_ (FWHM) ``\omega_{\mathrm{ref}}`` is chosen to match the
interferometric resolution:

Let ``\MR(\rho,\Vt,\omega)`` be the linear operator used to resample an image with a given
magnification ``\rho``, translation ``\Vt``, and blur parameter ``\omega``. The
magnification ``\rho`` is the ratio of the output image pixel size over the input image
pixel size. The translation ``\Vt`` may be specified for the input or output pixel grids
(as is the most convenient). The blur parameter ``\omega`` can be specified as the FWHM of
the PSF introduced to control the effective resolution of the output image.
```math
\Vy_λ = \MR_{\Vtheta^{\mathrm{ref}}_λ}\cdot\Vz_λ
```

Using the resampling operator, the **reference image** is:
with ``\MR_{\Vtheta^{\mathrm{ref}}_λ}`` the linear operator used to resample images for
measuring image distances but with parameters:

``` math
\Vy = \MR(\rho_{\mathrm{ref}},\boldsymbol{0},\omega_{\mathrm{ref}}) \cdot \Vz,
```math
\Vtheta^{\mathrm{ref}}_λ = \{
\rho_{\mathrm{ref}} = 1,
\Vt_{\mathrm{ref}} = \Zero,
\omega_λ = λ/(2\,B_{\mathrm{max}})
\}
```

where ``\Vz`` is the ground truth image. Note that there is no translation between the
ground truth image ``\Vz`` and the reference image ``\Vy``.

``\delta_{\mathrm{ref}} = 3\,\mathrm{mas}/\mathrm{pixel}`` is the pixel size of the image
``z`` and ``\omega_{\mathrm{ref}} \sim \lambda_{\mathrm{min}}/(2\,B_{\mathrm{max}})`` is
the FWHM of the objective resolution.
With ``\Vt_{\mathrm{ref}} = \Zero``, it is assumed that there is no translation between the
ground truth images ``\Vz_λ`` and the reference images ``\Vy_λ``; with
``\rho_{\mathrm{ref}} = 1``, the pixel size (3 mas/pixel) is kept the same; and
``\omega_λ = λ/(2\,B_{\mathrm{max}})`` is the FWHM of the interferometric
resolution at wavelength ``λ`` and maximal (projected) baseline ``B_{\mathrm{max}}``.
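
As a rough numerical illustration of ``\omega_λ = λ/(2\,B_{\mathrm{max}})``, the sketch
below converts the FWHM from radians to milliarcseconds. The wavelength and baseline are
hypothetical values chosen for the example, not parameters taken from the contest data,
and the function name is an assumption.

```python
import math

# Number of milliarcseconds in one radian.
MAS_PER_RADIAN = 180.0 / math.pi * 3600.0 * 1000.0


def fwhm_mas(wavelength_m, max_baseline_m):
    """Return omega = lambda / (2 * B_max), converted from radians to mas."""
    return wavelength_m / (2.0 * max_baseline_m) * MAS_PER_RADIAN


# Hypothetical values: lambda = 1.65 micrometers, B_max = 130 m.
omega = fwhm_mas(1.65e-6, 130.0)
print(f"FWHM = {omega:.3f} mas")
```

With these assumed values the FWHM is about 1.3 mas, which is less than half of a
3 mas pixel.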

The score for a given image ``x`` is the sum of the squared differences between the
``\Gamma``-corrected images:
Since ``\Gamma(η) = \Gamma(0) = 0`` and ``\Gamma(\alpha\,x) = \Gamma(\alpha)\,\Gamma(x)``
(whatever ``x`` and ``\alpha``), the distance between the restored and the reference
images in a given spectral channel can be written as:

``` math
\Score_{\Gamma,p}(\Vx) = \frac{
\sum\limits_{\lambda} \min\limits_{\alpha_\lambda,\Vt_\lambda,\omega_\lambda}
\sum\limits_{j} \left(
\Gamma\bigl(
\alpha_\lambda\,
[\MR(\rho,\Vt_\lambda,\omega_\lambda)\cdot\Vx_\lambda]_{j}
\bigr)
- \Gamma\bigl([\Vy_{\lambda}]_j\bigr)
\right)^p
}{
\sum\limits_{\lambda,j} \Gamma\bigl([\Vy_{\lambda}]_j\bigr)^p
}
\Dist_{\Gamma,p}(\Vx_λ,\Vy_λ) = \min_{\tilde{α}_λ,\Vt_λ}
\sum\limits_{j \in Ω_λ}
\left|
\tilde{\alpha}_λ\,[\tilde{\Vx}_λ]_{j}
- [\tilde{\Vy}_λ]_j
\right|^p
```

where ``\Vx_{\lambda}`` is the restored image in the spectral channel indexed by
``\lambda``, ``j`` is the pixel index and ``\Gamma: \Reals\to\Reals`` is a brightness
correction function to emphasize the interesting parts of the images. We have chosen:
where ``Ω_λ = |\MR_{\Vtheta_λ}\cdot\Vx_λ| \cup |\Vy_λ|`` is the union of the fields of
view of ``\MR_{\Vtheta_λ}\cdot\Vx_λ`` and ``\Vy_λ``, and with ``\tilde{α}_λ =
\Gamma(\alpha_λ)``,

``` math
\gdef\Sign{\mathrm{sign}} % trick
\Gamma(x) = \Sign(x)\,|x|^\gamma \, ,
\left[\tilde{\Vy}_λ\right]_j = \left\{\begin{array}{ll}
\Gamma\bigl(\bigl[\Vy_λ\bigr]_j\bigr) & \text{if }j \in |\Vy_λ|\\
\Gamma(η) = 0 & \text{else}
\end{array}\right.
```

where ``\Sign(x)`` is the sign of ``x``:
the _brightness corrected extrapolated reference image_, and:

``` math
\Sign(x) = \begin{cases}
-1 & \text{if $x < 0$}\\
+1 & \text{if $x > 0$}\\
\phantom{+}0 & \text{if $x = 0$}\\
\end{cases}
\left[\tilde{\Vx}_λ\right]_j = \left\{\begin{array}{ll}
\Gamma\bigl(\bigl[\MR_{\Vtheta_λ}\cdot\Vx_λ\bigr]_j\bigr)
& \text{if }j \in |\MR_{\Vtheta_λ}\cdot\Vx_λ|\\
\Gamma(η) = 0 & \text{else}
\end{array}\right.
```

The sign function is introduced for generality even though restored images are usually
nonnegative. The denominator is to normalize the score: 0 is the best possible value and 1
is the score for a black image. The lower the score the better.
where

Because ``\Gamma(\alpha\,x) = \Gamma(\alpha)\,\Gamma(x)`` (whatever ``x`` and ``\alpha``),
minimizing with respect to ``\alpha_\lambda`` when ``p = 2`` has a closed form solution
and the score simplifies to:
```math
\Vtheta_λ = \{\rho_λ,\Vt_λ,\omega_λ\}
```

``` math
\Score_{\Gamma,2}(x) = 1 - \frac{
\sum\limits_{\lambda} \max\limits_{\Vt_\lambda,\omega_\lambda}
c_\lambda(\Vt_\lambda,\omega_\lambda)
}{
\sum\limits_{\lambda,j} \Gamma\bigl([\Vy_{\lambda}]_j\bigr)^2
} ,
are the settings for resampling the restored image in spectral channel ``λ``. The
magnification ``\rho_λ`` is computed as the ratio of the known pixel sizes of ``\Vx_λ``
and ``\Vy_λ`` and, usually, does not depend on ``λ``.

For the _2016 Interferometric Imaging Beauty Contest_[^Sanchez2016], the metric parameters
were ``γ = 0.7`` for the brightness correction function and ``p = 2`` for the exponent.
When ``p = 2``, minimizing the distance with respect to ``\tilde{α}_λ`` has the following
closed-form solution:

```math
\tilde{α}_λ = \Gamma(α_λ) = \frac{
\sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_j\,[\tilde{\Vy}_λ]_j
}{
\sum_{j \in Ω_λ}
\left([\tilde{\Vx}_λ]_j\right)^2
}
```

with:
and the distance (for ``p = 2``) simplifies to:

``` math
c_\lambda(\Vt,\omega) = \frac{
\Bigl[
\sum_{j}
\Gamma\bigl([\Vy_{\lambda}]_j\bigr)\,
\Gamma\bigl([\MR(\rho,\Vt,\omega)\cdot\Vx_\lambda]_{j}\bigr)
\Bigr]
```math
\Dist_{\Gamma,2}(\Vx_λ,\Vy_λ) =
\sum_{j \in |\tilde{\Vy}_λ|} \left([\tilde{\Vy}_λ]_j\right)^2
- \max_{\Vt_λ} \frac{
\left(
\sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
\right)^2
}{
\sum_{j}
\Gamma\bigl([\MR(\rho,t,\omega)\cdot\Vx_\lambda]_{j}\bigr)^2
} \, ,
\sum_{j \in Ω_λ} \left([\tilde{\Vx}_λ]_{j}\right)^2
}
```
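
A minimal sketch of the closed-form scaling and of the simplified distance for ``p = 2``,
at a fixed translation. It assumes that the two sequences already hold the
``\Gamma``-corrected images sampled on the common set ``Ω_λ``, with zeros for out-of-field
pixels; the function names are hypothetical and not part of the package.

```python
def best_scale(x, y):
    """Minimizer of sum_j (a * x[j] - y[j])**2 with respect to a."""
    sxx = sum(xj * xj for xj in x)
    sxy = sum(xj * yj for xj, yj in zip(x, y))
    return sxy / sxx


def distance_p2(x, y):
    """Distance for p = 2 after eliminating the scale: syy - sxy**2 / sxx."""
    sxx = sum(xj * xj for xj in x)
    syy = sum(yj * yj for yj in y)
    sxy = sum(xj * yj for xj, yj in zip(x, y))
    return syy - sxy * sxy / sxx


# Toy Gamma-corrected images on the same pixel set (made-up values).
x = [1.0, 2.0, 0.0, 1.0]
y = [2.0, 4.0, 1.0, 2.0]

a = best_scale(x, y)
# Direct evaluation of the sum of squared residuals at the optimal scale.
direct = sum((a * xj - yj) ** 2 for xj, yj in zip(x, y))
```

In this toy case the optimal scale is 2 and both the direct sum of squared residuals and
the simplified expression give a distance of 1.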

which is a normalized correlation between the ``\Gamma``-corrected images.

and the score is:

To compensate for different pixel sizes, the image ``\Vx`` or ``\Vy`` which has the larger
pixel size is interpolated so that both images have the same (smallest) pixel size.

Separable linear interpolation with a triangle kernel is applied for magnifying and fine
shifting the images.

The criterion is minimized with respect to a translation between each pair of images (one
per spectral channel).

In every spectral channel, the brightness of the restored images is scaled so that the
total flux per channel is the same as in the reference image.
```math
\begin{align}
\Score_{\Gamma,2}(\Vx_λ)
= 1 - \frac{
\Dist_{\Gamma,2}(\Vx_λ,\Vy_λ)
}{
\sum_{j \in |\tilde{\Vy}_λ|} \left([\tilde{\Vy}_λ]_j\right)^2
}
&= \max_{\Vt_λ} \frac{
\left(
\sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
\right)^2
}{
\left(\sum_{j \in Ω_λ} \left([\tilde{\Vx}_λ]_{j}\right)^2\right)\,
\left(\sum_{j \in Ω_λ} \left([\tilde{\Vy}_λ]_j\right)^2\right)
}\notag\\
&= \frac{
\max\limits_{\Vt_λ} \left(
\sum_{j \in Ω_λ} [\tilde{\Vx}_λ]_{j}\,[\tilde{\Vy}_λ]_j
\right)^2
}{
\left(\sum_{j \in |\MR_{\Vtheta_λ}\cdot\Vx_λ|} \left([\tilde{\Vx}_λ]_{j}\right)^2\right)\,
\left(\sum_{j \in |\Vy_λ|} \left([\tilde{\Vy}_λ]_j\right)^2\right)
}\notag
\end{align}
```
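
The maximization over the translation can be sketched in one dimension with integer
shifts only. This is a simplified illustration, not the resampling operator of the
package: there is no magnification, blur, or sub-pixel interpolation, the inputs are
assumed to be already ``\Gamma``-corrected, and the function name is hypothetical.
Out-of-field pixels are taken as zero, so they contribute nothing to the cross term.

```python
def score_p2(x, y, shifts):
    """Best squared normalized correlation between x and y over integer shifts.

    A shift t maps pixel j - t of x onto pixel j of y.
    """
    sxx = sum(v * v for v in x)  # sum over the field of view of x
    syy = sum(v * v for v in y)  # sum over the field of view of y
    best = 0.0
    for t in shifts:
        # Only pixels where both images are defined contribute to the cross term.
        sxy = sum(x[j - t] * y[j] for j in range(len(y)) if 0 <= j - t < len(x))
        best = max(best, sxy * sxy / (sxx * syy))
    return best


# Toy example: x is y shifted by one pixel and scaled by 2 (made-up values).
y = [0.0, 1.0, 3.0, 1.0, 0.0]
x = [2.0, 6.0, 2.0, 0.0, 0.0]

unshifted = score_p2(x, y, [0])
best = score_p2(x, y, range(-2, 3))
```

Because the score is insensitive to the brightness scale, the shifted and rescaled copy
reaches the maximal score once the correct shift is among the candidates, while the
unshifted comparison scores lower.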

Because the resolution of the reference image may change (see above) the score is the
ratio between the sum for all spectral channels of the scores between the restored images
and the reference images divided by the sum for all spectral channels of the scores
between a zero image and the reference images.
The score for a multi-spectral image ``\Vx`` is the sum of the scores in all spectral channels:

``` math
\Score_{\Gamma,p}(\Vx) = \sum_λ \Score_{\Gamma,p}(\Vx_λ).
```

[^Sanchez2016]:
> J. Sanchez-Bermudez, É. Thiébaut, K.-H. Hofmann, M. Heininger, D.
60 changes: 46 additions & 14 deletions docs/src/general.md
@@ -20,16 +20,16 @@ reference image, and ``\Vx``, a reconstructed image, is given by:
```math
\begin{align*}
\Dist(\Vx,\Vy) = \min_{\Params} \Bigl\{
&\sum_{i \in |\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|}
d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_i + β, y_i\right)
&\sum_{j \in |\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|}
d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_j + β, y_j\right)
\notag\\
&+ \sum_{i \in |\MR_{\Vtheta}\cdot\Vx| \backslash
&+ \sum_{j \in |\MR_{\Vtheta}\cdot\Vx| \backslash
(|\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|)}
d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_i + β, \eta\right)
d\!\left(α\,(\MR_{\Vtheta}\cdot\Vx)_j + β, \eta\right)
\notag\\
&+ \sum_{i \in |\Vy| \backslash
&+ \sum_{j \in |\Vy| \backslash
(|\MR_{\Vtheta}\cdot\Vx| \cap |\Vy|)}
d\!\left(\eta,y_i\right)
d\!\left(\eta,y_j\right)
\Bigr\}
\end{align*}
```
@@ -48,22 +48,54 @@ settings while ``\alpha \in \mathbb{R}`` and the translation ``\Vt \in \mathbb{R
be adjusted to reduce the mismatch between the images. Hence, ``\Params = \{\alpha,\Vt\}``
in this context.

The score may be defined by normalizing the distance:
A score may be defined by normalizing the distance so that the higher the score, the
better the restored image ``\Vx``:

```math
\Score(\Vx) = \frac{\Dist(\Vx,\Vy)}{\Dist(\eta\,\One,\Vy)}
\Score(\Vx) = 1 - \frac{\Dist(\Vx,\Vy)}{\Dist(\eta\,\One,\Vy)}
```

where ``\One`` is an image of the same size as ``\Vy`` but filled with ones, hence
``\eta\,\One`` is an image of the same size as ``\Vy`` but filled with ``\eta`` the
assumed out-of-field pixel value.
``\eta\,\One`` is an image of the same size as ``\Vy`` but filled with ``\eta``, the
assumed out-of-field pixel value. The score may be negative but the maximal score is 1.
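
The range of the score can be checked on a toy case. The sketch below assumes that both
images share the same field of view and that the parameters are fixed (``α = 1``,
``β = 0``, no translation), so no minimization is performed; the function names and the
pixel values are hypothetical.

```python
def distance(x, y, d):
    """Sum of pixel-wise distances between two images of the same size."""
    return sum(d(xj, yj) for xj, yj in zip(x, y))


def score(x, y, d, eta=0.0):
    """Score 1 - Dist(x, y) / Dist(eta * 1, y) for fixed parameters."""
    reference = distance([eta] * len(y), y, d)
    return 1.0 - distance(x, y, d) / reference


def absdiff(a, b):
    """Pixel-wise distance with p = 1 and no brightness correction."""
    return abs(a - b)


y = [0.0, 1.0, 2.0, 1.0]

perfect = score(y, y, absdiff)                   # restored image equals the reference
black = score([0.0] * 4, y, absdiff)             # image filled with eta = 0
poor = score([3.0, 0.0, 0.0, 3.0], y, absdiff)   # worse than the black image
```

Here the perfect image scores 1, the image filled with ``\eta`` scores 0, and an image
that is farther from the reference than the black image gets a negative score.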

The following properties are assumed for the pixel-wise distance:
Denoting by ``\mathbb{K}`` the set of possible pixel values, the following properties must
hold for the pixel-wise distance:

1. ``d(x,x) = 0`` for any ``x \in \mathbb{R}``;
2. ``d(x,y) > 0`` for any ``(x,y) \in \mathbb{R}^2`` such that ``x \not= y``;
3. ``d(x,y) = d(y,x)`` for any ``(x,y) \in \mathbb{R}^2``.
1. ``d(x,x) = 0`` for any ``x \in \mathbb{K}``;
2. ``d(x,y) > 0`` for any ``(x,y) \in \mathbb{K}^2`` such that ``x \not= y``;
3. ``d(x,y) = d(y,x)`` for any ``(x,y) \in \mathbb{K}^2``.

To remain general, a possible pixel-wise distance for which the above properties hold is
given by:

```math
d(x, y) = \left|\Gamma(x) - \Gamma(y)\right|^p
```

where the exponent ``p`` and the function ``\Gamma: \mathbb{K}\to\mathbb{R}`` are
introduced to make the distance more flexible. ``\Gamma`` is a monotonic _brightness
correction_ function that emphasizes the interesting parts of the images, and ``p > 0``
ensures that the distance is non-decreasing with respect to the absolute value of the
difference ``\Gamma(x) - \Gamma(y)``. For example:

``` math
\Gamma(x) = \Sign(x)\,|x|^\gamma,
```

where ``\Sign(x)`` is the sign of ``x``:

``` math
\Sign(x) = \begin{cases}
-1 & \text{if $x < 0$}\\
+1 & \text{if $x > 0$}\\
\phantom{+}0 & \text{if $x = 0$}\\
\end{cases}
```

Metric parameters ``p`` and ``\gamma`` can be chosen depending on the context. According
to a human panel[^Gomes2016], ``p = 1`` with ``\gamma = 1`` best reflect the human
perception of image quality.
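
The brightness correction and the pixel-wise distance defined above can be sketched as
follows. The function names are hypothetical and the numerical values are only
illustrative.

```python
import math


def gamma_correct(x, gamma):
    """Brightness correction: sgn(x) * |x|**gamma."""
    return math.copysign(abs(x) ** gamma, x)


def pixel_distance(x, y, gamma=1.0, p=1.0):
    """Pixel-wise distance |Gamma(x) - Gamma(y)|**p."""
    return abs(gamma_correct(x, gamma) - gamma_correct(y, gamma)) ** p


# The correction is odd: the sign of the pixel value is preserved.
negative = gamma_correct(-4.0, 0.5)

# Multiplicative property: Gamma(a * x) equals Gamma(a) * Gamma(x).
lhs = gamma_correct(-3.0 * 2.0, 0.7)
rhs = gamma_correct(-3.0, 0.7) * gamma_correct(2.0, 0.7)

# Distance between two pixel values with gamma = 0.5 and p = 2.
d = pixel_distance(4.0, 1.0, gamma=0.5, p=2.0)
```

The defaults ``gamma = 1`` and ``p = 1`` in this sketch correspond to the plain absolute
difference between pixel values.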

[^Gomes2016]:
> N. Gomes, P. J. V. Garcia & É. Thiébaut, *Assessing the quality of