Extend description #54
Update repository
Suggested by @Fynn at #43 (comment)
Thanks for working on this @esgomezm.
Changes look mostly good, we should just clarify the definition of percentile.
@@ -133,13 +143,13 @@ For example:
- `weights` The weights for this model. Weights can be given for different formats, but should otherwise be equivalent.
- `weight_format` Format of this set of weights. Weight formats can define additional (optional or required) fields. See [supported_formats_and_operations.md#Weight Format](https://github.com/bioimage-io/configuration/blob/master/supported_formats_and_operations.md#weight_format)
- `[authors]` a list of authors.
- `[authors]` a list of authors. It refers to the ones who trained the model.
I think `It refers to the ones who trained the model` is not quite correct here, because the optional `authors` field here indicates who created this set of weights, which could also have happened by converting the weights from one format to another (so not by training the model).
Sorry, I thought those authors were the ones in the main description, as it says `The authors are the creators of the specifications and the primary points of contact.` I would expect that the ones who created the specification are also the ones responsible for the weights format.
Maybe we could agree on a general and a more specific description?
So the idea is that `authors` in the main one are the ones that created the model and uploaded the `model.yaml` etc.
After that happens, someone else might convert the weights to a different format and then add this to the same `model.yaml`. In this case, they should be listed here (because they are different from the "main" authors).
(I agree, we should explain this in more detail)
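To make the distinction concrete, here is a hedged sketch of the intended split (written as a Python dict purely for illustration; the field layout and the names are assumptions for this example, not the actual spec):

```python
# Hypothetical model description fragment, shown as a Python dict instead of
# YAML, purely to illustrate which authors go where.
model = {
    # top-level authors: created the model and uploaded the model.yaml
    "authors": ["Jane Doe"],
    "weights": [
        {
            "weight_format": "pytorch_state_dict",
            # no extra authors: same people as the top-level authors
        },
        {
            "weight_format": "onnx",
            # weights converted later by someone else, hence listed here
            "authors": ["John Smith"],
        },
    ],
}
```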
- `min_percentile` the lower percentile used for normalization, in range 0 to 100.
- `max_percentile` the upper percentile used for normalization, in range 0 to 100. Has to be bigger than `upper_percentile`.
- `min_percentile` the lower percentile used for normalization, in range 0 to 1
- `max_percentile` the upper percentile used for normalization, in range 0 to 1. Has to be bigger than `upper_percentile`.
percentile usually refers to a range from 0 to 100, not 0 to 1.
See https://en.wikipedia.org/wiki/Percentile.
I think it would be confusing if we change this from the common definition.
@frauzufall how do you handle this in CSBDeep?
No, @FynnBe did not suggest changing the value range to `[0, 1]`. Instead he suggested enforcing `max_percentile > 1` in order to avoid confusion about the value ranges.
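For reference, a minimal sketch of what percentile-based normalization along these lines could look like (the function name and default values are illustrative, not part of the spec); note that `np.percentile` also takes its argument in the 0 to 100 range, which fits the common definition:

```python
import numpy as np

def percentile_normalize(x, min_percentile=1.0, max_percentile=99.8, eps=1e-6):
    """Rescale x based on its min/max percentiles, given in the 0-100 range."""
    # guard along the lines suggested above: catches values accidentally given
    # as fractions in [0, 1]
    assert max_percentile > 1, "percentiles are expected in the range 0 to 100"
    lo = np.percentile(x, min_percentile)
    hi = np.percentile(x, max_percentile)
    return (x - lo) / (hi - lo + eps)
```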
@@ -62,12 +62,22 @@ Must be a list of *tensor specification keys*.
*tensor specification keys*:
- `name` tensor name
- `data_type` data type (e.g. float32)
- `data_range` tuple of (minimum, maximum)
- `axes` string of axes identifying characters from: btczyx
- `data_range` tuple of (minimum, maximum) (e.g. [-inf, inf])
Just a hunch: did you remove the inf example because you expect `data_range` to hold the actual minimum, maximum of the tensor?
I think for now it was intended as the theoretical data_range (that might be given by the `data_type`), which the example clarifies (a comment might be clearer, though). In combination with the data type this is of limited use, but the actual (minimum, maximum) would be confusing due to the batch dimension, which is arbitrary and thus has a random influence on these values. The actual, global data range might not be required, and thus unnecessary to compute, or not sufficient (e.g. for percentile normalization). I suggest not to change anything further for 0.3.0, and to keep it as the theoretical data range.
I think putting in the "data range so that the model yields sensible results" makes sense for the data range:
#55 (comment)
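As an aside, a small sketch of the "theoretical data range" reading, i.e. deriving `data_range` from `data_type` alone (the helper is hypothetical, only meant to illustrate why `[-inf, inf]` is a sensible example for float types):

```python
import numpy as np

def theoretical_data_range(data_type: str):
    """Return the theoretical (minimum, maximum) implied by the dtype alone."""
    dtype = np.dtype(data_type)
    if np.issubdtype(dtype, np.floating):
        # floats are unbounded in theory, hence the [-inf, inf] example in the spec
        return (-np.inf, np.inf)
    info = np.iinfo(dtype)  # integer types are bounded by the dtype
    return (info.min, info.max)

print(theoretical_data_range("float32"))  # (-inf, inf)
print(theoretical_data_range("uint8"))    # (0, 255)
```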
@@ -12,7 +12,7 @@ To get a quick overview of the config file, see an example file [here](https://g
## Model Specification
A model entry in the bioimage.io model zoo is defined by a configuration file `<model name>.model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `<optional model name>.model.yaml`. Once a model is packaged to a `<(mandatory) model name>.model.bioimage.io.zip` file the configuration file inside the .zip is only named `model.yaml`.
The model entry is not the same as the packaged model (where we agreed to `model.yaml` inside a `<model name>.model.bioimage.io.zip`, see #41). Keeping several models in one folder should be possible, as they may share other resources, etc.
On second thought, I am not sure we should mention this here; it's difficult to understand this without explaining the `...model.zip` (which we are not doing yet, but need to add here as well).
- `reference_input` *input tensor name*
- `scale` *list of factors 'output_pix/input_pix' for each dimension*
- `offset` *position of origin wrt to input*
- `halo` *the halo to crop from the output tensor* (for example to crop away boundary effects or for tiling) (e.g. the valid domain of an output of `size = [256, 256]` and `halo = [32, 32]`, results in `output_shape = [256 - 2 * 32, 256 - 2 * 32] = [192, 192]`).\
In my perception the `halo` border is actually not cropped away, but returned. However, your description of `halo` makes more sense. We should never return anything invalid. I saw a problem with 'same' convolutional models, which do return the same output size but have a certain halo of (more or less) invalid border data. I suppose those models should incorporate a postprocessing crop and only return valid data, turning the same-convolution model into a valid one. @constantinpape, @m-novikov, what do you think?
> In my perception the `halo` border is actually not cropped away, but returned.

But what's the difference then of having a halo or not?
The way I understand halo here is that it's a hint to the consumer running prediction with this model, that indicates which "outer region" of the prediction should be discarded, for example due to boundary artifacts caused by same convolutions (but there could be lots of other reasons for doing this).
> But what's the difference then of having a halo or not?

I am confused, as to me you answer this question yourself:

> The way I understand halo here is that it's a hint to the consumer running prediction with this model, that indicates which "outer region" of the prediction should be discarded, for example due to boundary artifacts caused by same convolutions (but there could be lots of other reasons for doing this).

This is how I understood it, too. I have to say, though, that I like @esgomezm's definition more, which forces you to crop the outputs that contain a halo (either through valid convs or postprocessing) and instead return only valid output. Every consumer software would then just tile accordingly and would no longer have to deal with the two options of 1. ignore the halo and look at ugly border artefacts, or 2. crop the halo from what came back from the runner. It would amount to option 3: I have to tile correctly (overlapping if the model has a halo), but what I get back is all valid :-)
Ah ok, so now I understand the issue here, I think.
I think we all agree on the definition of halo in principle, but the question is who is responsible for cropping it:
1. It's cropped in the model already.
2. It's cropped by the consumer.
I don't think option 1 makes so much sense, because then we wouldn't need to specify the halo in the first place; we could just use the predictions from the model as they are.
But if we go with option 2, this implies that we don't change the output shape when a halo is specified.
> I don't think option 1 makes so much sense, because then we wouldn't need to specify the halo in the first place; we could just use the predictions from the model as they are.

Right, if we do want to handle it this way, we could just move it into `output_shape`, as in this case it's only applicable when calculating the actual output from a reference input.
Co-authored-by: FynnBe <[email protected]>
- `scale` *list of factors 'output_pix/input_pix' for each dimension*
- `offset` *position of origin wrt to input*
- `halo` *the halo to crop from the output tensor* (for example to crop away boundary effects or for tiling) (e.g. the valid domain of an output of `size = [256, 256]` and `halo = [32, 32]`, results in `output_shape = [256 - 2 * 32, 256 - 2 * 32] = [192, 192]`).\
The output shape is determined as `output_shape = reference_input_shape * scale + offset`.
The output shape is determined as `output_shape = reference_input_shape * scale + offset`.
The output shape is determined as `output_shape = reference_input_shape * scale - 2*halo`.

In the description of `offset` it says it describes the relative position to the reference input, slightly changing the definition of halo (really just clarifying its meaning, see above): halo should be subtracted twice instead.
Ok, I actually think we shouldn't include the halo in the `output_shape`, see #54 (comment).
Maybe we should make a separate issue for this discussion...
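To make the two readings concrete, here is a small illustrative sketch (not spec code) that computes the output shape with the numbers from the diff above, once with the halo cropped by the consumer and once folded into the output shape:

```python
# Two readings of halo, illustrated with the [256, 256] / [32, 32] numbers
# from the diff above (scale and offset chosen as identity for simplicity).
reference_input_shape = [256, 256]
scale = [1.0, 1.0]
offset = [0, 0]
halo = [32, 32]

# Option 2 (consumer crops): the model output keeps its full size ...
output_shape = [int(s * f + o) for s, f, o in zip(reference_input_shape, scale, offset)]
print(output_shape)  # [256, 256]

# ... and the consumer discards the halo border on each side afterwards.
valid_shape = [s - 2 * h for s, h in zip(output_shape, halo)]
print(valid_shape)   # [192, 192]

# Option 1 (model crops): the halo is already folded into the output shape,
# i.e. output_shape = reference_input_shape * scale - 2 * halo, as suggested above.
cropped_shape = [int(s * f) - 2 * h for s, f, h in zip(reference_input_shape, scale, halo)]
print(cropped_shape)  # [192, 192]
```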
Closing this PR in favor of #59
Main changes: