Extend description #54
Update repository
Suggested by @Fynn at #43 (comment)
Thanks for working on this @esgomezm.
Changes look mostly good, we should just clarify the definition of percentile.
@@ -133,13 +143,13 @@ For example:
- `weights` The weights for this model. Weights can be given for different formats, but should otherwise be equivalent.
- `weight_format` Format of this set of weights. Weight formats can define additional (optional or required) fields. See [supported_formats_and_operations.md#Weight Format](https://github.com/bioimage-io/configuration/blob/master/supported_formats_and_operations.md#weight_format)
- `[authors]` a list of authors.
- `[authors]` a list of authors. It refers to the ones who trained the model.
I think `It refers to the ones who trained the model` is not quite correct here, because the optional `authors` field here indicates who created this set of weights, which could also have happened by converting the weights from one format to another (so not by training the model).
Sorry, I thought those authors were the ones in the main description, as it says `The authors are the creators of the specifications and the primary points of contact.` I would expect that the ones who created the specification are also the ones responsible for the weights format.
Maybe we could agree on a general and a more specific description?
So the idea is that `authors` in the main one are the ones that created the model and uploaded the `model.yaml` etc.
After that happens, someone else might convert the weights to a different format and then add this to the same `model.yaml`. In this case, they should be listed here (because they are different from the "main" authors).
(I agree, we should explain this in more detail)
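To make the distinction concrete, here is a hedged sketch of the intended split (written as a Python dict purely for illustration; the field layout and the names are assumptions for this example, not the actual spec):

```python
# Hypothetical model description fragment, shown as a Python dict instead of
# YAML, purely to illustrate which authors go where.
model = {
    # top-level authors: created the model and uploaded the model.yaml
    "authors": ["Jane Doe"],
    "weights": [
        {
            "weight_format": "pytorch_state_dict",
            # no extra authors: same people as the top-level authors
        },
        {
            "weight_format": "onnx",
            # weights converted later by someone else, hence listed here
            "authors": ["John Smith"],
        },
    ],
}
```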
- `min_percentile` the lower percentile used for normalization, in range 0 to 100.
- `max_percentile` the upper percentile used for normalization, in range 0 to 100. Has to be bigger than `upper_percentile`.
- `min_percentile` the lower percentile used for normalization, in range 0 to 1
- `max_percentile` the upper percentile used for normalization, in range 0 to 1. Has to be bigger than `upper_percentile`.
percentile usually refers to a range from 0 to 100, not 0 to 1.
See https://en.wikipedia.org/wiki/Percentile.
I think it would be confusing if we change this from the common definition.
@frauzufall how do you handle this in CSBDeep?
No, @FynnBe did not suggest changing the value range to `[0, 1]`. Instead he suggested enforcing `max_percentile > 1` in order to avoid confusion about the value ranges.
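For reference, a minimal sketch of what percentile-based normalization along these lines could look like (the function name and default values are illustrative, not part of the spec); note that `np.percentile` also takes its argument in the 0 to 100 range, which fits the common definition:

```python
import numpy as np

def percentile_normalize(x, min_percentile=1.0, max_percentile=99.8, eps=1e-6):
    """Rescale x based on its min/max percentiles, given in the 0-100 range."""
    # guard along the lines suggested above: catches values accidentally given
    # as fractions in [0, 1]
    assert max_percentile > 1, "percentiles are expected in the range 0 to 100"
    lo = np.percentile(x, min_percentile)
    hi = np.percentile(x, max_percentile)
    return (x - lo) / (hi - lo + eps)
```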
@@ -62,12 +62,22 @@ Must be a list of *tensor specification keys*.
*tensor specification keys*:
- `name` tensor name
- `data_type` data type (e.g. float32)
- `data_range` tuple of (minimum, maximum)
- `axes` string of axes identifying characters from: btczyx
- `data_range` tuple of (minimum, maximum) (e.g. [-inf, inf])
Just a hunch: did you remove the inf example because you expect `data_range` to hold the actual minimum, maximum of the tensor?
I think for now it was intended as the theoretical data_range (that might be given by the `data_type`), which the example clarifies (a comment might be clearer, though). In combination with the data type this is of limited use, but the actual (minimum, maximum) would be confusing due to the batch dimension, which is arbitrary and thus has a random influence on these values. The actual, global data range might not be required, and thus unnecessary to compute, or not sufficient (e.g. for percentile normalization). I suggest not to change anything further for 0.3.0, and to keep it as the theoretical data range.
I think putting in the "data range so that the model yields sensible results" makes sense for the data range:
#55 (comment)
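As an aside, a small sketch of the "theoretical data range" reading, i.e. deriving `data_range` from `data_type` alone (the helper is hypothetical, only meant to illustrate why `[-inf, inf]` is a sensible example for float types):

```python
import numpy as np

def theoretical_data_range(data_type: str):
    """Return the theoretical (minimum, maximum) implied by the dtype alone."""
    dtype = np.dtype(data_type)
    if np.issubdtype(dtype, np.floating):
        # floats are unbounded in theory, hence the [-inf, inf] example in the spec
        return (-np.inf, np.inf)
    info = np.iinfo(dtype)  # integer types are bounded by the dtype
    return (info.min, info.max)

print(theoretical_data_range("float32"))  # (-inf, inf)
print(theoretical_data_range("uint8"))    # (0, 255)
```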
@@ -12,7 +12,7 @@ To get a quick overview of the config file, see an example file [here](https://g
## Model Specification
A model entry in the bioimage.io model zoo is defined by a configuration file `<model name>.model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `model.yaml`.
A model entry in the bioimage.io model zoo is defined by a configuration file `<optional model name>.model.yaml`. Once a model is packaged to a `<(mandatory) model name>.model.bioimage.io.zip` file the configuration file inside the .zip is only named `model.yaml`.
The model entry is not the same as the packaged model (where we agreed to `model.yaml` inside a `<model name>.model.bioimage.io.zip`, see #41). Keeping several models in one folder should be possible, as they may share other resources, etc.
On second thought, I am not sure we should mention this here; it's difficult to understand this without explaining the `...model.zip` (which we are not doing yet, but need to add here as well).
- `reference_input` *input tensor name*
- `scale` *list of factors 'output_pix/input_pix' for each dimension*
- `offset` *position of origin wrt to input*
- `halo` *the halo to crop from the output tensor* (for example to crop away boundary effects or for tiling) (e.g. the valid domain of an output of `size = [256, 256]` and `halo = [32, 32]`, results in `output_shape = [256 - 2 * 32, 256 - 2 * 32] = [192, 192]`).\
In my perception the `halo` border is actually not cropped away, but returned. However, your description of `halo` makes more sense. We should never return anything invalid. I saw a problem with 'same' convolutional models, which do return the same output size but have a certain halo of (more or less) invalid border data. I suppose those models should incorporate a postprocessing crop and only return valid data, turning the same-convolution model into a valid one. @constantinpape, @m-novikov, what do you think?
> In my perception the `halo` border is actually not cropped away, but returned.

But what's the difference then of having a halo or not?
The way I understand halo here is that it's a hint to the consumer running prediction with this model, that indicates which "outer region" of the prediction should be discarded, for example due to boundary artifacts caused by same convolutions (but there could be lots of other reasons for doing this).
> But what's the difference then of having a halo or not?

I am confused, as to me you answer this question yourself:

> The way I understand halo here is that it's a hint to the consumer running prediction with this model, that indicates which "outer region" of the prediction should be discarded, for example due to boundary artifacts caused by same convolutions (but there could be lots of other reasons for doing this).

This is how I understood it, too. I have to say, though, that I like @esgomezm's definition more, which forces you to crop the outputs that contain a halo (either through valid convs or postprocessing) and instead return only valid output. Every consumer software would then just tile accordingly and would no longer have to deal with the two options of 1. ignore the halo and look at ugly border artefacts, or 2. crop the halo from what came back from the runner. It would amount to option 3: I have to tile correctly (overlapping if the model has a halo), but what I get back is all valid :-)
Ah ok, so now I understand the issue here, I think.
I think we all agree on the definition of halo in principle, but the question is who is responsible for cropping it:
1. It's cropped in the model already.
2. It's cropped by the consumer.
I don't think option 1 makes so much sense, because then we wouldn't need to specify the halo in the first place; we could just use the predictions from the model as they are.
But if we go with option 2, this implies that we don't change the output shape when a halo is specified.
> I don't think option 1 makes so much sense, because then we wouldn't need to specify the halo in the first place; we could just use the predictions from the model as they are.

Right, if we do want to handle it this way, we could just move it into `output_shape`, as in this case it's only applicable when calculating the actual output from a reference input.
Co-authored-by: FynnBe <[email protected]>
- `scale` *list of factors 'output_pix/input_pix' for each dimension*
- `offset` *position of origin wrt to input*
- `halo` *the halo to crop from the output tensor* (for example to crop away boundary effects or for tiling) (e.g. the valid domain of an output of `size = [256, 256]` and `halo = [32, 32]`, results in `output_shape = [256 - 2 * 32, 256 - 2 * 32] = [192, 192]`).\
The output shape is determined as `output_shape = reference_input_shape * scale + offset`.
The output shape is determined as `output_shape = reference_input_shape * scale + offset`.
The output shape is determined as `output_shape = reference_input_shape * scale - 2*halo`.

In the description of `offset` it says it describes the relative position to the reference input, slightly changing the definition of halo (really just clarifying its meaning, see above): halo should be subtracted twice instead.
Ok, I actually think we shouldn't include the halo in the `output_shape`, see #54 (comment).
Maybe we should make a separate issue for this discussion...
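To make the two readings concrete, here is a small illustrative sketch (not spec code) that computes the output shape with the numbers from the diff above, once with the halo cropped by the consumer and once folded into the output shape:

```python
# Two readings of halo, illustrated with the [256, 256] / [32, 32] numbers
# from the diff above (scale and offset chosen as identity for simplicity).
reference_input_shape = [256, 256]
scale = [1.0, 1.0]
offset = [0, 0]
halo = [32, 32]

# Option 2 (consumer crops): the model output keeps its full size ...
output_shape = [int(s * f + o) for s, f, o in zip(reference_input_shape, scale, offset)]
print(output_shape)  # [256, 256]

# ... and the consumer discards the halo border on each side afterwards.
valid_shape = [s - 2 * h for s, h in zip(output_shape, halo)]
print(valid_shape)   # [192, 192]

# Option 1 (model crops): the halo is already folded into the output shape,
# i.e. output_shape = reference_input_shape * scale - 2 * halo, as suggested above.
cropped_shape = [int(s * f) - 2 * h for s, f, h in zip(reference_input_shape, scale, halo)]
print(cropped_shape)  # [192, 192]
```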
Closing this PR in favor of #59
Main changes: