Add note to "Function Resolution" section about function argument and result types #686
spec/formatting.md
Outdated
Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
that is interconvertible with the structure for the result of the _function_.
A few observations:
- It is misleading to refer to "formatting" functions here, given that their output may also be used for selection. Note how the text around this avoids that term.
- I at least have never encountered the term "interconvertible", and using it here should be avoided.
- There are cases where it makes sense for the operand type to be wider than the output type. For example, consider the resolution of `{$n :number}`, where the implicit input variable `$n` has a string value `'42'`. In a programming language like JS, it's easier for the custom function implementation to accept that its input may be a number, bigint, string, or object, rather than requiring each of those to come pre-wrapped as suggested by the SHOULD. See steps 6 and 7 [here](https://tc39.es/proposal-intl-messageformat/#sec-messageformat-numberfunctions) for the JS implementation details of this particular case.

I do fully agree that the output of a function is expected to be in a shape that's acceptable as input, but I am not convinced that the input must always have that same shape.
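To make the "wider input than output" point concrete, here is a minimal Python sketch. It is not taken from any implementation: `NumberValue`, `NumberOperand`, and `number_function` are hypothetical names, and merging options for an already-wrapped operand is only one possible policy.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional, Union


@dataclass
class NumberValue:
    """Hypothetical wrapped result of a :number-like function."""
    value: Decimal
    options: dict = field(default_factory=dict)


# The accepted operand type is wider than the result type.
NumberOperand = Union[int, float, Decimal, str, NumberValue]


def number_function(operand: NumberOperand, options: Optional[dict] = None) -> NumberValue:
    """Accept raw or pre-wrapped input; always return the wrapped shape."""
    opts = dict(options or {})
    if isinstance(operand, NumberValue):
        # Assumption: earlier options are kept and later ones win on conflict.
        return NumberValue(operand.value, {**operand.options, **opts})
    if isinstance(operand, bool):
        raise TypeError("booleans are not treated as numbers in this sketch")
    if isinstance(operand, (int, Decimal)):
        return NumberValue(Decimal(operand), opts)
    if isinstance(operand, (float, str)):
        # Decimal raises decimal.InvalidOperation for non-numeric strings.
        return NumberValue(Decimal(str(operand)), opts)
    raise TypeError(f"unsupported operand type: {type(operand).__name__}")
```

The result type is always acceptable as an operand, but callers are not forced to pre-wrap plain numbers or numeric strings.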
This is what you're trying to get at with your PRs for the post-45 period, @catamorphism. I recognize the problem here.
There is some question about whether the original operand value is available (transitive) or whether it becomes masked. That is, using @eemeli's example, `{$n :number}` where `$n` == "42", it seems reasonable that the value passed might be some number type. In a strongly typed language, this might vary depending on the input or it might be a specifically expansive type like BigDecimal. That would be up to the implementer. Probably the original string is not available (well... you can get it through the original variable).
Perhaps:
-Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
-that is interconvertible with the structure for the result of the _function_.
+Thus, implementations SHOULD provide a means for _functions_ to expose
+the resolved value of their _operand_
+and _functions_ SHOULD populate that mechanism
+with a data structure or type consistent with the set of implementation-defined
+types that they would support as input.
+> For example,
+> Suppose the value of the _variable_ `$n` were the string `1`.
+> The resolved value of the _operand_ assigned to `$num` in the example
+> below would be a numeric type (such as an `int` or `BigInteger` in Java).
+>```
+> .input {$n :number}
+> .local $num = {$n :integer}
+>```
> A few observations:
> 1. It is misleading to refer to "formatting" functions here, given that their output may also be used for selection. Note how the text around this avoids that term.

I wrote it that way since selector functions don't have an output (or at least, not in the way that the "output" is being described here.) Though I'll look and see if Addison's suggested changes address that.

> 2. I at least have never encountered the term "interconvertible", and using it here should be avoided.

OK.

> 3. There are cases where it makes sense for the operand type to be wider than the output type. For example, consider the resolution of `{$n :number}`, where the implicit input variable `$n` has a string value `'42'`. In a programming language like JS, it's easier for the custom function implementation to accept that its input may be a number, bigint, string, or object, rather than requiring each of those to come pre-wrapped as suggested by the SHOULD. See steps 6 and 7 [here](https://tc39.es/proposal-intl-messageformat/#sec-messageformat-numberfunctions) for the JS implementation details of this particular case.

That seems to not be ruled out by my original text, since even if the number formatter will never return a string "42", there might be other formatting functions that do just return their inputs, in some cases.
The goal here is to make a statement about all custom functions, and I think the only logical thing we can say about all of them is that (ignoring options and context), they have the type signature `T -> T`, for some `T`. In JS, `T` would mean something like `number` ∪ `BigInt` ∪ `string` ∪ `object`. That wouldn't need to be explicitly written down in JS, but in C++ (for example), you do need a type to describe the interface that functions need to implement.
I'm not sure if this is clear, but one reason why it's hard to be precise about this is that we're slipping between the object language sense of operand types and output types (as expressed in the specification of the function registry) and the meta-language sense (as expressed in an implementation's description of the calling conventions for functions).

> I do fully agree that the output of a function is expected to be in a shape that's acceptable as input, but I am not convinced that the input must always have that same shape.
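
As a rough illustration of the `T -> T` reading in the meta-language sense, the sketch below writes the calling convention down once and shows that any two functions obeying it can be chained. All names (`T`, `MessageFunction`, `passthrough`, `uppercase`, `apply_in_order`) are hypothetical, and the union is only a Python stand-in for the JS types mentioned above.

```python
from typing import Any, Callable, Mapping, Tuple, Union

# Stand-in for T = number ∪ BigInt ∪ string ∪ object (Python ints are unbounded).
T = Union[int, float, str, dict]
Options = Mapping[str, Any]
MessageFunction = Callable[[T, Options], T]


def passthrough(operand: T, options: Options) -> T:
    """Hypothetical function that returns its input unchanged."""
    return operand


def uppercase(operand: T, options: Options) -> T:
    """Hypothetical function that maps any operand to an upper-cased string."""
    return str(operand).upper()


def apply_in_order(operand: T, *calls: Tuple[MessageFunction, Options]) -> T:
    """Feed each result to the next function; valid because every step is T -> T."""
    for function, options in calls:
        operand = function(operand, options)
    return operand
```

For example, `apply_in_order("abc", (passthrough, {}), (uppercase, {}))` chains two functions without either knowing about the other.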
> Perhaps: [...]

I took some inspiration from this suggestion, but didn't use it exactly.
spec/formatting.md
Outdated
Since the result of a function call can be bound to a _variable_,
the output of one _function_ may be the input of another _function_.
We should use our own internal jargon here so that there is no confusion about what we're talking about. We should also avoid 2119 keywords, even if non-normatively formatted.
-Since the result of a function call can be bound to a _variable_,
-the output of one _function_ may be the input of another _function_.
+A _local-declaration_ binds the output of an _expression_ to a _variable_,
+thus the output of one _function_ is potentially the _operand_ of another.
I made a slightly different change instead (in order to avoid implying that an expression has output, which isn't really a concept in the spec); let me know what you think.
In the 2024-02-26 call we agreed that a revision of this PR would be "last in" for 45-alpha. If you care about this issue, following @catamorphism's update (which should appear after this comment) you must comment on the proposed text before COB 2024-02-27 in the America/Los_Angeles time zone. Please note that we will not be fixing this issue with the adopted note.

- Avoid using the word "interconvertible"
- Include example of composability
- Include example for how the function interface would be defined in a typed implementation language
- Add note about multiple interpretations of composition that requests feedback

I made some significant changes -- let me know what you think, @aphillips @eemeli. This got pretty bulky, but I think it's necessary to avoid being so general as to be non-useful to implementors.
I think our comments should be more cautious, even though in general I think we are on the right track here. We also need to describe resolved value handling carefully.
spec/formatting.md
Outdated
Thus, the output of one _function_ is potentially the _operand_
of another _function_. In other words, formatting functions
compose with each other.
For example, in
```
.input {$n :number minIntegerDigits=3}
.local {$n1 :number maxFractionDigits=3}
```
the second call to `:number` composes with the first call.
I think adding the idea of "compose" is going too far. We should be very conservative here. Also, this still has the "output" of a function in play. Perhaps:
-Thus, the output of one _function_ is potentially the _operand_
-of another _function_. In other words, formatting functions
-compose with each other.
-For example, in
-```
-.input {$n :number minIntegerDigits=3}
-.local {$n1 :number maxFractionDigits=3}
-```
-the second call to `:number` composes with the first call.
+Thus, the _operand_ for one _function_ might be the resolved value
+of another _function_.
+Further, the _options_ for one _expression_ might affect the operation
+of another.
+> For example, if the value of the variable `n` were `1`:
+> ```
+> .input {$n :number minimumFractionDigits=1}
+> .local $num = {$n :number minimumIntegerDigits=3}
+> .match {$num}
+> * {{Prints 001.0 for {$num}}}
+> ```
+> ... because the _options_ for the `.input` and `.local` are both applied to the value
+> for the purposes of both formatting and selection.
+> (Note that in English, fractional values match the plural rule `other`)
Also note that the `.local` in the original is incorrect.
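
For readers who want to see how "both sets of _options_ are applied" could work mechanically, here is a minimal Python sketch of that one interpretation. The names `resolve_number` and `format_number` are hypothetical, the resolved value is modelled as a plain `(value, options)` tuple, and the formatter is deliberately simplified: it handles only non-negative values and treats `minimumFractionDigits` as the exact number of fraction digits.

```python
from decimal import Decimal


def resolve_number(operand, options):
    """Return a (value, options) pair; options already attached to the operand are kept."""
    if isinstance(operand, tuple):
        value, earlier = operand
        # Assumption: earlier options carry over and later ones win on conflict.
        return value, {**earlier, **options}
    return Decimal(str(operand)), dict(options)


def format_number(resolved):
    """Format a resolved (value, options) pair to a string (simplified, non-negative only)."""
    value, options = resolved
    min_fraction = int(options.get("minimumFractionDigits", 0))
    min_integer = int(options.get("minimumIntegerDigits", 1))
    text = f"{value:.{min_fraction}f}"
    integer_part, separator, fraction_part = text.partition(".")
    return integer_part.zfill(min_integer) + separator + fraction_part


# .input {$n :number minimumFractionDigits=1}
n = resolve_number("1", {"minimumFractionDigits": 1})
# .local $num = {$n :number minimumIntegerDigits=3}
num = resolve_number(n, {"minimumIntegerDigits": 3})
print(format_number(num))  # 001.0
```

An implementation that dropped the earlier options in `resolve_number` would print `001` instead, which is exactly the kind of divergence under discussion.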
Fixed the indents in the above suggestion. Its example should be updated, because we should not be showing any matching (even if only with a single `*` variant) on a non-integral number. We could then also drop the parenthetical and irrelevant bit about `other`.
So the example could read:
.input {$n :number minimumFractionDigits=1}
.local $num = {$n :number minimumIntegerDigits=3}
{{Prints 001.0 for {$num}}}
I don't mind using the word "compose" here, but I'm also OK with dropping it and replacing with a description of the observed behavior, like @aphillips proposed.
I would, however, edit @aphillips's suggestion slightly:
-Thus, the _operand_ for one _function_ might be the resolved value
-of another _function_.
+Thus, the resolved value of one _function_ might be the _operand_
+or an _option_ value for another _function_.
Further, the _options_ for one _expression_ might affect the operation of another.
This does two things:
- The subject of the sentence is the resolved value, similar to the second sentence.
- It's not only operands that can be resolved values of other expressions; option values can as well, as per our syntax and data model.
> because we should not be showing any matching (even if only with a single `*` variant) on a non-integral number.

`:number` really does need to do plural matching on fractions and this isn't a problem. Your concern is, I think, about exact matching, which I do not show. I actually think the `.match` is important to call out precisely because the fraction bits happen to `$n`
Ah, fair point, I was confused. Including a `one {{This is never selected}}` would clarify things a bit.
LOL. Actually, I had it originally, but removed it before submitting--I thought for clarity!
"the resolved value of another function" doesn't really make sense here. The resolved value of a function is a thing that represents the function; the resolved value of a function applied to arguments (using more conventional declarative-language terminology) is something left unspecified, but is not a representation of a function itself. The spec doesn't really offer us the language for disambiguating the two.
Also, "the options for one expression might affect the operation of another" makes it sound like the language has some weird non-local side effects, which it does not. The options passed to a function affect its output (return value, etc.) and nothing else.
I'm not sure if replacing the original example with the one with `.match` clarifies the point, although more examples are generally a good thing. I wanted to show an example with two formatting functions, because that's where it might not be obvious that functions compose.
spec/formatting.md
Outdated
In addition, selector functions compose with formatting functions
in the sense that a selector function's _operand_
may be the output of any formatting function.

Implementations SHOULD provide a means for formatting functions
to compose with each other
and for formatting functions to compose with selector functions.
Implementations that provide a means for defining custom functions
SHOULD provide a means for those functions to return values
that contain enough information
(e.g. the resolved _operand_ and _option_ values
that the function was called with)
to be used as inputs to subsequent function calls.
For example, an implementation in a typed programming language
MAY define an interface that custom functions implement.
Such an interface SHOULD define an implementation-specific
argument type `T` and return type `U` for custom formatting functions
such that `U` can be coerced to `T` without loss of information.
The type `U`
(or a type that `U` can be coerced to without loss of information)
SHOULD also be the input type of custom selector functions.

> [!NOTE]
> In the Tech Preview, the spec leaves the behavior of the previous
> example implementation-dependent. Supposing that
> the external input variable `n` is bound to the string `"1"`,
> and that the implementation formats to a string,
> the formatted result of the following message:
>
> ```
> .input {$n :number minIntegerDigits=3}
> .local {$n1 :number maxFractionDigits=3}
> {{$n1}}
> ```
>
> is implementation-dependent.
> Depending on whether the options are preserved across
> the two calls to `:number`, a conformant implementation
> could produce either "001.000" or "1.000"
I would leave all of this out.
Agreed. This is too specific.
We can really say very little (at this point) about what "resolved value" means. As far as I can tell, it means "value of a variable derived from an expression". The form of that variable is as determined by the implementation of the function.
We cannot require any particular internal structure for the RV, just how it behaves.
1. If the RV is derived from an expression with a selection function X, it can match literal values (e.g. `:number` can match literals 0, 1, one, ...) producing a comparable value (aka relative weight).
2. If the RV is derived from an expression with a formatting function X, it can produce a formatted string or "parts".
3. Another function Y can use the RV as an operand, or as an option value. In these cases, it becomes clear that we want Y to be able to access information in RV. Exactly what that information is will depend on the expression that RV was derived from.

For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is. That is something to examine in the Tech Preview period. For example, for Stas's case, the RV might be able to supply the gender of an RV representing a noun. I think part of the function registry needs to specify what information a variable derived from an expression using a function can supply, and (at least logically) how to access that. In some implementations, I could see having an API so that `$bar = {$foo :funct1 gender=$fii case=$fii}` (logically) results in an internal call to `$fii.get("gender")` and a call to `$fii.get("case")`.
For each function, the function registry needs to specify how RVs deriving from that function behave in #1, #2, and #3.
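
To sketch what "specifying behavior rather than structure" could look like, the Python below describes the three behaviors as an interface and shows the logical `$fii.get("gender")` lookup. Everything here is hypothetical: `ResolvedValue`, `NounValue`, `match_keys`, `format_to_string`, and `resolve_option` are illustration names, not anything defined by the spec or registry.

```python
from typing import Any, Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class ResolvedValue(Protocol):
    """Behavioral contract only; the internal structure is left to the implementation."""

    def match_keys(self, keys: Sequence[str]) -> Sequence[str]: ...  # behavior 1: selection
    def format_to_string(self) -> str: ...                           # behavior 2: formatting
    def get(self, name: str) -> Optional[Any]: ...                   # behavior 3: expose info


class NounValue:
    """Hypothetical RV for a noun that can report grammatical information."""

    def __init__(self, text: str, **info: str) -> None:
        self.text = text
        self.info = info

    def match_keys(self, keys):
        return [key for key in keys if key == self.text]

    def format_to_string(self):
        return self.text

    def get(self, name):
        return self.info.get(name)


def resolve_option(name: str, value: Any) -> Any:
    """For `name=$var`, ask the variable's RV for `name`; literals pass through unchanged."""
    if isinstance(value, ResolvedValue):
        return value.get(name)
    return value
```

With `fii = NounValue("chaise", gender="feminine")`, `resolve_option("gender", fii)` yields `"feminine"`, while `resolve_option("case", fii)` yields `None` because this RV supplies no case information.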
> For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is.

I think this is a good way of putting it; as it is, the spec doesn't say anything about how to go from the return value of a function (as implemented in an underlying programming language) to an expression (in MessageFormat). If there's going to be a custom function interface at all, I don't know how to not specify that.
spec/formatting.md
Outdated
> Feedback from users and implementers is desired
> about whether to require one interpretation or the other
> in the spec.
And I would highlight this...
-> Feedback from users and implementers is desired
-> about whether to require one interpretation or the other
-> in the spec.
+> [!NOTE]
+> During the Technical Preview, feedback on how the registry
+> describes how _functions_ inherit resolved values and _options_
+> and what requirements this specification should impose
+> are highly desired.
I'm working on a revised version of this, but note that the word "inherit" should be completely off-limits since it connotes object-oriented inheritance, which would just confuse the issue.
spec/formatting.md
Outdated
In addition, selector functions compose with formatting functions
in the sense that a selector function's _operand_
may be the output of any formatting function.
As I mention in #686 (comment), we really should avoid "formatting function" and "selector function" as terms here, given that the resolved value of a function can theoretically be used for both.
This also applies to the next paragraph.
I agree with @aphillips's points above.
In my mind, this PR sufficiently clarifies the expectations of the resolution mechanism to be useful to implementors and to allow early adopters to experiment with function composition.
Note that there's also a mention of resolved values in lines 106-118, similar to the wording in line 223:
message-format-wg/spec/formatting.md
Lines 106 to 118 in 9caacb6
The form that resolved values take is implementation-dependent,
and different implementations MAY choose to perform different levels of resolution.
> For example, the resolved value of the _expression_ `{|0.40| :number style=percent}`
> could be an object such as
>
> ```
> { value: Number('0.40'),
>   formatter: NumberFormat(locale, { style: 'percent' }) }
> ```
>
> Alternatively, it could be an instance of an ICU4J `FormattedNumber`,
> or some other locally appropriate value.
spec/formatting.md
Outdated
Implementations that provide a means for defining custom functions
SHOULD provide a means for those functions to return values
that contain enough information
(e.g. the resolved _operand_ and _option_ values
that the function was called with)
to be used as inputs to subsequent function calls.
This is the key part and I wouldn't want to drop it, as other reviewers suggested. Furthermore, I think we need to say something somewhere about resolved values. Otherwise, @aphillips's example from his suggestion above ends up being the only place in the spec where we implicitly require that resolved values carry some extra information. I'd prefer to be explicit.
But we need to avoid being too specific about how it works. "return" has a specific meaning and I don't necessarily think it's a good idea to think of this as a function call's return value.
Our language is a declarative language and we call the stuff in an expression an "annotation" for a reason. Perhaps:
-Implementations that provide a means for defining custom functions
-SHOULD provide a means for those functions to return values
-that contain enough information
-(e.g. the resolved _operand_ and _option_ values
-that the function was called with)
-to be used as inputs to subsequent function calls.
+When resolving the value of an _operand_ or other variable
+(such as the value of an _option_)
+implementations SHOULD provide interfaces so that _annotation_
+applied in statements can accompany the value where appropriate.
+Implementations of _functions_ SHOULD define whether they change the
+value of the _operand_ in any way.
+Implementations of _functions_ SHOULD define whether the value of
+each _option_ is transitive or local.
I mention "statements" here because `.match` can be where the annotation is applied (not just `.local` or `.input`).
Some examples might help here. Here might be an example of non-transitive options (we might say that field options are non-transitive in the spec):
.input {$d :datetime weekday=short month=medium day=numeric}
.local $d1 = {$d :datetime hour=|2-digit| minute=numeric}
{{The transaction was on {$d} at {$d1}.}}
Here's a similar example:
.input {$d :datetime timeZone=|Europe/Paris|}
.local $date = {$d :datetime dateStyle=short}
.local $time = {$date :datetime timeStyle=short}
{{What does {$date} and {$time} print?}}
I think it is less surprising if `$time` forgets the earlier style annotation but not the time zone.
We don't currently define any functions that change the value of an operand, but we certainly might do:
.local $regular = {|Addison| :string}
.local $shouted = {$regular :transform to=uppercase}
.match {$shouted}
ADDISON {{... is selected... }}
* {{ ... }}
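
One way to picture the `$date` / `$time` example is a per-function declaration of which options survive into a later annotation. The Python below is only a sketch of that idea: `PRESERVED_OPTIONS` and `annotate` are hypothetical, resolved values are modelled as plain dictionaries, and the choice that `timeZone` is kept while style options are not simply mirrors the "less surprising" outcome described above.

```python
from typing import Any, Dict, Set

# Hypothetical registry data: options that a later call to the same function keeps.
PRESERVED_OPTIONS: Dict[str, Set[str]] = {"datetime": {"timeZone"}}


def annotate(operand: Dict[str, Any], function: str, options: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve `operand` with `function`, carrying over only the preserved options."""
    carried: Dict[str, Any] = {}
    if operand.get("function") == function:
        keep = PRESERVED_OPTIONS.get(function, set())
        carried = {k: v for k, v in operand.get("options", {}).items() if k in keep}
    return {"value": operand["value"], "function": function, "options": {**carried, **options}}


# .input {$d :datetime timeZone=|Europe/Paris|}
d = annotate({"value": "2024-02-26T17:00:00Z"}, "datetime", {"timeZone": "Europe/Paris"})
# .local $date = {$d :datetime dateStyle=short}
date = annotate(d, "datetime", {"dateStyle": "short"})
# .local $time = {$date :datetime timeStyle=short}
time = annotate(date, "datetime", {"timeStyle": "short"})
```

Under these assumptions `time["options"]` contains `timeZone` and `timeStyle` but not `dateStyle`.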
I see your point about "calling" and "returning".
> (...) so that annotation applied in statements can accompany the value where appropriate.

This sounds OK, although I'm now not sure about the exact meaning of "the value" here. I realize that this was the whole point of @catamorphism's opening the other PR...
Btw. I think expressions would be more appropriate than statements; it's possible to annotate inside placeholders, too.

> Implementations of functions SHOULD define whether they change the
> value of the operand in any way.
> Implementations of functions SHOULD define whether the value of
> each option is transitive or local.

Maybe move this part to the note about seeking feedback? We don't really know yet what defining these constraints and requirements should look like.
> Btw. I think expressions would be more appropriate than statements; it's possible to annotate inside placeholders, too.

Yes, but annotations in placeholders are terminal.
I agree about "the value".
I would also like to avoid using the word "transitive", because transitivity is a property of a mathematical relation and we haven't defined any such relations in the spec.
Re: this example:
.input {$d :datetime weekday=short month=medium day=numeric}
.local $d1 = {$d :datetime hour=|2-digit| minute=numeric}
{{The transaction was on {$d} at {$d1}.}}
It's not obvious to me that the options from the first `:datetime` call shouldn't be preserved -- should `$d1` be a formatted date with the union of all the options shown in both `:datetime` annotations? Or just the `hour` and `minute` options and defaults for the others? To me there's no "obvious" answer, though maybe it's obvious to people who have more experience with message formatting.
It's certainly worth thinking about what the example means, but I'm not sure if it's the best example if the goal is to show something where certain options obviously shouldn't be preserved.
"Change the value of the operand" should also be avoided, since (this being a purely functional language), functions never change the value of their operands.
On the whole, I understand the suggestion and think it's getting at something useful, but I don't see how it makes sense in the current framework of the spec, in which there is no data model for runtime values.
I think the first sentence ("...implementations SHOULD provide interfaces so that annotation applied in statements can accompany the value where appropriate.") is too vague, but I'm also not sure how to make it less vague. I would argue that the text in my revised commit is better because it's focused on functions, and the boundary between "inside the formatter" and "in a function implementation" is the one place where information is likely to get "lost".
I don't get the concept of "changing the value of the operand" even if I mentally replace that with "mapping the operand onto a new operand" or something like that. In your `:transform` example, I would understand `:transform` as returning something like this (if I borrow some of the machinery from #645 but simplify it for presentation):
AnnotatedFormattableValue {
source: AnnotatedFormattableValue { source: Formattable("Addison"), value: FormattedValue("Addison") },
formatter: "transform",
options: { "to": "uppercase" },
value: FormattedValue("ADDISON")
}
As with functions in other examples, it passes the same operand through (a wrapped thing ultimately representing the string "Addison"), and the transformed operand (the string "ADDISON") is the "formatted value" thing inside the structure representing the return value.
Obviously this isn't necessarily going to be the data model for runtime values, but still, it's not obvious to me that the result is just "ADDISON" with a bunch of options, rather than something that encapsulates the input "Addison", the output "ADDISON", and the options.
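To make the nesting concrete, here is a minimal TypeScript sketch of a result that wraps its operand instead of replacing it. All type and function names (`AnnotatedFormattableValue` as a TypeScript interface, `wrap`, `transform`) are hypothetical and only loosely follow the structure quoted above; nothing here is a data model defined by the spec.

```typescript
// Hypothetical shapes for illustration only; the spec defines no runtime data model.
interface Formattable { raw: string }
interface FormattedValue { text: string }

interface AnnotatedFormattableValue {
  source: Formattable | AnnotatedFormattableValue;
  formatter?: string;
  options?: Record<string, string>;
  value: FormattedValue;
}

// Wrap a plain input so that it has the same shape as a function result.
function wrap(raw: string): AnnotatedFormattableValue {
  return { source: { raw }, value: { text: raw } };
}

// Toy `:transform`: the operand is passed through untouched as `source`;
// only `value` carries the transformed text.
function transform(
  operand: AnnotatedFormattableValue,
  options: Record<string, string>,
): AnnotatedFormattableValue {
  const text =
    options["to"] === "uppercase"
      ? operand.value.text.toUpperCase()
      : operand.value.text;
  return { source: operand, formatter: "transform", options, value: { text } };
}

const transformed = transform(wrap("Addison"), { to: "uppercase" });
```

In this sketch the input "Addison", the output "ADDISON", and the options all remain reachable from `transformed`, which is the property the comment above is arguing for.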
Finally, the sentence about defining options that are or aren't preserved (I would use "preserved" rather than "transitive" as I already said) is important, but we don't have a way in the registry to declare that info, currently.
spec/formatting.md
Outdated
.input {$n :number minIntegerDigits=3}
.local {$n1 :number maxFractionDigits=3}
As @aphillips noticed, the `.local` needs a name. Making this suggestion to make sure we don't miss it, in case the other review comment isn't committed.
- .input {$n :number minIntegerDigits=3}
- .local {$n1 :number maxFractionDigits=3}
+ .input {$n :number minIntegerDigits=3}
+ .local $x = {$n1 :number maxFractionDigits=3}
spec/formatting.md
Outdated
> In the Tech Preview, the spec leaves the behavior of the previous
> example implementation-dependent. Supposing that
Isn't this PR still in scope of the Tech Preview? Or do you mean that it's implementation-dependent because of the SHOULD?
This PR's text is part of the Tech Preview.
We need to make clear that the 'return type' is not the 'thing bound to a variable'. The 'thing bound to a variable' also includes options.
That's one of the questions, though. If the contract with function implementations is that they are responsible for returning a thing containing all the options they want to preserve, then whatever a function returns is the thing bound to a variable (modulo possible lazy evaluation). If function implementations have no such responsibility, then yes, the formatter has to do additional processing to transform the "thing returned by a function" into the "thing bound to a variable".
spec/formatting.md
Outdated
In addition, selector functions compose with formatting functions
in the sense that a selector function's _operand_
may be the output of any formatting function.

Implementations SHOULD provide a means for formatting functions
to compose with each other
and for formatting functions to compose with selector functions.
Implementations that provide a means for defining custom functions
SHOULD provide a means for those functions to return values
that contain enough information
(e.g. the resolved _operand_ and _option_ values
that the function was called with)
to be used as inputs to subsequent function calls.
For example, an implementation in a typed programming language
MAY define an interface that custom functions implement.
Such an interface SHOULD define an implementation-specific
argument type `T` and return type `U` for custom formatting functions
such that `U` can be coerced to `T` without loss of information.
The type `U`
(or a type that `U` can be coerced to without loss of information)
SHOULD also be the input type of custom selector functions.

> [!NOTE]
> In the Tech Preview, the spec leaves the behavior of the previous
> example implementation-dependent. Supposing that
> the external input variable `n` is bound to the string `"1"`,
> and that the implementation formats to a string,
> the formatted result of the following message:
>
> ```
> .input {$n :number minIntegerDigits=3}
> .local {$n1 :number maxFractionDigits=3}
> {{$n1}}
> ```
>
> is implementation-dependent.
> Depending on whether the options are preserved across
> the two calls to `:number`, a conformant implementation
> could produce either "001.000" or "1.000"
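The two behaviours the note allows can be sketched as follows, in TypeScript. This is only an illustration: `NumberValue`, `numberFn`, and the `preserve` flag are hypothetical names, and the sketch tracks only which options reach the second `:number` call, not the actual digit formatting.

```typescript
type Options = Record<string, number | string>;

// Hypothetical resolved value: the underlying number plus the options that
// earlier annotations attached to it.
interface NumberValue {
  value: number;
  options: Options;
}

// Toy `:number`: `preserve` is an illustrative switch between the two
// behaviours, not an option defined anywhere in the spec.
function numberFn(
  operand: number | string | NumberValue,
  options: Options,
  preserve: boolean,
): NumberValue {
  if (typeof operand === "object") {
    // The operand is itself the result of an earlier annotation.
    const inherited = preserve ? operand.options : {};
    return { value: operand.value, options: { ...inherited, ...options } };
  }
  return { value: Number(operand), options: { ...options } };
}

const first = numberFn("1", { minIntegerDigits: 3 }, true);
const preserved = numberFn(first, { maxFractionDigits: 3 }, true);
const dropped = numberFn(first, { maxFractionDigits: 3 }, false);
```

With `preserve` set, the second call sees both `minIntegerDigits` and `maxFractionDigits`; without it, only `maxFractionDigits` survives. Either is permitted by the text as written.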
Agreed. This is too specific.
We can really say very little (at this point) about what "resolved value" means. As far as I can tell, it means "value of a variable derived from an expression". The form of that variable is as determined by the implementation of the function.
We cannot require any particular internal structure for the RV, just how it behaves.
1. If the RV is derived from an expression with a selection function X, it can match literal values (e.g. `:number` can match literals 0, 1, one, ...) producing a comparable value (aka relative weight).
2. If the RV is derived from an expression with a formatting function X, it can produce a formatted string or "parts".
3. Another function Y can use the RV as an operand, or as an option value. In these cases, it becomes clear that we want Y to be able to access information in RV. Exactly what that information is will depend on the expression that RV was derived from.
For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is. That is something to examine in the Tech Preview period. For example, for Stas's case, the RV might be able to supply the gender of an RV representing a noun. I think part of the function registry needs to specify what information a variable derived from an expression using a function can supply, and (at least logically) how to access that. In some implementations, I could see having an API so that `$bar = {$foo :funct1 gender=$fii case=$fii}` (logically) results in an internal call to `$fii.get("gender")` and a call to `$fii.get("case")`.
For each function, the function registry needs to specify how RVs deriving from that function behave in #1, #2, and #3.
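The three behaviours listed above could be expressed as one interface. The following TypeScript sketch is an assumption for discussion: `ResolvedValue`, `selectKeys`, `formatToString`, `noun`, and the toy `funct1` are all made-up names, and only the `get("gender")` call shape comes from the comment itself.

```typescript
// Hypothetical interface; the spec does not define one.
interface ResolvedValue {
  // 1. Selection: return the subset of keys this value matches, best first.
  selectKeys?(keys: string[]): string[];
  // 2. Formatting: produce a string (a "parts" variant is omitted here).
  formatToString?(): string;
  // 3. Outbound information for later expressions, e.g. "gender" or "case".
  get(property: string): unknown;
}

// A toy resolved value for a noun carrying grammatical metadata.
function noun(text: string, info: Record<string, string>): ResolvedValue {
  return {
    formatToString: () => text,
    get: (property) => info[property],
  };
}

// A toy function that reads metadata from an option value, as in
// `{$foo :funct1 gender=$fii}` leading to a logical `$fii.get("gender")`.
function funct1(
  operand: ResolvedValue,
  options: { gender: ResolvedValue },
): ResolvedValue {
  const gender = options.gender.get("gender");
  const base = operand.formatToString?.() ?? "";
  return {
    formatToString: () => `${base} [${String(gender)}]`,
    get: (property) => (property === "gender" ? gender : operand.get(property)),
  };
}

const fii = noun("maison", { gender: "feminine" });
const bar = funct1(noun("grand", {}), { gender: fii });
```

Here `bar` can in turn answer `get("gender")`, so a later expression could use it as an operand or option value without knowing how it was produced.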
Implementations SHOULD provide a means for formatting functions
to compose with each other
and for formatting functions to compose with selector functions.
Implementations that provide a means for defining custom functions
I agree with the sense of this. We should also note somewhere that any two functions are not required to meaningfully compose; there is no requirement or expectation that the following make a meaningful composition:
.input {$date :datetime}
.local $person = {$date :x:personname}
+1 to this. I'm not sure if I was able to explain this in yesterday's call, but I definitely agree that there should be no requirement for any two functions to meaningfully compose, as you've nicely put it. This is why I've been using the term cooperative composition (although I'm happy to call it something else), i.e. the functions must be aware of each other's interface in order to meaningfully compose.
+100 although there are some different things going on here. In @macchiati's example, the types don't match. The `.local` declaration should emit an Invalid Expression error because `$date` isn't of a supported type for `:x:personname` (presumably).
In other cases, the types can match but the annotations might not be supported:
.input {$date :date style=long}
.local $foo = {$date :time} // not style=long because style means "dateStyle"
For now, we should basically say something like "functions can decide what operands to support" and "functions can decide what functions or function options to support". We permit "composition" without requiring it or prohibiting it.
In the default registry we can add (in the TP period) some guidance for date/time/datetime and number functions as a guide.
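As a rough illustration of "functions can decide what operands to support", the TypeScript sketch below has a toy `:x:personname` reject a date operand. The `Resolved` and `Outcome` types and the error text are assumptions; a real implementation would report the error through whatever mechanism it already uses.

```typescript
// Hypothetical tagged value; real implementations will have their own types.
interface Resolved {
  kind: "date" | "person" | "string";
  value: unknown;
}

// Result of attempting to apply a function to an operand.
type Outcome =
  | { ok: true; value: Resolved }
  | { ok: false; error: string };

// Toy `:x:personname`: it alone decides which operand kinds it accepts.
function personName(operand: Resolved): Outcome {
  if (operand.kind !== "person") {
    return {
      ok: false,
      error: `Invalid Expression: :x:personname does not accept a ${operand.kind} operand`,
    };
  }
  return { ok: true, value: operand };
}

const dateValue: Resolved = { kind: "date", value: new Date(0) };
const personValue: Resolved = {
  kind: "person",
  value: { given: "Ada", family: "Lovelace" },
};

const rejected = personName(dateValue);
const accepted = personName(personValue);
```

Composition is neither required nor prohibited in this sketch: the formatter passes along whatever the previous function produced, and each function accepts or rejects it.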
Agreed.
spec/formatting.md
Outdated
  Such an interface SHOULD define an implementation-specific
- argument type `T` and return type `U` for custom formatting functions
+ argument type `T` and return type `U`
+ for implementations of formatting functions
  such that `U` can be coerced to `T` without loss of information.
Look at the following mutating function. Do we expect to be able to extract the month from `$foo`?
.local $foo = {$date :extractDay calendar=gregorian}
.local $fii = {$foo :extractMonth calendar=gregorian}
Would the above text mean that :extractDay SHOULD have a "return type" such that it preserves the month value from $date?
No; to be concrete, I'll use a simplified version of the C++ implementation.
`:extractDay` and `:extractMonth` would be implemented as classes that provide a `format()` method. The type of that method (ignoring options and context, for simplicity) is:
FormattedPlaceholder format(FormattedPlaceholder&& argument);
A particular instance of the type `FormattedPlaceholder` could contain any set of options, or none.
Sorry for talking in implementation terms, but I'm not sure how else to say it, given that my goal here is to suggest that implementors not do the "obvious" thing, which would be to define something like a `FormatterInput` and `FormatterOutput` type, which are not coercible to each other, and define the interface like:
FormatterOutput format(FormatterInput&& argument);
(I would expect every implementation to have some sort of interface between the formatter and calls to custom functions (I say "custom" because built-in functions could be handled inside the formatter), so I think it's meaningful to refer to it in this note. In a unityped implementation language like JavaScript, there's less of a hazard since there's only one type, trivially guaranteeing the property stated in the note.)
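For comparison, here is the same idea as a TypeScript sketch: one type serves as both the argument and the result of every function, so results chain without conversion. `Placeholder` is a hypothetical stand-in for something like `FormattedPlaceholder`, and the two toy functions ignore their `calendar` option entirely.

```typescript
// Hypothetical stand-in for a type like FormattedPlaceholder: one type used
// for both the argument and the result of every function.
interface Placeholder {
  source: unknown;
  options: Record<string, string>;
  formatted?: string;
}

type FormatFn = (arg: Placeholder, options: Record<string, string>) => Placeholder;

// Toy `:extractDay`. Passing the full date through as `source` is this
// sketch's own choice; a function is free to keep less.
const extractDay: FormatFn = (arg, options) => {
  const date = arg.source as Date;
  return { source: date, options, formatted: String(date.getUTCDate()) };
};

// Toy `:extractMonth`, with the same argument and result type.
const extractMonth: FormatFn = (arg, options) => {
  const date = arg.source as Date;
  return { source: date, options, formatted: String(date.getUTCMonth() + 1) };
};

// Because the result type equals the argument type, results chain directly.
const dateArg: Placeholder = {
  source: new Date(Date.UTC(2024, 1, 29)),
  options: {},
};
const foo = extractDay(dateArg, { calendar: "gregorian" });
const fii = extractMonth(foo, { calendar: "gregorian" });
```

With separate, non-coercible input and output types, the last line would not type-check, which is the hazard the comment above describes.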
This is still in an example and must not be normative. Even if it weren't an example, it's way too specific to put in the specification.
If we want to provide guidance to users/implementers, instead of pouring stuff into the spec, we should write some user guide material.
I wasn't suggesting the example be added to the spec, but rather using it to illustrate that "such that `U` can be coerced to `T` without loss of information" is not something that should be in the spec.
In the hopes of making the discussion easier to follow, I'll summarize the feedback and how I did or didn't address it in 5803dd8:
There's one more comment from @aphillips that I didn't address yet; will do that in another commit (edit: not a commit, but a comment instead).
Co-authored-by: Richard Gibson <[email protected]>
As a meta-comment, my objective with this PR is to help implementers not paint themselves into a corner, so while it would fit more naturally with the spec to use metalanguage (talking about annotations, expressions, resolved values, etc.) rather than object language, in the absence of the machinery that would be needed for that, I used the escape hatch of object language (e.g. "function implementations") instead.
Overall this is good, but I've left one "Blocking" comment below that needs to be addressed before merging this. If that is done, please don't hesitate to dismiss this review when merging, as I might not be awake at end-of-day Pacific Time.
Co-authored-by: Eemeli Aro <[email protected]>
(as contributor)
I really believe that less is more here. We're getting into the weeds of how "functions" are "called" and what their "inputs" and "outputs" are. This is imperative programming thinking.
In my opinion, we should focus on calling out "here be dragons" and following up with carefully considered text across the spec (or with user guide like material)
spec/formatting.md
Outdated
the output of the first call to `:number`
is the input of the second call to `:number`.
This is saying outputs and inputs, which, as I noted before, isn't correct.
I tried to address this in 8a5f589
...to bundle their results with a "parsed" version of their input
Co-authored-by: Eemeli Aro <[email protected]>
I don't quite know how to specify a foreign function interface without talking about inputs and outputs (or arguments and return values). This is not specific to imperative languages; all functional languages have a concept of function application, arguments, and return values.
I believe that my concerns about this PR have been sufficiently addressed.
(chair hat) I have two approvals, which is what I need to merge this. I'm going to make a last call in this comment, ignoring my own review. Any objection to merging this?
Co-authored-by: Addison Phillips <[email protected]>
Co-authored-by: Addison Phillips <[email protected]>
Co-authored-by: Mark Davis <[email protected]>
Co-authored-by: Mark Davis <[email protected]>
Co-authored-by: Mark Davis <[email protected]>
@@ -222,6 +222,71 @@ the following steps are taken:

The form that resolved _operand_ and _option_ values take is implementation-defined.

A _declaration_ binds the resolved value of an _expression_
Not asking for any change in the tech preview, but this will definitely need to be revised afterwards.
The whole notion of a 'resolved value' is very muddy, unless it literally means "what is bound to a variable by an expression declaration", which means it also includes selection/formatting options and their values whenever those are carried over.
#645 is an attempt to address this. (But it could go even further.)
LGTM after the most recent updates.
I acknowledge and agree that we need to spend more time and deliberation specifying this fully for 46. That said, I do think that the current wording does a good job of giving guidance to implementors and of stating that it's a work-in-progress.
Co-authored-by: Mark Davis <[email protected]>
Currently, the introduction to the spec states:
And the "Expression and Markup Resolution" section says:
However, the "Function Resolution" section is not as clear as it could be about the implications of these requirements for the interface with formatting functions.
I added some text that effectively implies that functions have the same operand type and result type. If this wasn't the case, it wouldn't make sense to bind the result of a function to a variable and use that result as an operand for another function call.
I think it's useful guidance for implementors to state this explicitly rather than letting it be inferred from the two existing passages that I quoted.
This relates to #515; some version of #645 would make this much more precise, but it's a start.
The reason this came up was that I was discussing the API for custom functions with members of the ICU TC, who were puzzled at first about why formatting functions take and return the same type (in my implementation).
If an implementor instead requires custom functions to take a "formattable" thing as an argument, and return a "formatted" thing, examples like the first use case in #515 (with the two calls to `:number`) wouldn't work. In my opinion, the current spec doesn't say that you can't do this -- it could be read as saying that a "resolved operand value" as mentioned in step 4, and the value referred to by the text "resolve the value of the expression as the result of that function call", might refer to different kinds of "resolved values".
(Taking and returning the same type makes formatting functions seem more like "transformers" than "formatters" -- and that's in line with the text in syntax.md saying, "Functions are used to evaluate, format, select, or otherwise process data values during formatting." -- but changing the name might be more controversial.)