Add note to "Function Resolution" section about function argument and result types #686

catamorphism · 2024-02-23T00:35:52Z

Currently, the introduction to the spec states:

> The form of the resolved value is implementation defined and the
> value might not be evaluated or formatted yet.
> However, it needs to be "formattable", i.e. it contains everything required
> by the eventual formatting.

And the "Expression and Markup Resolution" section says:

> Since a variable can be referenced in different ways later,
> implementations SHOULD NOT immediately fully format the value for output.

However, the "Function Resolution" section is not as clear as it could be about the implications of these requirements for the interface with formatting functions.

I added some text that effectively implies that functions have the same operand type and result type. If this wasn't the case, it wouldn't make sense to bind the result of a function to a variable and use that result as an operand for another function call.

I think it's useful guidance for implementors to state this explicitly rather than letting it be inferred from the two existing passages that I quoted.

This relates to #515; some version of #645 would make this much more precise, but it's a start.

The reason this came up was that I was discussing the API for custom functions with members of the ICU TC, who were puzzled at first about why formatting functions take and return the same type (in my implementation).

If an implementor instead requires custom functions to take a "formattable" thing as an argument, and return a "formatted" thing, examples like the first use case in #515 (with the two calls to :number) wouldn't work. In my opinion, the current spec doesn't say that you can't do this -- it could be read as saying that a "resolved operand value" as mentioned in step 4, and the value referred to by the text "resolve the value of the expression as the result of that function call", might refer to different kinds of "resolved values".
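The difference between the two interface shapes can be sketched in Python. This is only an illustration under stated assumptions: `Resolved`, `number`, and `number_to_string` are hypothetical names, not part of the spec or of any implementation, and the option handling is deliberately minimal.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resolved:
    """Hypothetical resolved value: the raw operand plus the options seen so far."""
    value: object
    options: dict = field(default_factory=dict)


def number(operand: Resolved, **options) -> Resolved:
    """A T -> T function: takes a Resolved and returns a Resolved.

    Formatting to a string is deferred, so the result can be an operand again.
    """
    return Resolved(float(operand.value), {**operand.options, **options})


def number_to_string(operand: Resolved, **options) -> str:
    """A "formattable -> formatted" function: the result is only a string."""
    return str(float(operand.value))


n = Resolved("1")
# Two chained calls, as in the first use case of #515.
chained = number(number(n, minIntegerDigits=3), maxFractionDigits=3)
```

With the second shape, the string returned by `number_to_string` has no `value` or `options` to pass along, so it cannot be handed to `number` as an operand.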

(Taking and returning the same type makes formatting functions seem more like "transformers" than "formatters" -- and that's in line with the text in syntax.md saying, "Functions are used to evaluate, format, select, or otherwise process data values during formatting." -- but changing the name might be more controversial.)

… result types

eemeli · 2024-02-23T11:49:50Z

spec/formatting.md

+   Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
+   that is interconvertible with the structure for the result of the _function_.


A few observations:

1. It is misleading to refer to "formatting" functions here, given that their output may also be used for selection. Note how the text around this avoids that term.

2. I at least have never encountered the term "interconvertible", and using it here should be avoided.

3. There are cases where it makes sense for the operand type to be wider than the output type. For example, consider the resolution of `{$n :number}`, where the implicit input variable `$n` has a string value `'42'`. In a programming language like JS, it's easier for the custom function implementation to accept that its input may be a number, bigint, string, or object, rather than requiring each of those to come pre-wrapped as suggested by the SHOULD. See steps 6 and 7 [here](https://tc39.es/proposal-intl-messageformat/#sec-messageformat-numberfunctions) for the JS implementation details of this particular case.

I do fully agree that the output of a function is expected to be in a shape that's acceptable as input, but I am not convinced that the input must always have that same shape.
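A minimal Python sketch of an operand type that is wider than the result type, assuming a hypothetical `NumberValue` wrapper and a hypothetical `number` function (neither comes from the spec or from the JS proposal):

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Union


@dataclass(frozen=True)
class NumberValue:
    """Hypothetical wrapped result of a :number-like function."""
    value: Decimal


# The accepted operand type is wider than the result type.
NumberOperand = Union[int, str, Decimal, NumberValue]


def number(operand: NumberOperand) -> NumberValue:
    """Accept raw or wrapped input, always return the wrapped shape."""
    if isinstance(operand, NumberValue):
        return operand
    return NumberValue(Decimal(operand))


first = number("42")    # raw string operand, as in {$n :number}
second = number(first)  # the result is itself an acceptable operand
```

The result shape is always acceptable as input, but callers are not forced to pre-wrap raw values.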

> A few observations:
>
> 1. It is misleading to refer to "formatting" functions here, given that their output may also be used for selection. Note how the text around this avoids that term.

I wrote it that way since selector functions don't have an output (or at least, not in the way that the "output" is being described here). Though I'll look and see if Addison's suggested changes address that.

> 2. I at least have never encountered the term "interconvertible", and using it here should be avoided.

OK.

> 3. There are cases where it makes sense for the operand type to be wider than the output type. For example, consider the resolution of `{$n :number}`, where the implicit input variable `$n` has a string value `'42'`. In a programming language like JS, it's easier for the custom function implementation to accept that its input may be a number, bigint, string, or object, rather than requiring each of those to come pre-wrapped as suggested by the SHOULD. See steps 6 and 7 [here](https://tc39.es/proposal-intl-messageformat/#sec-messageformat-numberfunctions) for the JS implementation details of this particular case.

That seems to not be ruled out by my original text, since even if the number formatter will never return a string "42", there might be other formatting functions that do just return their inputs, in some cases.

The goal here is to make a statement about all custom functions, and I think the only logical thing we can say about all of them is that (ignoring options and context), they have the type signature T -> T, for some T. In JS, T would mean something like number ∪ BigInt ∪ string ∪ object. That wouldn't need to be explicitly written down in JS, but in C++ (for example), you do need a type to describe the interface that functions need to implement.

I'm not sure if this is clear, but one reason why it's hard to be precise about this is that we're slipping between the object language sense of operand types and output types (as expressed in the specification of the function registry) and the meta-language sense (as expressed in an implementation's description of the calling conventions for functions).

> I do fully agree that the output of a function is expected to be in a shape that's acceptable as input, but I am not convinced that the input must always have that same shape.

> Perhaps: [...]

I took some inspiration from this suggestion, but didn't use it exactly.

aphillips · 2024-02-23T15:18:42Z

spec/formatting.md

+   Since the result of a function call can be bound to a _variable_,
+   the output of one _function_ may be the input of another _function_.


We should use our own internal jargon here so that there is no confusion about what we're talking about. We should also avoid RFC 2119 keywords, even if non-normatively formatted.

Suggested change

-   Since the result of a function call can be bound to a _variable_,
-   the output of one _function_ may be the input of another _function_.
+   A _local-declaration_ binds the output of an _expression_ to a _variable_,
+   thus the output of one _function_ is potentially the _operand_ of another.

I made a slightly different change instead (in order to avoid implying that an expression has output, which isn't really a concept in the spec); let me know what you think.

aphillips · 2024-02-23T15:33:17Z

spec/formatting.md

+   Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
+   that is interconvertible with the structure for the result of the _function_.


This is what you're trying to get at with your PRs for the post-45 period, @catamorphism. I recognize the problem here.

There is some question about whether the original operand value is available (transitive) or whether it becomes masked. That is, using @eemeli's example, {$n :number} where $n=="42", it seems reasonable that the value passed might be some number type. In a strongly typed language, this might vary depending on the input or it might be a specifically expansive type like BigDecimal. That would be up to the implementer. Probably the original string is not available (well... you can get it through the original variable).

Perhaps:

Suggested change

-   Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
-   that is interconvertible with the structure for the result of the _function_.
+   Thus, implementations SHOULD provide a means for _functions_ to expose
+   the resolved value of their _operand_
+   and _functions_ SHOULD populate that mechanism
+   with a data structure or type consistent with the set of implementation-defined
+   types that they would support as input.
+   > For example,
+   > Suppose the value of the _variable_ `$n` were the string `1`.
+   > The resolved value of the _operand_ assigned to `$num` in the example
+   > below would be a numeric type (such as an `int` or `BigInteger` in Java).
+   >```
+   > .input {$n :number}
+   > .local $num = {$n :integer}
+   >```

aphillips · 2024-02-26T19:56:51Z

In the 2024-02-26 call we agreed that a revision of this PR would be "last in" for 45-alpha. If you care about this issue, following @catamorphism's update (which should appear after this comment) you must comment on the proposed text before COB 2024-02-27 in the America/Los_Angeles time zone. Please note that we will not be fixing this issue with the adopted note.

- Avoid using the word "interconvertible"
- Include example of composability
- Include example for how the function interface would be defined in a typed implementation language
- Add note about multiple interpretations of composition that requests feedback

catamorphism · 2024-02-26T20:54:20Z

I made some significant changes -- let me know what you think, @aphillips @eemeli .

This got pretty bulky, but I think it's necessary to avoid being so general as to be non-useful to implementors.

aphillips

I think our comments should be more cautious, even though in general I think we are on the right track here. We also need to describe resolved value handling carefully.

aphillips · 2024-02-27T04:47:39Z

spec/formatting.md

+   Thus, the output of one _function_ is potentially the _operand_
+   of another _function_. In other words, formatting functions
+   compose with each other.
+   For example, in
+   ```
+   .input {$n :number minIntegerDigits=3}
+   .local {$n1 :number maxFractionDigits=3}
+   ```
+   the second call to `:number` composes with the first call.


I think adding the idea of "compose" is going too far. We should be very conservative here. Also, this still has the "output" of a function in play. Perhaps:

Suggested change

-   Thus, the output of one _function_ is potentially the _operand_
-   of another _function_. In other words, formatting functions
-   compose with each other.
-   For example, in
-   ```
-   .input {$n :number minIntegerDigits=3}
-   .local {$n1 :number maxFractionDigits=3}
-   ```
-   the second call to `:number` composes with the first call.
+   Thus, the _operand_ for one _function_ might be the resolved value
+   of another _function_.
+   Further, the _options_ for one _expression_ might affect the operation
+   of another.
+   > For example, if the value of the variable `n` were `1`:
+   > ```
+   > .input {$n :number minimumFractionDigits=1}
+   > .local $num = {$n :number minimumIntegerDigits=3}
+   > .match {$num}
+   > * {{Prints 001.0 for {$num}}}
+   > ```
+   > ... because the _options_ for the `.input` and `.local` are both applied to the value
+   > for the purposes of both formatting and selection.
+   > (Note that in English, fractional values match the plural rule `other`)

Also note that the .local in the original is incorrect.

Fixed the indents in the above suggestion. Its example should be updated, because we should not be showing any matching (even if only with a single * variant) on a non-integral number. Then we could also drop the irrelevant parenthetical about `other`.

So the example could read:

    .input {$n :number minimumFractionDigits=1}
    .local $num = {$n :number minimumIntegerDigits=3}
    {{Prints 001.0 for {$num}}}
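To make the observed behavior in this example concrete, here is a rough Python sketch of the two readings: options from the `.input` annotation either are or are not carried into the `.local` one. `NumberValue`, `number`, `format_number`, and the `preserve` flag are all hypothetical, and the formatting logic only handles the two options used in the example for a non-negative value.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class NumberValue:
    """Hypothetical resolved value: a number plus the options applied so far."""
    value: Decimal
    options: dict = field(default_factory=dict)


def number(operand, preserve: bool, **options) -> NumberValue:
    """Resolve a :number-like annotation, optionally keeping earlier options."""
    if isinstance(operand, NumberValue):
        earlier = operand.options if preserve else {}
        return NumberValue(operand.value, {**earlier, **options})
    return NumberValue(Decimal(operand), dict(options))


def format_number(rv: NumberValue) -> str:
    """Apply the two options used in the example; everything else is ignored."""
    frac = rv.options.get("minimumFractionDigits", 0)
    digits = rv.options.get("minimumIntegerDigits", 1)
    text = f"{rv.value:.{frac}f}"
    integer, _, fraction = text.partition(".")
    integer = integer.zfill(digits)
    return f"{integer}.{fraction}" if fraction else integer


def resolve_num(preserve: bool) -> str:
    n = number("1", preserve, minimumFractionDigits=1)   # .input
    num = number(n, preserve, minimumIntegerDigits=3)    # .local $num
    return format_number(num)
```

In this sketch `resolve_num(True)` yields `001.0`, matching the message text, while `resolve_num(False)` yields `001`.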

> because we should not be showing any matching (even if only with a single * variant) on a non-integral number.

:number really does need to do plural matching on fractions and this isn't a problem. Your concern is, I think, about exact matching, which I do not show. I actually think the .match is important to call out precisely because the fraction bits happen to $n

Ah, fair point, I was confused. Including a one {{This is never selected}} would clarify things a bit.

LOL. Actually, I had it originally, but removed it before submitting--I thought for clarity!

"the resolved value of another function" doesn't really make sense here. The resolved value of a function is a thing that represents the function; the resolved value of a function applied to arguments (using more conventional declarative-language terminology) is something left unspecified, but is not a representation of a function itself. The spec doesn't really offer us the language for disambiguating the two.

Also, "the options for one expression might affect the operation of another" makes it sound like the language has some weird non-local side effects, which it does not. The options passed to a function affect its output (return value, etc.) and nothing else.

I'm not sure if replacing the original example with the one with .match clarifies the point, although more examples are generally a good thing. I wanted to show an example with two formatting functions, because that's where it might not be obvious that functions compose.

aphillips · 2024-02-27T04:48:21Z

spec/formatting.md

+   In addition, selector functions compose with formatting functions
+   in the sense that a selector function's _operand_
+   may be the output of any formatting function.
+
+   Implementations SHOULD provide a means for formatting functions
+   to compose with each other
+   and for formatting functions to compose with selector functions.
+   Implementations that provide a means for defining custom functions
+   SHOULD provide a means for those functions to return values
+   that contain enough information
+   (e.g. the resolved _operand_ and _option_ values
+   that the function was called with)
+   to be used as inputs to subsequent function calls.
+   For example, an implementation in a typed programming language
+   MAY define an interface that custom functions implement.
+   Such an interface SHOULD define an implementation-specific
+   argument type `T` and return type `U` for custom formatting functions
+   such that `U` can be coerced to `T` without loss of information.
+   The type `U`
+   (or a type that `U` can be coerced to without loss of information)
+   SHOULD also be the input type of custom selector functions.
+
+> [!NOTE]
+> In the Tech Preview, the spec leaves the behavior of the previous
+> example implementation-dependent. Supposing that
+> the external input variable `n` is bound to the string `"1"`,
+> and that the implementation formats to a string,
+> the formatted result of the following message:
+>
+> ```
+> .input {$n :number minIntegerDigits=3}
+> .local {$n1 :number maxFractionDigits=3}
+> {{$n1}}
+> ```
+>
+> is implementation-dependent.
+> Depending on whether the options are preserved across
+> the two calls to `:number`, a conformant implementation
+> could produce either "001.000" or "1.000"


I would leave all of this out.

Agreed. This is too specific.

We can really say very little (at this point) about what "resolved value" means. As far as I can tell, it means "value of a variable derived from an expression". The form of that variable is as determined by the implementation of the function.

We cannot require any particular internal structure for the RV, just how it behaves.

1. If the RV is derived from an expression with a selection function X, it can match literal values (e.g. :number can match literals 0, 1, one, ...), producing a comparable value (aka relative weight).

2. If the RV is derived from an expression with a formatting function X, it can produce a formatted string or "parts".

3. Another function Y can use the RV as an operand, or as an option value. In these cases, it becomes clear that we want Y to be able to access information in RV. Exactly what that information is will depend on the expression that RV was derived from.

For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is. That is something to examine in the Tech Preview period. For example, for Stas's case, the RV might be able to supply the gender of an RV representing a noun. I think part of the function registry needs to specify what information a variable derived from an expression using a function can supply, and (at least logically) how to access that. In some implementations, I could see having API so that $bar = {$foo :funct1 gender=$fii case=$fii} (logically) results in an internal call to $fii.get("gender") and a call to $fii.get("case").

For each function, the function registry needs to specify how RVs deriving from that function behave in #1, #2, and #3.
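The `$fii.get("gender")` idea could look roughly like the following Python sketch. It assumes a hypothetical `ResolvedValue` class with a `get` method and a hypothetical `funct1`; the property names and sample data are made up for illustration and nothing here is defined by the spec or the registry.

```python
class ResolvedValue:
    """Hypothetical resolved value that can answer questions about itself."""

    def __init__(self, value, properties=None):
        self.value = value
        self._properties = dict(properties or {})

    def get(self, name, default=None):
        """Return a named property (e.g. gender or case), if this value has one."""
        return self._properties.get(name, default)


def funct1(operand: ResolvedValue, **options) -> ResolvedValue:
    """Hypothetical function: each option reads the same-named property from its value."""
    derived = {name: rv.get(name) for name, rv in options.items()}
    return ResolvedValue(operand.value, derived)


foo = ResolvedValue("Haus")
fii = ResolvedValue("Garten", {"gender": "masculine", "case": "accusative"})
# Mirrors $bar = {$foo :funct1 gender=$fii case=$fii}
bar = funct1(foo, gender=fii, case=fii)
```

Here the function never inspects the internal structure of `fii`; it only asks for named properties, which is the kind of contract the registry would have to describe.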

> For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is.

I think this is a good way of putting it; as it is, the spec doesn't say anything about how to go from the return value of a function (as implemented in an underlying programming language) to an expression (in MessageFormat). If there's going to be a custom function interface at all, I don't know how to not specify that.

aphillips · 2024-02-27T04:50:51Z

spec/formatting.md

+> Feedback from users and implementers is desired
+> about whether to require one interpretation or the other
+> in the spec.


And I would highlight this...

Suggested change

-   > Feedback from users and implementers is desired
-   > about whether to require one interpretation or the other
-   > in the spec.
+   > [!NOTE]
+   > During the Technical Preview, feedback on how the registry
+   > describes how _functions_ inherit resolved values and _options_
+   > and what requirements this specification should impose
+   > are highly desired.

I'm working on a revised version of this, but note that the word "inherit" should be completely off-limits since it connotes object-oriented inheritance, which would just confuse the issue.

eemeli · 2024-02-27T07:15:05Z

spec/formatting.md

+   In addition, selector functions compose with formatting functions
+   in the sense that a selector function's _operand_
+   may be the output of any formatting function.


As I mention in #686 (comment), we really should avoid "formatting function" and "selector function" as terms here, given that the resolved value of a function can theoretically be used for both.

This also applies to the next paragraph.

eemeli

I agree with @aphillips's points above.

stasm

In my mind, this PR sufficiently clarifies the expectations of the resolution mechanism to be useful to implementors and to allow early adopters to experiment with function composition.

Note that there's also a mention of resolved values in lines 106-118, similar to the wording in line 223:

message-format-wg/spec/formatting.md

Lines 106 to 118 in 9caacb6

    
    The form that resolved values take is implementation-dependent,
    and different implementations MAY choose to perform different levels of resolution.
    > For example, the resolved value of the _expression_ `{|0.40| :number style=percent}`
    > could be an object such as
    >
    > ```
    > { value: Number('0.40'),
    >   formatter: NumberFormat(locale, { style: 'percent' }) }
    > ```
    >
    > Alternatively, it could be an instance of an ICU4J `FormattedNumber`,
    > or some other locally appropriate value.

stasm · 2024-02-27T09:09:17Z

spec/formatting.md

+   Thus, the output of one _function_ is potentially the _operand_
+   of another _function_. In other words, formatting functions
+   compose with each other.
+   For example, in
+   ```
+   .input {$n :number minIntegerDigits=3}
+   .local {$n1 :number maxFractionDigits=3}
+   ```
+   the second call to `:number` composes with the first call.


I don't mind using the word "compose" here, but I'm also OK with dropping it and replacing with a description of the observed behavior, like @aphillips proposed.

I would, however, edit @aphillips's suggestion slightly:

-Thus, the _operand_ for one _function_ might be the resolved value
-of another _function_.
+Thus, the resolved value of one _function_ might be the _operand_
+or an _option_ value for another _function_.
 Further, the _options_ for one _expression_ might affect the operation of another.

This does two things:

1. The subject of the sentence is the resolved value, similar to the second sentence.

2. It's not only operands that can be resolved values of other expressions; option values can as well, as per our syntax and data model.

stasm · 2024-02-27T09:18:42Z

spec/formatting.md

+   Implementations that provide a means for defining custom functions
+   SHOULD provide a means for those functions to return values
+   that contain enough information
+   (e.g. the resolved _operand_ and _option_ values
+   that the function was called with)
+   to be used as inputs to subsequent function calls.


This is the key part and I wouldn't want to drop it, as other reviewers suggested. Furthermore, I think we need to say something somewhere about resolved values. Otherwise, @aphillips's example from his suggestion above ends up being the only place in the spec where we implicitly require that resolved values carry some extra information. I'd prefer to be explicit.

But we need to avoid being too specific about how it works. "return" has a specific meaning and I don't necessarily think it's a good idea to think of this as a function call's return value.

Our language is a declarative language and we call the stuff in an expression an "annotation" for a reason. Perhaps:

Suggested change

-   Implementations that provide a means for defining custom functions
-   SHOULD provide a means for those functions to return values
-   that contain enough information
-   (e.g. the resolved _operand_ and _option_ values
-   that the function was called with)
-   to be used as inputs to subsequent function calls.
+   When resolving the value of an _operand_ or other variable
+   (such as the value of an _option_)
+   implementations SHOULD provide interfaces so that _annotation_
+   applied in statements can accompany the value where appropriate.
+   Implementations of _functions_ SHOULD define whether they change the
+   value of the _operand_ in any way.
+   Implementations of _functions_ SHOULD define whether the value of
+   each _option_ is transitive or local.

I mention "statements" here because .match can be where the annotation is applied (not just .local or .input)

Some examples might help here. Here might be an example of non-transitive options (we might say that field options are non-transitive in the spec):

    .input {$d :datetime weekday=short month=medium day=numeric}
    .local $d1 = {$d :datetime hour=|2-digit| minute=numeric}
    {{The transaction was on {$d} at {$d1}.}}

Here's a similar example:

    .input {$d :datetime timeZone=|Europe/Paris|}
    .local $date = {$d :datetime dateStyle=short}
    .local $time = {$date :datetime timeStyle=short}
    {{What does {$date} and {$time} print?}}

I think it is less surprising if $time forgets the earlier style annotation but not the time zone.
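One way to picture that behavior is a per-option declaration of what carries over. The Python sketch below is purely hypothetical: the `PRESERVED` set and `datetime_options` helper are invented for illustration, and neither the spec nor the registry currently offers a way to declare this.

```python
# Hypothetical declaration: which :datetime options carry over to later calls.
PRESERVED = {"timeZone"}


def datetime_options(earlier: dict, current: dict) -> dict:
    """Keep only the preserved earlier options, then apply the current ones."""
    carried = {k: v for k, v in earlier.items() if k in PRESERVED}
    return {**carried, **current}


d_opts = datetime_options({}, {"timeZone": "Europe/Paris"})      # .input {$d ...}
date_opts = datetime_options(d_opts, {"dateStyle": "short"})     # .local $date
time_opts = datetime_options(date_opts, {"timeStyle": "short"})  # .local $time
```

Under this assumption, the options in effect for `$time` keep the time zone but drop `dateStyle`.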

We don't currently define any functions that change the value of an operand, but we certainly might do:

    .local $regular = {|Addison| :string}
    .local $shouted = {$regular :transform to=uppercase}
    .match {$shouted}
    ADDISON {{... is selected... }}
    * {{ ... }}

I see your point about "calling" and "returning".

> (...) so that annotation applied in statements can accompany the value where appropriate.

This sounds OK, although I'm now not sure about the exact meaning of "the value" here. I realize that this was the whole point of @catamorphism's opening the other PR...

Btw. I think expressions would be more appropriate than statements — it's possible to annotate inside placeholders, too.

> Implementations of functions SHOULD define whether they change the
> value of the operand in any way.
> Implementations of functions SHOULD define whether the value of
> each option is transitive or local.

Maybe move this part to the note about seeking feedback? We don't really know yet what defining these constraints and requirements should look like.

Btw. I think expressions would be more appropriate than statements — it's possible to annotate inside placeholders, too.

Yes, but annotations in placeholders are terminal.

I agree about "the value"

I would also like to avoid using the word "transitive", because transitivity is a property of a mathematical relation and we haven't defined any such relations in the spec.

Re: this example:

    .input {$d :datetime weekday=short month=medium day=numeric}
    .local $d1 = {$d :datetime hour=|2-digit| minute=numeric}
    {{The transaction was on {$d} at {$d1}.}}

It's not obvious to me that the options from the first :datetime call shouldn't be preserved -- should $d1 be a formatted date with the union of all the options shown in both :datetime annotations? Or just the hour and minute options and defaults for the others? To me there's no "obvious" answer, though maybe it's obvious to people who have more experience with message formatting.

It's certainly worth thinking about what the example means, but I'm not sure if it's the best example if the goal is to show something where certain options obviously shouldn't be preserved.

"Change the value of the operand" should also be avoided, since (this being a purely functional language), functions never change the value of their operands.

On the whole, I understand the suggestion and think it's getting at something useful, but I don't see how it makes sense in the current framework of the spec, in which there is no data model for runtime values.

I think the first sentence ("...implementations SHOULD provide interfaces so that annotation applied in statements can accompany the value where appropriate.") is too vague, but I'm also not sure how to make it less vague. I would argue that the text in my revised commit is better because it's focused on functions, and the boundary between "inside the formatter" and "in a function implementation" is the one place where information is likely to get "lost".

I don't get the concept of "changing the value of the operand" even if I mentally replace that with "mapping the operand onto a new operand" or something like that. In your :transform example, I would understand :transform as returning something like this (if I borrow some of the machinery from #645 but simplify it for presentation):

    AnnotatedFormattableValue {
      source: AnnotatedFormattableValue {
        source: Formattable("Addison"),
        value: FormattedValue("Addison")
      },
      formatter: "transform",
      options: { "to": "uppercase" },
      value: FormattedValue("ADDISON")
    }

As with functions in other examples, it passes the same operand through (a wrapped thing ultimately representing the string "Addison"), and the transformed operand (the string "ADDISON") is the "formatted value" thing inside the structure representing the return value.

Obviously this isn't necessarily going to be the data model for runtime values, but still, it's not obvious to me that the result is just "ADDISON" with a bunch of options, rather than something that encapsulates the input "Addison", the output "ADDISON", and the options.
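For readers who prefer something executable, the nested structure above can be approximated in Python. This is a loose sketch, not the #645 data model: `Annotated`, `string`, and `transform` are hypothetical stand-ins, and unlike the structure above the inner value here also records a formatter name.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass(frozen=True)
class Annotated:
    """Hypothetical stand-in for the AnnotatedFormattableValue structure."""
    source: Union[str, "Annotated"]  # raw input or the previous annotated value
    value: str                       # the formatted value at this step
    formatter: Optional[str] = None
    options: dict = field(default_factory=dict)


def string(raw: str) -> Annotated:
    """Wrap a raw string, as {|Addison| :string} would."""
    return Annotated(source=raw, value=raw, formatter="string")


def transform(operand: Annotated, to: str) -> Annotated:
    """Wrap the operand rather than replacing it; only the new value is uppercased."""
    if to != "uppercase":
        raise ValueError(f"unsupported transform: {to}")
    return Annotated(operand, operand.value.upper(), "transform", {"to": to})


regular = string("Addison")
shouted = transform(regular, to="uppercase")
```

The result still encapsulates the original input through `shouted.source`, alongside the new value and the options.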

Finally, the sentence about defining options that are or aren't preserved (I would use "preserved" rather than "transitive" as I already said) is important, but we don't have a way in the registry to declare that info, currently.

stasm · 2024-02-27T09:21:43Z

spec/formatting.md

+   .input {$n :number minIntegerDigits=3}
+   .local {$n1 :number maxFractionDigits=3}


As @aphillips noticed, the .local needs a name. Making this suggestion to make sure we don't miss it, in case the other review comment isn't committed.

Suggested change

-   .input {$n :number minIntegerDigits=3}
-   .local {$n1 :number maxFractionDigits=3}
+   .input {$n :number minIntegerDigits=3}
+   .local $x = {$n1 :number maxFractionDigits=3}

stasm · 2024-02-27T09:24:57Z

spec/formatting.md

+> In the Tech Preview, the spec leaves the behavior of the previous
+> example implementation-dependent. Supposing that


Isn't this PR still in scope of the Tech Preview? Or do you mean that it's implementation-dependent because of the SHOULD?

This PR's text is part of the Tech Preview.

We need to make clear that the 'return type' is not the 'thing bound to a variable'. The 'thing bound to a variable' also includes options.

That's one of the questions, though. If the contract with function implementations is that they are responsible for returning a thing containing all the options they want to preserve, then whatever a function returns is the thing bound to a variable (modulo possible lazy evaluation). If function implementations have no such responsibility, then yes, the formatter has to do additional processing to transform the "thing returned by a function" into the "thing bound to a variable".

macchiati · 2024-02-27T16:34:28Z

spec/formatting.md

+   In addition, selector functions compose with formatting functions
+   in the sense that a selector function's _operand_
+   may be the output of any formatting function.
+
+   Implementations SHOULD provide a means for formatting functions
+   to compose with each other
+   and for formatting functions to compose with selector functions.
+   Implementations that provide a means for defining custom functions
+   SHOULD provide a means for those functions to return values
+   that contain enough information
+   (e.g. the resolved _operand_ and _option_ values
+   that the function was called with)
+   to be used as inputs to subsequent function calls.
+   For example, an implementation in a typed programming language
+   MAY define an interface that custom functions implement.
+   Such an interface SHOULD define an implementation-specific
+   argument type `T` and return type `U` for custom formatting functions
+   such that `U` can be coerced to `T` without loss of information.
+   The type `U`
+   (or a type that `U` can be coerced to without loss of information)
+   SHOULD also be the input type of custom selector functions.
+
+> [!NOTE]
+> In the Tech Preview, the spec leaves the behavior of the previous
+> example implementation-dependent. Supposing that
+> the external input variable `n` is bound to the string `"1"`,
+> and that the implementation formats to a string,
+> the formatted result of the following message:
+>
+> ```
+> .input {$n :number minIntegerDigits=3}
+> .local {$n1 :number maxFractionDigits=3}
+> {{$n1}}
+> ```
+>
+> is implementation-dependent.
+> Depending on whether the options are preserved across
+> the two calls to `:number`, a conformant implementation
+> could produce either "001.000" or "1.000"


Agreed. This is too specific.

We can really say very little (at this point) about what "resolved value" means. As far as I can tell, it means "value of a variable derived from an expression". The form of that variable is as determined by the implementation of the function.

We cannot require any particular internal structure for the RV, just how it behaves.

1. If the RV is derived from an expression with a selection function X, it can match literal values (e.g. :number can match literals 0, 1, one, ...), producing a comparable value (aka relative weight).

2. If the RV is derived from an expression with a formatting function X, it can produce a formatted string or "parts".

3. Another function Y can use the RV as an operand, or as an option value. In these cases, it becomes clear that we want Y to be able to access information in the RV. Exactly what that information is will depend on the expression that the RV was derived from.

For #3, we have not delved into what the specification for outbound communication from an RV to an expression (as operand or option value) is. That is something to examine in the Tech Preview period. For example, for Stas's case, the RV might be able to supply the gender of an RV representing a noun. I think part of the function registry needs to specify what information a variable derived from an expression using a function can supply, and (at least logically) how to access that. In some implementations, I could see having an API so that $bar = {$foo :funct1 gender=$fii case=$fii} (logically) results in an internal call to $fii.get("gender") and a call to $fii.get("case").

For each function, the function registry needs to specify how RVs deriving from that function behave in #1, #2, and #3.
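
As a rough illustration of the three behaviors listed above (matching, formatting, and supplying information to another function), here is a minimal Python sketch. The `ResolvedValue` class, its method names, and the sample noun are all hypothetical; neither the spec nor the function registry defines such an API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ResolvedValue:
    """Hypothetical resolved value (RV); not an API defined by the spec."""
    value: Any
    options: dict = field(default_factory=dict)

    def match(self, key: str) -> bool:
        # Behavior 1 (selection): compare the RV against a literal key.
        return str(self.value) == key

    def format(self) -> str:
        # Behavior 2 (formatting): produce a string for output.
        return str(self.value)

    def get(self, name: str) -> Optional[Any]:
        # Behavior 3: expose information to a function that receives
        # this RV as its operand or as an option value.
        return self.options.get(name)


# A noun whose RV carries grammatical information (illustrative data only).
noun = ResolvedValue("chaise", {"gender": "feminine", "case": "nominative"})
gender = noun.get("gender")
```

A function receiving `noun` as an option value could then call `noun.get("gender")` rather than needing to know how the value was produced.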

macchiati · 2024-02-27T18:14:27Z

spec/formatting.md

+   Implementations SHOULD provide a means for formatting functions
+   to compose with each other
+   and for formatting functions to compose with selector functions.
+   Implementations that provide a means for defining custom functions


I agree with the sense of this. We should also note somewhere that any two functions are not required to meaningfully compose; there is no requirement or expectation that the following make a meaningful composition:

.input {$date :datetime}
.local $person = {$date :x:personname}

+1 to this. I'm not sure if I was able to explain this in yesterday's call, but I definitely agree that there should be no requirement for any two functions to meaningfully compose, as you've nicely put it. This is why I've been using the term cooperative composition (although I'm happy to call it something else), i.e. the functions must be aware of each other's interface in order to meaningfully compose.

+100 although there are some different things going on here. In @macchiati's example, the types don't match. The .local declaration should emit an Invalid Expression error because $date isn't of a supported type for :x:personname (presumably).

In other cases, the types can match but the annotations might not be supported:

.input {$date :date style=long}
.local $foo = {$date :time} // not style=long because style means "dateStyle"

For now, we should basically say something like "functions can decide what operands to support" and "functions can decide what functions or function options to support". We permit "composition" without requiring it or prohibiting it.

In the default registry we can add (in the TP period) some guidance for date/time/datetime and number functions as a guide.
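
The idea that "functions can decide what operands to support" and "what functions or function options to support" could be sketched as follows. This is only an illustration under stated assumptions: the `Resolved` type, the `source` field, the function names, and the `InvalidExpressionError` class are hypothetical stand-ins, not part of any real implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


class InvalidExpressionError(Exception):
    """Hypothetical stand-in for the error signaled on an unsupported operand."""


@dataclass
class Resolved:
    value: Any
    source: str = ""  # name of the function that produced this value, if any
    options: dict = field(default_factory=dict)


def date_fn(operand: Resolved, options: dict) -> Resolved:
    if not isinstance(operand.value, datetime):
        raise InvalidExpressionError(":date needs a date/time operand")
    return Resolved(operand.value, ":date", dict(options))


def time_fn(operand: Resolved, options: dict) -> Resolved:
    if not isinstance(operand.value, datetime):
        raise InvalidExpressionError(":time needs a date/time operand")
    # `style` set by :date means "dateStyle", so :time chooses not to adopt it.
    inherited = {} if operand.source == ":date" else dict(operand.options)
    return Resolved(operand.value, ":time", {**inherited, **options})


def personname_fn(operand: Resolved, options: dict) -> Resolved:
    # This function only understands name records, here modeled as dicts.
    if not isinstance(operand.value, dict):
        raise InvalidExpressionError(":x:personname needs a name operand")
    return Resolved(operand.value, ":x:personname", dict(options))


date = date_fn(Resolved(datetime(2024, 2, 27)), {"style": "long"})
foo = time_fn(date, {})  # types match, but style=long is not carried over
```

Here `time_fn` accepts the operand but ignores the inherited option, while `personname_fn(date, {})` raises an error because the types do not match. Both outcomes are decisions made by the individual function.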

macchiati · 2024-02-27T18:18:46Z

spec/formatting.md

+> In the Tech Preview, the spec leaves the behavior of the previous
+> example implementation-dependent. Supposing that


We need to make clear that the 'return type' is not the 'thing bound to a variable'. The 'thing bound to a variable' also includes options.

macchiati · 2024-02-27T18:46:34Z

spec/formatting.md

   Such an interface SHOULD define an implementation-specific
-   argument type `T` and return type `U` for custom formatting functions
+   argument type `T` and return type `U`
+   for implementations of formatting functions
   such that `U` can be coerced to `T` without loss of information.


Look at the following mutating function. Do we expect to be able to extract

.local $foo = {$date :extractDay calendar=gregorian}
.local $fii = {$foo :extractMonth calendar=gregorian}

Would the above text mean that :extractDay SHOULD have a "return type" such that it preserves the month value from $date?

No; to be concrete, I'll use a simplified version of the C++ implementation.

:extractDay and :extractMonth would be implemented as classes that provide a format() method. The type of that method (ignoring options and context, for simplicity) is:

FormattedPlaceholder format(FormattedPlaceholder&& argument);

A particular instance of the type FormattedPlaceholder could contain any set of options, or none.

Sorry for talking in implementation terms, but I'm not sure how else to say it, given that my goal here is to suggest that implementors not do the "obvious" thing, which would be to define something like a FormatterInput and FormatterOutput type, which are not coercible to each other, and define the interface like:

FormatterOutput format(FormatterInput&& argument);

(I would expect every implementation to have some sort of interface between the formatter and calls to custom functions (I say "custom" because built-in functions could be handled inside the formatter), so I think it's meaningful to refer to it in this note. In a unityped implementation language like JavaScript, there's less of a hazard since there's only one type, trivially guaranteeing the property stated in the note.)
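
To make the point about a single argument/result type concrete, here is a small Python sketch, not the ICU implementation. The `FormattedPlaceholder` class below is a hypothetical stand-in that only borrows the name used above, and the `Number` class is a toy formatter that understands just two options.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class FormattedPlaceholder:
    """Hypothetical stand-in: one type serves as both argument and result."""
    source: str                                   # the original operand
    options: dict = field(default_factory=dict)   # options seen so far
    text: str = ""                                # formatted output, if any


class Formatter(Protocol):
    def format(self, argument: FormattedPlaceholder,
               options: dict) -> FormattedPlaceholder: ...


class Number:
    """Toy :number; rounds to exactly minimumFractionDigits digits."""

    def format(self, argument: FormattedPlaceholder,
               options: dict) -> FormattedPlaceholder:
        # Options from an earlier call are merged with the new ones.
        merged = {**argument.options, **options}
        int_digits = int(merged.get("minimumIntegerDigits", 1))
        frac_digits = int(merged.get("minimumFractionDigits", 0))
        raw = f"{float(argument.source):.{frac_digits}f}"
        whole, dot, frac = raw.partition(".")
        text = whole.zfill(int_digits) + dot + frac
        return FormattedPlaceholder(argument.source, merged, text)


# .input {$n :number minimumFractionDigits=1}
n = Number().format(FormattedPlaceholder("1"), {"minimumFractionDigits": 1})
# .local $num = {$n :number minimumIntegerDigits=3}
num = Number().format(n, {"minimumIntegerDigits": 3})
# What the second call would see if only the bare operand were passed on:
lossy = Number().format(FormattedPlaceholder(n.source),
                        {"minimumIntegerDigits": 3})
```

Because the result of the first call is a valid argument for the second, `num.text` reflects both sets of options, whereas `lossy.text` reflects only the second. With separate, non-coercible input and output types, the second call could not accept `n` at all.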

This is still in an example and must not be normative. Even if it weren't an example, it's way too specific to put in the specification.

If we want to provide guidance to users/implementers, instead of pouring stuff into the spec, we should write some user guide material.

I wasn't suggesting the example be added to the spec, but rather using it to illustrate that

such that U can be coerced to T without loss of information

is not something that should be in the spec.

catamorphism · 2024-02-27T18:51:34Z

In the hopes of making the discussion easier to follow, I'll summarize the feedback and how I did or didn't address it in 5803dd8:

- Fixed the syntax of the example
- Eliminated "compose" and "composability" per @aphillips
- In language referring to return values, changed "function" to "function implementation" to make it clear that we're speaking in object language rather than metalanguage. I don't know how to express these properties in metalanguage. I don't think this imposes any requirements on the internal structure of resolved values in the implementation. It does make recommendations about the interface between custom functions and the formatter, but I don't see how we can avoid specifying that interface.
- For now, I kept in the passage that @aphillips suggested dropping, but rewrote it to refer to function implementations rather than functions.
- Generalized references to formatting/selection functions to just "functions" (where possible), per @eemeli
- Added text saying that the result of one call can also be the option value of another call, per @stasm
- Added text reiterating that function implementations are free to signal errors for any unhandled input, per @macchiati's comment about "functions are not required to meaningfully compose" (unfortunately, not using "compose" means I can't say it quite so elegantly).

There's one more comment from @aphillips that I didn't address yet; will do that in another commit (edit: not a commit, but a comment instead).

catamorphism · 2024-02-27T19:29:07Z

As a meta-comment, my objective with this PR is to help implementers not paint themselves into a corner, so while it would fit more naturally with the spec to use metalanguage (talking about annotations, expressions, resolved values, etc.) rather than object language, in the absence of the machinery that would be needed for that, I used the escape hatch of object language (e.g. "function implementations") instead.

eemeli

Overall this is good, but I've left one "Blocking" comment below that needs to be addressed before merging this. If that is done, please don't hesitate to dismiss this review when merging, as I might not be awake at end-of-day Pacific Time.

aphillips

(as contributor)

I really believe that less is more here. We're getting into the weeds of how "functions" are "called" and what their "inputs" and "outputs" are. This is imperative programming thinking.

In my opinion, we should focus on calling out "here be dragons" and following up with carefully considered text across the spec (or with user guide like material)

aphillips · 2024-02-27T20:10:18Z

spec/formatting.md

+   the output of the first call to `:number`
+   is the input of the second call to `:number`.


This is saying outputs and inputs, which, as I noted before, isn't correct.

I tried to address this in 8a5f589

spec/formatting.md

aphillips · 2024-02-27T20:15:57Z

spec/formatting.md

   Such an interface SHOULD define an implementation-specific
-   argument type `T` and return type `U` for custom formatting functions
+   argument type `T` and return type `U`
+   for implementations of formatting functions
   such that `U` can be coerced to `T` without loss of information.


This is still in an example and must not be normative. Even if it weren't an example, it's way too specific to put in the specification.

If we want to provide guidance to users/implementers, instead of pouring stuff into the spec, we should write some user guide material.

catamorphism · 2024-02-27T20:35:24Z

(as contributor)

I really believe that less is more here. We're getting into the weeds of how "functions" are "called" and what their "inputs" and "outputs" are. This is imperative programming thinking.

In my opinion, we should focus on calling out "here be dragons" and following up with carefully considered text across the spec (or with user guide like material)

I don't quite know how to specify a foreign function interface without talking about inputs and outputs (or arguments and return values). This is not specific to imperative languages; all functional languages have a concept of function application, arguments, and return values.

eemeli

I believe that my concerns about this PR have been sufficiently addressed.

aphillips · 2024-02-27T20:41:30Z

(chair hat)

I have two approvals, which is what I need to merge this. I'm going to make a last call in this comment, ignoring my own review. Any objection to merging this?

macchiati · 2024-02-27T21:05:09Z

spec/formatting.md

@@ -222,6 +222,71 @@ the following steps are taken:

   The form that resolved _operand_ and _option_ values take is implementation-defined.

+   A _declaration_ binds the resolved value of an _expression_


Not asking for any change in the tech preview, but this will definitely need to be revised afterwards.
The whole notion of a 'resolved value' is very muddy, unless it literally means "what is bound to a variable by an expression declaration", which means it also includes selection/formatting options and their values whenever those are carried over.

#645 is an attempt to address this. (But it could go even further.)

spec/formatting.md

stasm

LGTM after the most recent updates.

I acknowledge and agree that we need to spend more time and deliberation specifying this fully for 46. That said, I do think that the current wording does a good job of giving guidance to implementors and of stating that it's a work-in-progress.

Add note to "Function Resolution" section about function argument and result types

e3f8a3b

catamorphism added the Agenda+ label Feb 23, 2024

catamorphism requested review from aphillips, stasm, eemeli and mihnita February 23, 2024 00:35

eemeli reviewed Feb 23, 2024

aphillips requested changes Feb 23, 2024

aphillips added the normative and formatting labels Feb 23, 2024

aphillips added the Action-Item and fast-track labels and removed the Agenda+ label Feb 26, 2024

catamorphism added 3 commits February 26, 2024 12:21

Rephrase the 'result of a function call' sentence

eb397b1

Add specific text on selector functions

2c7aaeb

aphillips requested a review from eemeli February 27, 2024 04:37

aphillips requested changes Feb 27, 2024

eemeli reviewed Feb 27, 2024

stasm approved these changes Feb 27, 2024

Fix .local syntax in example

9353c1f

macchiati reviewed Feb 27, 2024

Address various review comments

5803dd8

macchiati reviewed Feb 27, 2024

gibson042 reviewed Feb 27, 2024

Update spec/formatting.md

0f41bfc

Co-authored-by: Richard Gibson <[email protected]>

eemeli requested changes Feb 27, 2024

Update spec/formatting.md

6e7d1c2

Co-authored-by: Eemeli Aro <[email protected]>

aphillips reviewed Feb 27, 2024

catamorphism and others added 2 commits February 27, 2024 12:23

Add the words "a representation of" to allow implementations...

23c032a

...to bundle their results with a "parsed" version of their input

Update spec/formatting.md

632f9a9

Co-authored-by: Eemeli Aro <[email protected]>

eemeli approved these changes Feb 27, 2024

catamorphism and others added 3 commits February 27, 2024 12:42

Avoid use of 'input'/'output'

8a5f589

Update spec/formatting.md

723941f

Co-authored-by: Addison Phillips <[email protected]>

Update spec/formatting.md

bd7bf09

Co-authored-by: Addison Phillips <[email protected]>

macchiati reviewed Feb 27, 2024

macchiati reviewed Feb 27, 2024

catamorphism and others added 2 commits February 27, 2024 12:58

Update spec/formatting.md

39000d2

Co-authored-by: Mark Davis <[email protected]>

Update spec/formatting.md

5f664cf

Co-authored-by: Mark Davis <[email protected]>

macchiati reviewed Feb 27, 2024

Update spec/formatting.md

6a3a965

Co-authored-by: Mark Davis <[email protected]>

macchiati reviewed Feb 27, 2024

macchiati requested changes Feb 27, 2024

stasm approved these changes Feb 27, 2024

Update spec/formatting.md

7ccec7c

Co-authored-by: Mark Davis <[email protected]>

eemeli requested a review from macchiati February 27, 2024 22:04

macchiati approved these changes Feb 27, 2024

aphillips merged commit 56fcc84 into unicode-org:main Feb 28, 2024
1 check passed

catamorphism mentioned this pull request Mar 13, 2024

Spec test requires different functions to compose with each other #726

Closed

		Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
		that is interconvertible with the structure for the result of the _function_.

-   Thus, formatting functions SHOULD use a structure for the resolved _operand_ value
-   that is interconvertible with the structure for the result of the _function_.
+Thus, implementations SHOULD provide a means for _functions_ to expose
+the resolved value of their _operand_
+and _functions_ SHOULD populate that mechanism
+with a data structure or type consistent with the set of implementation-defined
+types that they would support as input.
+> For example, suppose the value of the _variable_ `$n` were the string `1`.
+> The resolved value of the _operand_ assigned to `$num` in the example
+> below would be a numeric type (such as an `int` or `BigInteger` in Java).
+> ```
+> .input {$n :number}
+> .local $num = {$n :integer}
+> ```

		Since the result of a function call can be bound to a _variable_,
		the output of one _function_ may be the input of another _function_.

-   Thus, the output of one _function_ is potentially the _operand_
-   of another _function_. In other words, formatting functions
-   compose with each other.
-   For example, in
-   ```
-   .input {$n :number minIntegerDigits=3}
-   .local {$n1 :number maxFractionDigits=3}
-   ```
-   the second call to `:number` composes with the first call.
+   Thus, the _operand_ for one _function_ might be the resolved value
+   of another _function_.
+   Further, the _options_ for one _expression_ might affect the operation
+   of another.
+   > For example, if the value of the variable `n` were `1`:
+   > ```
+   > .input {$n :number minimumFractionDigits=1}
+   > .local $num = {$n :number minimumIntegerDigits=3}
+   > .match {$num}
+   > *   {{Prints 001.0 for {$num}}}
+   > ```
+   > ... because the _options_ for the `.input` and `.local` are both applied to the value
+   > for the purposes of both formatting and selection.
+   > (Note that in English, fractional values match the plural rule `other`)

	The form that resolved values take is implementation-dependent,
	and different implementations MAY choose to perform different levels of resolution.

	> For example, the resolved value of the _expression_ `{\|0.40\| :number style=percent}`
	> could be an object such as
	>
	> ```
	> { value: Number('0.40'),
	> formatter: NumberFormat(locale, { style: 'percent' }) }
	> ```
	>
	> Alternatively, it could be an instance of an ICU4J `FormattedNumber`,
	> or some other locally appropriate value.

-   Implementations that provide a means for defining custom functions
-   SHOULD provide a means for those functions to return values
-   that contain enough information
-   (e.g. the resolved _operand_ and _option_ values
-   that the function was called with)
-   to be used as inputs to subsequent function calls.
+When resolving the value of an _operand_ or other variable
+(such as the value of an _option_),
+implementations SHOULD provide interfaces so that any _annotation_
+applied in statements can accompany the value where appropriate.
+Implementations of _functions_ SHOULD define whether they change the
+value of the _operand_ in any way.
+Implementations of _functions_ SHOULD define whether the value of
+each _option_ is transitive or local.
