[DESIGN] dataflow for composability (#515) #645

catamorphism · 2024-02-14T01:43:03Z

This proposed design doc addresses an issue titled "inspecting formattable values", which is really about dataflow through the formatter and structuring it to make function calls compose with each other.

Another PR, #646, shows how the spec would change if this design doc were accepted. I changed the design doc after making the changes to the formatting spec in #646 and haven't had time to update the formatting spec accordingly (yet), so #646 uses some different terms. It should still give a sense of what the spec would look like if this design doc were accepted.

This is not a finished design doc by any means, but I'm hoping to get a thumbs-down or thumbs-up on the idea before I polish it any further.

eemeli · 2024-02-14T08:44:02Z

I am not comfortable attempting to review this design doc and the accompanying spec changes during this week. The changes they propose are extensive, but at the same time it's not clear to me if they actually change any externally observable behaviour.

As in #515, the doc starts from a premise that

Custom formatting functions should be able to inspect the raw value and formatting options of their arguments.

but does not discuss why access to specifically a "raw" value (and options?) is required or beneficial, as opposed to a value and options. I am concerned that such a starting point limits the expressibility of messages such as:

.input {$names :list}
.local $head = {$names :slice start=0 end=2}
.local $tail = {$names :slice start=2}
.match {$head :count} {$tail :count}
0   *   {{No-one liked this}}
one *   {{{$head} liked this}}
*   0   {{{$head} liked this}}
*   one {{{$head :list type=unit} and {$tail :count} other person liked this}}
*   *   {{{$head :list type=unit} and {$tail :count} other people liked this}}

where a complex input value (a list of names) is used to construct other complex values, on which further operations (determining the item count and adding list formatting options) result in selection and formatting.

It's entirely possible that the above works fine with the proposed text, but this current week does not provide sufficient time for making that determination for this particular case, or for other complex messages.

I'm also a little surprised by the number of new interfaces that are introduced. For the JS implementation, I found it sufficient to have a single MessageValue interface representing what the current spec text refers to as a "resolved value".

aphillips

(chair hat off)

I tend to agree with @eemeli. Actually, I'd go further.

This is great thinking and shows off the fact that you have been writing an implementation. However, I think it's too specific. This would require implementations to write their internals in specific ways (and we'd have to create tests to prove that the implementation had followed the spec). Since what we care about are outputs, our guidance can and should be limited to external appearances.

Most of the guidance here seems targeted at behavior we could specify for the built-in functions (e.g. whether options are transitive between declarations) rather than normative definitions.

Let's discuss today and see where others in the WG land.

aphillips · 2024-02-14T15:19:30Z

exploration/dataflow-composability.md

+But without `:number` being able to access the previously passed options,
+the two fragments won't be equivalent.
+This requires `:number` to return a value that encodes
+the options that were passed in, the value that was passed in,
+and the formatted result;
+not just the formatted result.


Did you consider the counterexample, in which the user does NOT want transitivity?

.local $datetime = {|2024-02-14T01:23:45Z| :datetime}
.local $date = {$datetime :date}
.local $time = {$datetime :time}

.local $a = {|1.234| :number maximumFractionDigits=2 maximumDecimalDigits=0}
.local $b = {$a :integer}
{{This prints 1.23: {$b}.{$a} (Yes, it's perverse)}}

stasm · 2024-02-15T14:51:37Z

@aphillips:

However, I think it's too specific. This would require implementations to write their internals in specific ways (and we'd have to create tests to prove that the implementation had followed the spec).

I actually think this is rather agnostic, which is why it may also seem like it introduces a lot of new interfaces, as @eemeli observed.

I gave @catamorphism feedback on an earlier draft of this PR, and I'd like to have more time to review this iteration, but I won't be able to get to it this week.

I'd love to be able to continue discussing this post-LDML 45. Perhaps this is part of the "implementer's feedback" that we're looking for?

In the meantime and in the 45 timeframe, would it make sense to review #646 looking for changes which would introduce incompatibilities to the current spec? If all #646 does is clarify concepts without changing the observable formatting behavior, then I think it should be acceptable to continue this work for possible inclusion in the final spec?

catamorphism · 2024-02-19T19:00:25Z

I'll plan on reading through the formatting spec to look for anything that would make it impossible to adopt this proposal later. (Busy week for me so I'm not sure when, but I'll do my best.)

macchiati · 2024-03-13T20:23:24Z

exploration/dataflow-composability.md

+- Define the structure passed in as an argument to a custom formatting function.
+- Define the structure that a custom formatting function should return.
+- Maintain the options passed into the callee as a _separate_ argument to the
+  formatter, to avoid confusion. (See Example 4 below.)


I had somewhat different thoughts.

.input {$v1 :f1 o1=a...}
.local $v2 = {$v1 :f2 o1=a o2=b}

f1 == f2. For the same function (or aliases with different options), do a merge on the option map (with the later ones winning). This could be extended to alias functions (the same under the covers, with different option settings).

f1 ≠ f2. For different functions, it gets tricky. I think we need to require it to be specified in the registry.

Example:
.input {$v1 :number o1=a}
.local $v2 = {$v1 :date o1=a o2=b}

The registry could specify that :date can handle a resolved :number expression in the following way.

The value of the :date expression is the number converted to a double value and interpreted as the number of seconds since 1970-01-01T00:00:00Z.

The number options are discarded except for o3=x, which is mapped to o9 in the following way: ...
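The same-function case (f1 == f2) above can be sketched as a plain map merge in which the later expression's options win. This is only an illustration: `merge_options` and the option values are hypothetical, and nothing here is part of the spec or of any implementation.

```python
def merge_options(earlier: dict[str, str], later: dict[str, str]) -> dict[str, str]:
    """Merge two option maps; options set on the later expression win."""
    merged = dict(earlier)   # copy, so the earlier expression's map is not mutated
    merged.update(later)     # later values overwrite earlier ones on key collisions
    return merged


# Hypothetical option maps for `{$v1 :f1 ...}` and `{$v1 :f2 ...}` where f1 == f2.
first = {"o1": "a", "o3": "x"}
second = {"o1": "b", "o2": "b"}
merged = merge_options(first, second)
print(merged)
```

The different-function case (f1 ≠ f2) is not covered by this sketch, since it would need per-function rules of the kind the comment proposes putting in the registry.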

aphillips · 2024-03-13

@macchiati noted:

I had somewhat different thoughts.

My thoughts have evolved here also. I think it isn't the function or isn't just the function that determines whether an option is inherited. I think options should be specified as inheritable (or not).

Consider:

.input {$d :datetime year=numeric month=long day=numeric}
.local $t = {$d :datetime hour=numeric minute=numeric}
{{The event is on {$d} at {$t}.}}

These are the same input variables and the same function. You don't want to inherit skeleton/fields.

Now consider:

.input {$d :datetime dateStyle=long timeZone=$userZone numberingSystem=Latn}
.local $t = {$d :datetime timeStyle=long}
{{The event is on {$d} at {$t}.}}

Again, you don't want to inherit the styles. But you do want to inherit the coerced time zone and the numbering system (because it makes more sense to inherit it).
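One way to picture per-option transitivity is a declared set of options that carry over, with everything else dropped. The sketch below is an assumption made for illustration: the set `CARRIED_OPTIONS`, the helper `carry_over`, and the time zone value are all hypothetical and do not come from the registry.

```python
# Hypothetical: the :datetime options that pass on to a later expression.
CARRIED_OPTIONS = {"timeZone", "numberingSystem"}


def carry_over(previous: dict[str, str], local: dict[str, str]) -> dict[str, str]:
    """Keep only the carried options from the earlier expression, then apply local ones."""
    carried = {k: v for k, v in previous.items() if k in CARRIED_OPTIONS}
    return {**carried, **local}  # local options win if a name appears in both


# Options of `$d` and of `$t` from the second example; the zone value is made up.
d_options = {"dateStyle": "long", "timeZone": "Europe/Helsinki", "numberingSystem": "Latn"}
t_options = carry_over(d_options, {"timeStyle": "long"})
print(t_options)
```

Here `dateStyle` is not carried into `$t`, while the time zone and numbering system are.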

The registry could specify that :date can handle a resolved :number expression in the following way.

The registry can already specify that. And implementations can already do that, as we allow implementation-defined types. The :number options won't matter to :date (etc.) and won't be inherited.

Note that :datetime, :date, and :time are not the same function (f1 ≠ f2) but probably inherit/share some fields when "composed" as in the examples above (replace :datetime with :date and :time for example). The current registry doesn't do this because the potentially shared fields were pushed down into the RGI bucket for 45.

macchiati · 2024-03-13

I think it depends on the example. In the following, I do want to inherit.

.input {$d :datetime year=numeric month=numeric day=numeric}
.local $d2 = {$d :datetime month=long}
{{The event is on {$d} at {$d2}.}}

What I think would be cleaner would be to have options like suppress=date, suppress=time. These cause a host of options to be set to none. Then the following would be very clear as to what is to happen, especially for the poor reader who doesn't want to have to consult the registry for each attribute to find out whether it inherits or not.

.input {$d :datetime year=numeric month=long day=numeric}
.local $t = {$d :datetime suppress=date hour=numeric minute=numeric}
{{The event is on {$d} at {$t}.}}

catamorphism · 2024-03-13

I don't have time right now to comment in depth, but I really want to discourage, again, the use of the word "inherit" here, since options are not related to object-oriented inheritance. Even using it metaphorically is bound to confuse people who aren't in this discussion now but may join later.

I think the right thing to do instead is to define data types that standard functions return, and exhaustively specify the required and optional fields in those data types; if an option doesn't appear there, then it can't be used in composing functions. (And then, custom function writers get a way to define these types for themselves.)

There is no escaping that we're talking about describing dynamic data in a static way, so this is like defining a "data model" for runtime values to make it practical to understand and describe how functions interoperate. In such a model, options are passed as components of a data structure; not inherited.
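As a rough illustration of "options as components of a data structure", a function's return type could declare its fields explicitly, so that a downstream function can read only what is declared. The type `NumberValue`, its field names, and the `integer` helper below are hypothetical and are not taken from the design doc or the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NumberValue:
    """Hypothetical return type of a number-like function."""
    value: float                                   # required field
    maximum_fraction_digits: Optional[int] = None  # optional field
    numbering_system: Optional[str] = None         # optional field


def integer(arg: NumberValue) -> NumberValue:
    """Hypothetical composing function: reads only fields that NumberValue declares."""
    return NumberValue(
        value=float(int(arg.value)),            # truncate toward zero
        maximum_fraction_digits=0,              # set by this function
        numbering_system=arg.numbering_system,  # passed along as a field, not "inherited"
    )


a = NumberValue(value=1.234, maximum_fraction_digits=2, numbering_system="latn")
b = integer(a)
print(b)
```

An option that has no field in the type simply cannot be observed by the next function, which is the property the comment argues for.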

aphillips · 2024-03-14

@catamorphism You're right. The word inherit probably shouldn't be used in the spec.

@macchiati

.local $d2 = {$d :datetime month=long}

Actually, that's ambiguous. It's not clear if your intention for $d2 is to produce January or January 1. From my point of view, if you mess with the format, it's the former (or you would have added the day option). Here's another place skeletons are the friendlier interface.

What I think would be cleaner would be to have options like suppress=date, suppress=time.

This is what the functions :date and :time do 😜

especially for the poor reader who doesn't want to have to consult the registry for each attribute to find out whether it inherits or not.

This is a concern. But here in the design doc we should capture the use cases for options. We should also capture the use cases for operand mutation. For example, given my given name as input for $name:

.input {$name :text-transform operation=uppercase}
.local $foo = {$name :truncate length=5 ellipsis=true}
{{Your name is {$foo}.}}

Does that print "Your name is ADDIS..." or "Your name is Addis..." or what?

eemeli · 2024-03-14

Does that print "Your name is ADDIS..." or "Your name is Addis..." or what?

I think that must depend on the :text-transform and :truncate implementations, and how their authors intend for the values to compose. I think we must allow for each function to make choices when determining their resolved values about the "value" and "options" they present.

Something very similar is achievable even with our most basic built-in functions, and without options:

.local $x = {4.2e1 :number}
{{{$x :string}}}

If :number presents its input directly as its value, that should format as "4.2e1", but if it presents a numeric value, then that should format as "42". I do not think enforcing the first on a spec level makes much sense.

Similarly, a "text-transform" could effectively consume its operation=uppercase option when constructing its value, or consider it a string formatting option. So either of these is conceivable in the resolved value:

Value 'Addison', options { operation: 'uppercase' }

Value 'ADDISON', options {}

As we've discussed in a variety of contexts, not all options are really the same, so I'd say that enabling this freedom is the right choice.
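The two resolved values above lead to the two outputs asked about earlier. The sketch below shows this under stated assumptions: the `Resolved` type, both `text_transform_*` variants, and `truncate` are hypothetical stand-ins, not the behaviour of any real implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Resolved:
    """Hypothetical resolved value: a value plus the options the function chose to expose."""
    value: str
    options: dict = field(default_factory=dict)


def text_transform_consuming(name: str, operation: str) -> Resolved:
    # Variant 1: the option is consumed when the value is built.
    value = name.upper() if operation == "uppercase" else name
    return Resolved(value, {})


def text_transform_deferring(name: str, operation: str) -> Resolved:
    # Variant 2: the value is untouched; the option is kept for formatting time.
    return Resolved(name, {"operation": operation})


def truncate(arg: Resolved, length: int, ellipsis: bool) -> str:
    # Hypothetical :truncate that looks only at the exposed value.
    cut = arg.value[:length]
    return cut + ("..." if ellipsis and len(arg.value) > length else "")


print(truncate(text_transform_consuming("Addison", "uppercase"), 5, True))  # ADDIS...
print(truncate(text_transform_deferring("Addison", "uppercase"), 5, True))  # Addis...
```

Which output is right depends on what the function authors intend, which is the freedom this comment argues for keeping.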

mihnita · 2024-05-12T21:51:11Z

exploration/dataflow-composability.md

+and the structure of resolved values is left completely
+implementation-specific.
+
+Providing a mechanism for custom formatters to inspect more


mihnita · 2024-05-12T22:08:31Z

I think this document focuses too much (only?) on the idea of the same function merging options or not.

But I don't think that is very interesting, or useful.
It only saves some typing.
And I don't think that using variations of the same parameter (with different options) is that common.

What is most interesting is composing functions by "chaining" them.
Functions that take one type + options and return a different type (or the same type), with new, or possibly modified options.

That would allow (for example) to do transformations on parameters.

Take a person and return date of birth.
Take a date and return days since that date.

Take a string and return a transformed version of it (changing case, or grammatical form).
Or return the original string, but with extra info attached as an option (for example, the result of a grammatical analysis).
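A minimal sketch of such chaining, where each function takes one type and returns another, might look like the following. The `Person` type, both functions, and the dates are hypothetical and chosen only to mirror the person → date of birth → days since example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    born: date


def date_of_birth(person: Person) -> date:
    """Takes a person, returns a date."""
    return person.born


def days_since(start: date, today: date) -> int:
    """Takes a date, returns the number of days between it and `today`."""
    return (today - start).days


# Chain the two functions; `today` is passed explicitly so the result is reproducible.
person = Person("Ada", date(2000, 1, 1))
days = days_since(date_of_birth(person), today=date(2000, 1, 31))
print(days)  # 30
```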

catamorphism · 2024-05-13T22:19:40Z

@mihnita That's fair; the document does focus a lot on composing options because that's the problem that comes up first with built-in functions. (Composing number with datetime, or vice versa, isn't too interesting.) There is an example in #753 ("Example B1") of what you're talking about. Since #753 is intended to land before this PR lands, do you think that's enough or would you prefer to see more examples like that?

catamorphism added 2 commits February 13, 2024 17:40

Add design doc for dataflow for composability (unicode-org#515)

315ce35

This proposed design doc addresses an issue titled "inspecting formattable values", which is really about dataflow through the formatter and structuring it to make function calls compose with each other.

Add pull request link

11c3493

catamorphism mentioned this pull request Feb 14, 2024

Update spec as if PR #645 was accepted #646

Closed

Add link to spec PR

faec2ad

catamorphism marked this pull request as ready for review February 14, 2024 02:00

catamorphism requested review from aphillips, eemeli, stasm and mihnita February 14, 2024 02:01

Add correct image for second diagram

a66b4c0

aphillips reviewed Feb 14, 2024

aphillips added the Agenda+ label Feb 14, 2024

aphillips added the normative, specification, and LDML46 labels and removed the Agenda+ label Feb 15, 2024

This was referenced Feb 22, 2024

Define resolved value formally #678

Closed

Add note to "Function Resolution" section about function argument and result types #686

Merged

This was referenced Mar 13, 2024

Spec test requires different functions to compose with each other #726

Closed

Clear up some ambiguities in terminology in formatting.md #723

Closed

macchiati reviewed Mar 13, 2024

eemeli mentioned this pull request Mar 14, 2024

Add Resolved Values and Function Handler sections to formatting #728

Merged

aphillips mentioned this pull request Mar 19, 2024

Match expressions should be re-usable in placeholders #736

Closed

catamorphism mentioned this pull request Mar 26, 2024

Add design doc on function composition #753

Merged

aphillips added the design label Apr 15, 2024

aphillips changed the title ~~Add design doc for dataflow for composability (#515)~~ [DESIGN] dataflow for composability (#515) Apr 15, 2024

catamorphism added 2 commits May 6, 2024 12:44

Add one more alternative

9a77841

Add another alternative

79ceb57

mihnita reviewed May 12, 2024

aphillips merged commit 76a676c into unicode-org:main May 20, 2024
1 check passed
