Create notes-2024-11-25.md

unicode-org · Nov 25, 2024 · ec9089d · ec9089d
1 parent 55e166a
commit ec9089d
Showing 1 changed file with 223 additions and 0 deletions.
diff --git a/meetings/2024/notes-2024-11-25.md b/meetings/2024/notes-2024-11-25.md
@@ -0,0 +1,223 @@
+# 25 November 2024 | MessageFormat Working Group Teleconference
+
+### Attendees
+
+- Addison Phillips \-Unicode (APP) \- chair  
+- Mihai Niță \- Google (MIH)  
+- Eemeli Aro \- Mozilla (EAO)  
+- Tim Chevalier \- Igalia (TIM)  
+- Elango Cheran \- Google (ECH)  
+- Mark Davis \- Google (MED)  
+- Richard Gibson \- OpenJSF (RGN)  
+- Matt Radbourne \- Bloomberg (MRR)
+
+
+**Scribe:** TIM  
+
+## [**Agenda**](https://github.com/unicode-org/message-format-wg/wiki#agenda)
+
+## Topic: Info Share
+
+EAO: JS implementation is up to date
+
+### Balloting results
+
+*Call to submit MessageFormat 2.0 for balloting by the working group. This is the last chance to discuss if we have accomplished our goals.*
+
+APP: 10 ballots, all approved. We have approved it for submission to CLDR TC. I see from Shane’s voluminous notes that the machinery is in place to start examining it there. I’ve made some personal replies to his comments; EAO, you had 1 or 2 and I’ve done some chair housekeeping. We’ll see how that plays out.
+
+MED: I also had an issue, I’ll paste it in [https://github.com/unicode-org/message-format-wg/issues/958](https://github.com/unicode-org/message-format-wg/issues/958)   
+– the result of some discussions about bidi. 
+
+APP: I see you’ve seen Eemeli’s comment on that, I haven’t read your response yet.
+
+MED: Can talk about it when you get to it
+
+APP: Any expectations on CLDR stuff?
+
+MED: I will take it into TC meeting; we’ve already said people should be reviewing it. It’s unanimous consensus of the WG that it be recommended for 46.1.
+
+APP: After today’s call, I will proceed to create a release in our repo, and we can work towards an HTML version of the spec as well. For now I’ll mark what we finished today as a 46.1 candidate.
+
+MED: Of course if there are small things, typos, formatting, non-material wording, etc., that can happen after the ballot and after it’s approved by the TC.
+
+MED: Talking about it the first week of December.
+
+EAO: As a forward-looking question, what is the release after the one that’s we’ve just balloted that we ought to be working towards? 47 or 48?
+
+MED and APP: It’s 47
+
+APP: We already have a label in the repo for 47\. That would be presumably our final, activating the stability policy and exiting tech preview.
+
+MED: The plan is for ICU to hit a draft status in release 77\. In order to be ready for 47, it has to hit the spec beta and the spec beta is mid-February. 
+
+EAO: I was wondering, what is the extent of changes that we ought to be even considering for implementation, based on the current one up until 47? Presumably we want this thing to be in the shape that we’re saying it’s fine for 47\. So should we actively limit the scope of what we could even consider as changes for 47? Not talking about later times, just 47 because we need to go final.
+
+APP: My proposal would be that we behave as if the stability policy were active, not consider material changes to things we think should be stabilized unless there’s a really strong reason. Listen to feedback we get, but will try to act as if we were serious and this is done. See how that holds up. Would also suggest we operate the function registry because I think that’s the place we’ll see the most proposals. Start to use mechanism for proposed vs. accepted changes. So not accept anything until 47\. The last thing I’ll note is that we have action items post-46.1 to ask important constituencies – TAG, ECMA, and another one – for comments, and of course listen to their comments.
+
+EAO: Noting that we have a near future meta-task of explicitly accepting the proposal process, because that’s still a proposed thing itself. So it’s a thing we agreed to do and not a proposed idea.
+
+APP: Need structure around how we handle those things. 
+
+MED: I would expect to have only – depending on the feedback, I’d expect any whole-cloth new functions be optional, because they are not subject to the stability policy.
+
+APP: Two things – and we have examples, the time zone option is proposed required and the unit function is proposed recommended – and so proposed things are not yet finalized.
+
+MED: I was using the wrong word. I would expect any new functions to be proposed and so not subject to the stability policy.
+
+APP: Right. And as we accept things, they would be either required or recommended. My guess is we will rarely put things in “required” because every implementation must be able to do whatever that thing is. Even low-capacity environments with more serious limitations. We might see something – `math` I think is required, we might do more things with `math`, more features like that.
+
+EAO: Sounds like this is a process that would be good to coordinate more closely with CLDR-TC. Seems from a reader’s point of view that whatever the process is for the MF parts of the LDML matches whatever is used in other parts of LDML.
+
+MED: We have and will continue to make changes in CLDR based on feedback from ICU, ICU4X, MF, and it might go the other way as well. We need to make sure that you know if CLDR is adding new capabilities, just to make sure you have FYIs. 
+
+APP: I think the next thing that would happen would be the ICU folks would look over their list of formatters and make proposals. There’s a range of those out there that want to be more than `icu:something`.
+
+MED: That sounds good. We’ll need to manage – look at how we take input, as well. That’s a topic for later.
+
+### Topic: Unit Formatting (#922)
+
+*Last week we discussed taking :unit as optional if our work was done. Propose merging it.*
+
+APP: It is proposed recommended, so would not be under the stability policy. Not required.
+
+EAO: I am ok with this going into 46.1 and probably 47 as proposed rather than required. I think there’s a bunch of text around that that still needs working on and discussion. As proposed, I’m fine with it because it requires nothing from nobody.
+
+APP: Are you sufficiently comfortable with it to say that we want to end up with a `unit` function and the changes we want to make are minor?
+
+EAO: Depends on your definition of minor. The discussion of what to support re: the units and usage options could look minor or could look major depending on how you look at it. I’m not sure where I stand, so I’m not willing to commit to getting it further than proposed for 47\.
+
+MED: 47 we don’t need to talk about it, the question is whether it’s proposed recommended for 46.1. I’ve not heard anyone speak against it; I recommend we put it in.
+
+APP: Does anyone object to my putting it in?
+
+(No objections)
+
+APP: Then I will make it our last addition
+
+EAO: Do we have language at the top of registry saying everything is required except when it says it’s recommended?
+
+APP: We have language like that. Let me find it. Section describes required functions, which are required, along with recommended functions that should be implemented. Don’t say anything about proposed – should I add something?
+
+EAO: No, that’s a later discussion, the acceptance of the process around this.
+
+APP: Both timezone and unit, I was careful to write the text for those in a suppositional way. “The function `unit` is proposed to be a recommended formatter and selector for unitized values.” It’s not “is a” but “is proposed to be”. It should be clear to readers that this is a not-done thing.
+
+APP: Should we squash? 
+
+## **Topic: PR Review**
+
+*Timeboxed review of items ready for merge.*
+
+| PR | Description | Recommendation |
+| ----- | ----- | ----- |
+| #923 | Test schema: allow src property to be either a string or array of strings | Discuss |
+| #922 | Implement :unit as Proposed RECOMMENDED in the registry | Merge |
+
+## Topic: Issue review
+
+[https://github.com/unicode-org/message-format-wg/issues](https://github.com/unicode-org/message-format-wg/issues)
+
+Currently we have 31 open (was 31 last time).
+
+* 5 are tagged for 46.1 (3 are blocker-candidate)  
+* 13 are tagged for 47  
+* 4 are tagged “Seek-Feedback-in-Preview”  
+* 6 are tagged “Future”  
+* 8 are `Preview-Feedback`  
+* 7 are `resolve-candidate` and proposed for close.  
+* 0 are `Agenda+` and proposed for discussion.  
+* 1 is a ballot
+
+### Issue 958 (bidi)
+
+MED: Two separate points. One that Eemeli replied to. One is about the HTML option and literal text. Main point is an implementation should be able to have options for those. Make sure the spec doesn’t exclude it. Want to make sure everyone’s reading of the spec does not exclude the ability to have an MF2 API that has options for those two things. One is generating HTML instead of the bidi controls, and the second is to be able to fiddle with the literal text in a pattern to make adjustments like “a” to “an”.
+
+EAO: My sense here is that we’re talking about beyond the scope of the spec contents. In the MF2 spec we’ve written so far, we say that implementations should provide outputs in ways that serve a user. But something to the effect that should allow something like HTML to be an output format directly. But also should allow for an intermediate output format like the formattedParts of JavaScript that can then be used to construct HTML as an output format.
+
+MED: These can actually be an option on the MF API for an implementation and one option would say “sure, you can muck with the literal text to make it grammatical” and the other would be “sure, instead of bidi controls you can emit HTML markup.”
+
+EAO: I think we allow for both. We have a requirement for, if you do string output, you have to provide a couple of capabilities. We do not set any upper or outer limits for what a message formatter could be outputting.
+
+APP: We don’t define other forms, but we do define the strict string. From that perspective, emitting HTML is a possibility, providing support for markup, providing a formatToParts capability which we do not define is permitted, could result in all sorts of these things. All we say is if you’re emitting a sequence of Unicode codepoints with nothing else, this is what you minimally must provide. Allow you to provide other bidi handling algorithms should you so desire, as long as you provide this one.
+
+MED: We were discussing in the ICU meeting, when we say it’s the standard bidi algorithm, do we mean that’s equivalent to the standard bidi algorithm – that is, we ensure the whole thing would go through the bidi algorithm under any permutation of placeholders with the same ordering as if you applied the bidi algorithm to the standard format. Question is whether or not we had to have a separate option for that.
+
+EAO: My understanding and intent with the language around this is that you would be required to provide some way of getting exactly the characters defined by the bidi algorithm. There’s one loophole in that we do not explicitly define how you define the directionality of a placeholder’s resolved value, which you could automatically detect from the contents, which could mean that a `:number` in one implementation would have its direction as `auto`, and another as `ltr`, given the same message. This would lead to a difference in the formatted string output. Not a hole I think we should be trying to plug.
+
+MED: Eemeli, you said it’s intentional that the default bidi algorithm doesn’t allow for optimizations as it’s meant to provide the same output in different implementations. I think that’s a stretch, b/c nothing about MF requires it to produce the same output in different implementations. We’ve said you have the freedom to screw around with options.
+
+EAO: The bidi algorithm is part of the implementation and it’s possible on different platforms to implement the same function handler that would give you the same behavior. The bidi algorithm is not something that a function handler provides. So we should ensure it does not create a difference in output. Suggesting that implementations provide the same output given same input, but not required we do so in all cases.
+
+APP: The other statement in your request is inflection, not bidi. I think we do not allow pattern parts that are literals to be modified in the course of formatting. Formatting says “emit them”. As Eemeli says, you could imagine a higher-level process than inflecting the results of that, but we don’t currently permit it because we don’t have an inflection feature at the moment.
+
+MED: What’s the difference between calling MF2 implementation with an input parameter that says “go ahead and fix problems in the literal text” vs. one that doesn’t have that feature, plus a wrapper everyone has to call… it doesn’t make any sense to me.
+
+APP: I can imagine creating some formatter or selector to do that in the future. Not currently in our spec. I’m calling out that the spec says to emit the literal. We aren’t necessarily precluding the future addition of an inflector. But at the moment I think that might not be the case. I don’t see that as related to bidi.
+
+MED: It’s related, but it’s a second issue. I’ll make a second issue for it.
+
+MIH: We cannot really get the same result, among others because the formatter functions can’t be guaranteed to give same results. That also includes bidi. The CLDR data for some locales do sprinkle bidi inside the dates, units etc.
+
+APP: Those are inside isolated –
+
+MIH: They’re inside the string generated by the date formatter. So not under control of our date formatter functions.
+
+APP: There should be nothing wrong with that
+
+EAO: Let’s consider the JS and ICU implementations and what I was trying to point out is, if these two differ in the formatter behavior, it’s possible to implement in JS an implementation of the standard formatters that exactly matches the behavior of a very specific version of ICU and use it instead of the built-in JS functions. Possible to have two implementations that give the exact same output.
+
+MIH: We probably can’t resolve it now, I can share an example of where Mark is coming from; we have a colleague who wrote a wrapper for placeholders to look inside the final string value of something and magically wrap it and all that. There was a lot of pushback on providing too many extras. Android bidi wrapper – the one in Android even has strategies you can set to determine if something is bidi. I think we got feedback that people don’t like a lot of extra bidi characters if they are not needed. So if there’s an Arabic message and the placeholder is all Arabic, why put in extra bidi controls saying it’s RTL?
+
+APP: Because there’s plenty of neutrals and so on that cause problems. And even things that detect a strongly LTR which aren’t. I have a whole panoply of examples that I mostly inflicted on Eemeli that demonstrate why any kind of evaluation based strictly on the character sequence can produce the wrong results. We mostly have a pretty good thing, produce the right results in bidirectional situations, and we have a couple of outs to reduce the number of bidi controls in purely LTR messages that are in LTR locales. This is about as good as I think can be done. Can permit other things to be implemented.
+
+EAO: Effectively this discussion is a lot about the formatting chapter of the formatting section of the spec. Looking at it now, I don’t think we strictly limit the sort of interpolation Mark was talking about. We don’t necessarily consider it, but it’s not explicitly or strongly forbidden in the text. We do give as a non-normative example, we say a formatter in a web browser could format a message as a DOM fragment rather than a representation of its HTML source. Re: The bidi and not wanting that, we do have this sentence: implementations SHOULD encourage users to consider a formatted localized string as an opaque data structure suitable only for presentation. Which in part to answer complaints about that being an opaque blob containing bidi controls. We are allowing an implementation to provide output that doesn’t have them, but also provide the default bidi strategy:
+
+APP: If we did our job right, you should never notice that we’re doing this. 
+
+APP: I don’t think there’s any action from that. Is there anything else we want to do with it?
+
+## **Topic: AOB?**
+
+APP: I don’t see any burning issues for us
+
+EAO: We need an explainer.
+
+APP: Do we want to put it in our repo, use Luca’s thing, or…
+
+EAO: Luca’s thing is documentation. I think we need something that could theoretically fit in one page and be a README or something close to it, so someone who knows nothing about this could quickly get an idea fo what and why.
+
+APP: We need an explainer and need to start a FAQ. I learned long ago to never write the same thing twice. I think that’s not spec. Are you volunteering?
+
+EAO: Sure.
+
+APP: I’m going to branch 46.1 and call it a release. An explainer will come, but I would consider it post-46.1 activity between now and 47\. 
+
+EAO: I think it’s a bit of a requirement for us to get reviews properly beyond just CLDR-TC.
+
+APP: Yes, in order to request from TAG, I have to have an explainer. One of the action items is to request that. Also ICU TC, and the third one is TC39. Do you all want to have a meeting next week? We won’t have any news from CLDR TC then, because their meeting doesn’t occur until the following Wednesday.
+
+EAO: I’d be fine with us skipping a week.
+
+APP: Next call will be the 9th. 23 is the week of Christmas, and then the 6th go to bi-weekly? Or keep weekly for now?
+
+EAO: I would say 9, 16, and then 6th of January or something.
+
+APP: I will make that our schedule.
+
+EAO: Question for Tim and Mihai, do you have plans on when to update the ICU implementations.
+
+MIH: In progress. I’m currently on vacation, but after that. When it’s ready, it’s ready.  
+ For certain things we need API approvals, design docs, design doc approvals, etc. 
+
+APP: I don’t recall if there’s going to be an interim release of ICU.
+
+MIH: No; it’s a long heavy process. Mark and Markus were saying it’s going to be 46.1 LDML, and for ICU they will put a timestamp or a tag on github saying “if you want to play with MF2, go to this tag and build it.”
+
+TIM: I have most of the updates implemented in PRs waiting for review. Some things like currency formatting are waiting for PRs to land before implementing. And like MIH said, some things need design docs, which I have and they’re waiting for review.
+
+EAO: Is there a spec yet for non-string output?
+
+MIH: Right now Java doesn’t do formatToParts. What formatters produce right now is very clunky and ugly, esp. in Java. If you call a DateFormat and you want a formatted thing so I know where the parts of the date are, it’s an ugly thing with an iterator that comes from Java 1.1. Every time I try to use it I spend a lot of time trying to understand what’s going on. So I would like to redesign that parts before building more stuff on top of it. I want to be able to say “give me the month part of the thing you just formatted”, etc., including overlapping ranges. We’ll see if the ICU TC approves that. If they don’t approve, what might happen in 77 is generating the same kind of iterator that all formatters return, and then we’ll have to deal with deprecating it and what-not.
+
+EAO: Another related thing for you to think about is that by default, we strip out all of the markup, so if you’re only going to output a string, outputting something XML-ish might be useful to some users who don’t want string output.