-
I think there's a difference between the data model extensibility and the runtime model extensibility. Looking at your implementation, I think you're proposing both?

    export interface Function extends PatternElement {
      type: 'function';
      func: string;
      args: (Literal | Variable)[];
      options?: Record<string, Literal | Variable>;
    }

    export const formatter: PatternFormatter = {
      type: 'function',
      formatAsPart: formatFunctionAsPart,
      formatAsString: formatFunctionAsString,
      formatAsValue: formatFunctionAsValue,
      initContext: mf => mf.runtime
    };

I agree about the runtime extensibility. I'm cautious about the data model one. My hope was that the data model can be stable to increase its portability. Ideally, messages should be understandable by all tools in the pipeline. If the user introduces a custom extension of the data model (e.g. adds a new
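To make the distinction concrete, here is a minimal sketch of a closed data model combined with an open runtime. Everything in it is an assumption for illustration: `FunctionRef` stands in for the quoted `Function` interface (renamed to avoid shadowing the global), and `runtime`, `resolveArg` and `formatElement` are hypothetical names, not part of any proposal.

```typescript
// Hypothetical, simplified data model: a closed set of element types.
type Literal = { type: 'literal'; value: string };
type Variable = { type: 'variable'; name: string };
type FunctionRef = {
  type: 'function';
  func: string;
  args: (Literal | Variable)[];
  options?: Record<string, Literal | Variable>;
};
type DataModelElement = Literal | Variable | FunctionRef;

// Runtime side: implementations are looked up by name at format time.
type RuntimeFunction = (args: unknown[], options: Record<string, unknown>) => string;
type Scope = Record<string, unknown>;

// Object.create(null) avoids inherited keys such as "toString".
const runtime: Record<string, RuntimeFunction> = Object.create(null);

// A custom function is added here; no new data model type is needed.
runtime['upper'] = args => String(args[0]).toUpperCase();

function resolveArg(arg: Literal | Variable, scope: Scope): unknown {
  return arg.type === 'literal' ? arg.value : scope[arg.name];
}

function formatElement(el: DataModelElement, scope: Scope): string {
  if (el.type === 'literal') return el.value;
  if (el.type === 'variable') return String(scope[el.name]);
  const fn = runtime[el.func];
  if (!fn) throw new Error(`Unknown function: ${el.func}`);
  const options: Record<string, unknown> = {};
  const rawOptions = el.options || {};
  for (const key of Object.keys(rawOptions)) {
    options[key] = resolveArg(rawOptions[key], scope);
  }
  return fn(el.args.map(arg => resolveArg(arg, scope)), options);
}

const pattern: DataModelElement[] = [
  { type: 'literal', value: 'Hello, ' },
  { type: 'function', func: 'upper', args: [{ type: 'variable', name: 'who' }] }
];
const greeting = pattern.map(el => formatElement(el, { who: 'world' })).join('');
```

A tool that only knows the three element types can still read `pattern`; only the runtime has to know what `upper` does.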
-
This is a good summary. It's interesting to consider which elements can be expressed through other, more atomic elements. For example, I replaced In fact, my proposal started out with even variables being represented by a built-in function: This to me is a good example of how we can enforce rules and form expectations about the data model by introducing separate data types. I wouldn't want us to shy away from using this design tool.
-
In my mind (and in the EM model) the placeholder (which invokes a function) also handles "variables", "terms" (which are actually message references), and "elements".

Variables

A variable without a formatting function does not make much sense (if the variable is a date then it needs formatting).

Elements

The "elements" used for formatting also need functions. For example When you "render" this message in a GUI environment (let's say an HTML widget) the "render" function would generate That way you can reuse the message in various environments and platforms. So elements also need a function.

Message refs (terms)

And you need a function to load the referred messages from various places (resource bundles, database, etc.)

That's why variable / term / element are all bundled under "placeholder" in the EM model.
-
I'm not sure I fully understand what @eemeli is proposing here, if this is meant to be a mental model for how we work, or something reflected in the published spec. I agree with @stasm that looking at the proposals in terms of elements is a nice and tidy way of summarizing the models. As a mental model, thinking of elements as extensions, developing them independently, and then considering them for inclusion in the data model seems like a nice way of breaking our impasse and moving forward. If we're all in agreement that I think it would be a failure of the group if we publish a spec where almost everything in the data model is an extension. But as a conceptual model for how we could work, it seems like it would let us first focus on areas of agreement, and then have a process for how to build on those areas of agreement.
-
I actually favor this approach, and here is why:
In a way that makes sure (and proves) that all functions are "first class citizens".
-
I find this idea weird, and if someone came to me with a code PR like that for review, I would reject it.
There is no "function" that we know of that can be used both for formatting ( They should really be 2 different interfaces (traits).
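A minimal sketch of what "2 different interfaces" could mean in practice. The names `Formatter`, `Selector` and `pickVariant` are hypothetical, and the plural rule is a deliberately simplified English-only stand-in, not real CLDR behavior.

```typescript
// Hypothetical interfaces; names and signatures are illustrative only.
interface Formatter {
  format(value: unknown, options: Record<string, string>): string;
}

interface Selector {
  // Returns candidate variant keys, most specific first.
  select(value: unknown, options: Record<string, string>): string[];
}

const upperFormatter: Formatter = {
  format: value => String(value).toUpperCase()
};

// Simplified English-only plural rule, for illustration.
const pluralSelector: Selector = {
  select: value => (value === 1 ? ['1', 'one', 'other'] : [String(value), 'other'])
};

// Picks the first variant whose key the selector offered.
function pickVariant(keys: string[], variants: Record<string, string>): string {
  for (const key of keys) {
    if (Object.prototype.hasOwnProperty.call(variants, key)) return variants[key];
  }
  throw new Error('No matching variant');
}

const variants = { one: 'item', other: 'items' };
const single = pickVariant(pluralSelector.select(1, {}), variants);
const several = pickVariant(pluralSelector.select(3, {}), variants);
```

The two return types differ (a string versus a ranked list of keys), which is the practical reason a single shared interface ends up with members that one side has to stub out.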
-
To understand (and properly decide) what the functions look like we really need to decide what the runtime looks like, and how the rendering happens. There are a few relevant sections in this "whiteboard doc": It looks like about half of the "friction points" are around functions: And I think that a lot of these are caused less by "raw disagreements" and more by under-defined terms.
-
TLDR: I think that, until we define things, the discussion is premature.
-
I am kind of reluctant to duplicate a lot of the content that is already in the "whiteboard doc". I think that there are 3 areas that need clarification in order to decide what functions look like:

1. What they look like in "the registry"

More precisely, the "registry schema", or the file that will accompany the MF2 standard. Think of it as a header file in C / C++. At this level the info would look something like this:
The options part is less relevant here. The possible inputs are unclear.

2. What the model looks like when parsed

This might be different from the model ready to use when rendering. There is no need to have real implementations for the various functions. Similar to what an editor needs to know about HTML. So:
At this level what info needs to be present? With the examples above probably something like this:
Rendered as "Your card issued on Feb 2019, must be paid by September 30, 2021, or you will be charged late fees." And
Rendered: "The XYZ Conference moved from Oct 2-5 to Oct 9-12"

3. What the model looks like at runtime, and how the rendering is done

At this point we need real implementations for the various functions, and we also need to somehow "bring in" the various inputs.
What is the type of To render the first message we would need to somehow connect: At this point we need function implementations, and real types. We want to be able to use the same serialized message, parse, then render, on Win, MacOS, gettext, Android, browser, etc.

TLDR: looking at the examples above, and following the full "lifecycle", I think it is pretty clear that:

A. The data-model should stop at "parse level" (opinion).

B. We need a description of the rendering (runtime?) behavior.

C. The model can't specify the types of the inputs for the functions. Something like "Date" is platform / framework specific. But I think that the good news is: we don't need to (opinion).

D. It would be "unclean" for the model to force on all platforms that "functions" must be class-like at rendering time. I should be free to use function pointers / references, lambdas, something else, whatever. It's an implementation detail. We only need this at rendering time. As an implementer I should be free to implement a runtime registry as one single map from string to classes with It's an implementation detail, not a data model issue.

E. A simple
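Points A, C and D can be sketched as follows, under stated assumptions: the `Placeholder` shape, the `DATE` / `STRING` names and `renderRegistry` are hypothetical, and an ISO date string stands in for real locale-aware formatting (the `skeleton` option is carried in the model but ignored by this toy implementation).

```typescript
// Parse-level model: names only, no platform types (hypothetical shape).
type Placeholder = { name: string; func: string; options: Record<string, string> };
type MessagePart = string | Placeholder;

// What the renderer needs from a function, however it is implemented.
type RenderFn = (input: unknown, options: Record<string, string>) => string;

// A class-based implementation; the platform type (Date) only appears here.
class IsoDateFormatter {
  format(input: unknown): string {
    if (!(input instanceof Date)) throw new TypeError('DATE expects a Date');
    return input.toISOString().slice(0, 10); // stand-in for locale-aware output
  }
}
const isoDate = new IsoDateFormatter();

// One map from name to callable; class-backed and lambda entries coexist.
const renderRegistry: Record<string, RenderFn> = Object.create(null);
renderRegistry['DATE'] = input => isoDate.format(input);
renderRegistry['STRING'] = input => String(input);

function renderMessage(parts: MessagePart[], inputs: Record<string, unknown>): string {
  return parts
    .map(part => {
      if (typeof part === 'string') return part;
      const fn = renderRegistry[part.func];
      if (!fn) throw new Error(`Unknown function: ${part.func}`);
      return fn(inputs[part.name], part.options);
    })
    .join('');
}

const dueMessage: MessagePart[] = [
  'Must be paid by ',
  { name: 'dueDate', func: 'DATE', options: { skeleton: 'yMMMd' } },
  '.'
];
const rendered = renderMessage(dueMessage, { dueDate: new Date(Date.UTC(2021, 8, 30)) });
```

Nothing in `dueMessage` says whether `DATE` is a class, a lambda or a function pointer, and nothing in it mentions `Date`; both choices live entirely in the registry.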
-
I am at this point not pushing for the EM model, or against any model. But I want to explain how the EM model answered the problems described above:

1. The "schema" / registry

All functions only take one parameter, of type There is no need for timeStamp / instant / date / calendar named inputs. That is resolved in the implementation, by looking at the runtime-type info (RTTI). So in the EM model it is not possible to pass As such the inputs don't need names, it's only one parameter:
Since all functions take one input and one input only, there is no real need to list it in the registry.

2. Parse time

The examples look 100% like described before:
The

3. Rendering time

The parameters passed to So the implementation is something like this (pseudo-code):
Output is string for The selector functions are called completely differently, yet another reason to not "dump" them together with the formatter ones. The "map function name to real callable" part is solved by the "runtime registry". In terms of interfaces: So a developer would register functions like this: Real implementation:
Or (yes) one can wrap all 3 kinds of functions in one single object. I personally don't think I like the idea, and I would not do it. But as long as this is not visible in the data-model any implementer can do whatever they want. And in the EM model this is not visible.
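As a rough illustration of the "one input, resolved by RTTI" idea with a registry that keeps formatters and selectors apart, here is a sketch. It is not the EM model's actual API: `FormatterFn`, `SelectorFn`, `RuntimeRegistry` and the `DATETIME` name are all assumptions made for the example.

```typescript
type FnOptions = Record<string, string>;
type FormatterFn = (value: unknown, options: FnOptions) => string;
type SelectorFn = (value: unknown, options: FnOptions) => string[];

// One unnamed input: the implementation inspects the runtime type itself.
const formatDateTime: FormatterFn = value => {
  if (value instanceof Date) return value.toISOString();
  if (typeof value === 'number') return new Date(value).toISOString(); // epoch millis
  throw new TypeError('Unsupported input for DATETIME: ' + typeof value);
};

// Hypothetical runtime registry with separate tables per kind of function.
class RuntimeRegistry {
  private formatters: Record<string, FormatterFn> = Object.create(null);
  private selectors: Record<string, SelectorFn> = Object.create(null);

  registerFormatter(name: string, fn: FormatterFn): void {
    this.formatters[name] = fn;
  }
  registerSelector(name: string, fn: SelectorFn): void {
    this.selectors[name] = fn;
  }
  getFormatter(name: string): FormatterFn | undefined {
    return this.formatters[name];
  }
  getSelector(name: string): SelectorFn | undefined {
    return this.selectors[name];
  }
}

const emRegistry = new RuntimeRegistry();
emRegistry.registerFormatter('DATETIME', formatDateTime);
```

Because the registry is only consulted at rendering time, whether it uses two tables, one table, or one object wrapping several kinds of function stays invisible to the data model.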
-
First, let me clarify what I'm trying here: I am trying to understand what the proposal is. I am not arguing (yet?) against or pro some features.
Yes, I think it would be enough. But I am open to listen.
I would expect that all functions will need to have some kind of "context". Basically "the implementation captures in a context everything that is needed for the implementation to work" I think that it is not really possible to define what the context contains. Or the function would have access to the parent message, which has a locale. No need for that to be in the context. For example to load strings using the framework that uses MF2 one would need access to native functionality. Stuff like LoadString on Windows, or a Resources.getString on Android, or NSLocalizedString on MacOS. Can we force them to use one locale and one locale only? At least for now I see these "functions" the same way Java (and other languages) see interfaces: Whenever an interface definition forces the implementer to write some dummy function, or pass a null parameter,
You are right, that is not well described. I'll try to say it differently: there would be one model that is the output of the parser. Should we stop at only specifying the result of the parse model?
Is that the same thing as a Map?
100% agree. I am only confused by the model (as proposed by Stas) defining all kinds of types (string, decimal, boolean, etc). And are we still talking about the Stas model, or are you proposing a fourth model? Because the RuntimeValue there seems quite confusing to me. It is used both as the return of a formatting function:
How can I represent "format to parts" with the few basic types defined? And for Or RuntimeValue only means some "unknown type packed with type info"? If that is all there is, do we even need it, if the programming language I'm using has RTTI?
Yes.
How do we get to them "at render time?" Because I don't see explained anywhere how things are rendered.
Basically I don't think it is clear how functions and "variables" (or whatever the name is now) come together, and when.
No.
Sure, we agree on this.
I think that only makes things "clunky". It is also less work when I try to render (format) the message. Worse, we might need a RuntimeDateValue, RuntimeCalendarValue, RuntimeInstantValue. We should not burden the spec (and the devs) with extra abstractions, unless they are really needed.
I don't see how.
Also agreed. It is hard to NOT agree when the statement contains "when it makes sense"
If the arguments come as an array (as proposed) there is no way to extend things. All the functions that take 2 RuntimeValues are "merged" into one. For example, if we have a formatRange( int startIndex, int endIndex ), in this data model they will be an array with 2 IntegerLiterals. That's why people like named parameters.
So who owns the methods? The RuntimeValue? Is there some implied decision already (that we should make explicit) that we do:
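A small sketch of the formatRange concern. `IntegerLiteral` follows the name used above; the `int` helper, `RangeArgs` and both `formatRange*` functions are hypothetical and only serve to contrast positional with named arguments.

```typescript
type IntegerLiteral = { type: 'integer'; value: number };

// Small helper to build literals for the example.
const int = (value: number): IntegerLiteral => ({ type: 'integer', value });

// Positional: the data model only sees "an array with 2 IntegerLiterals".
function formatRangePositional(args: IntegerLiteral[]): string {
  return `${args[0].value}-${args[1].value}`;
}

// Named: each argument carries its role, so order no longer matters.
interface RangeArgs {
  startIndex: IntegerLiteral;
  endIndex: IntegerLiteral;
}
function formatRangeNamed(args: RangeArgs): string {
  return `${args.startIndex.value}-${args.endIndex.value}`;
}

const positional = formatRangePositional([int(2), int(5)]);
const swapped = formatRangePositional([int(5), int(2)]); // silently changes meaning
const named = formatRangeNamed({ endIndex: int(5), startIndex: int(2) });
```

With the array form, a tool or translator that reorders the two literals gets no error, and adding a third optional argument later changes what every position means; the named form avoids both.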
Or we do
or
or maybe something else? (I have a factory, for example) But the core question here is: The code seems to indicate functional. Sorry, I am looking through this again, and it feels very argumentative. It's just that I don't see how things come together. The main reason is not "people can't read code" (of course they can) For example in Stas's code there is stasm\third\impl\context.ts And at the end the What happens if I add "DateRuntimeValue"? I think that a proof-of-concept implementation should simulate the "real world": I get ICUNext, or FluentNext, or ECMAScript.Intl.MessageFormat, and WITHOUT changing the code I add my own types to format, or to select on.
-
P.S. Don't feel pressured to answer over the weekend (the way I did). Even if you do, we can't expect all the group members to "parse" this thread and get an informed opinion.
-
Mostly because that description is too vague to discuss without clarifying what that means. What does The most critical missing piece is what is "known" after a parse, and what is known at "render time". Does it mean that the If that's the case, then there is no need for a function to format a list, just write "foo, bar or baz". So if you don't want to clarify it here, I hope these points are clarified in the presentation to the group.
-
Quoting from that exact block:
For me the part that is unclear is the That's what I try to understand. In my mind there has to be "something like that", even in Fluent. So, what is "the thing" that brings This is what I tried to clarify in the "7. Variable references" section of the "whiteboard doc" I had with Stas. All that directly affects the function signatures.
-
I don't know. It is your proposal, and I am trying to understand it.
I'll try to map things to a different set of concepts, to see if I get it: Would it be fair to say that "FormattableDateTime" is a formatter in the ICU sense? And would that make So there is another "secret function" that takes something like
and takes the current value of
Is that the case? If yes, then I have to think about the implications. But before spinning my wheels in the wrong direction I would like to understand if this is the case, and where the value of that
-
Absolutely. But when judging APIs I usually prefer getting in the shoes of a random developer who uses that API directly, usually only reading the documentation when they get stuck. So what I want here is to clarify how things are expected to behave.
I have the nagging feeling that we talk about the same thing, but somehow we don't understand each other. So I keep rephrasing the same question in all kinds of ways, hoping it clicks at some point... Here is another way for me to ask it... Does When I parse the syntax (whatever that is) and I have a model, we end up with a If the syntax was "...{dueDate, DATE, ::yMMMd}...", what would the PatternElement[] look like? Can I do
and get different results? That means that the value to be formatted is not part of the Or not? So the
OK, in my answer I did not want to say anything about Fluent or the designers of Fluent. So, having the designers available: very-very-very valuable. The value is added when the original developers answer questions, not when they send us to existing docs or code. I value the experience of someone who used a system for a long time over the person who designed it. The power users can tell you why some of the choices were bad and some of the choices are good. If I design something I will think it is the best thing since sliced bread. But if the power user that has experience with my tool and 20 other tools tells me that I could have done better, I should listen. And adoption is not a criterion. ICU has very high adoption as a library. Python 2 adoption is huge. But we still have Python 3.
-
I think I am starting to (slowly?) understand... So there is a model after parse, that does not contain any But then at some later point there is a model where (some of?) the "functions" are replaced by Is this accurate? If this is the case, then these are the areas that I think are getting in the way of understanding: There are in fact 2 models:
We can call these 2 models, or the same model transformed, whatever, tbd This is not explicit anywhere. And I think that trying to describe 2 models (or 1 model with transformations in time) is what makes things confusing. So the runtime workflow is something like this?
I called the Because at least for me this was not at all clear (still not sure that's the case) This is what I asked in the "whiteboard document", the "9. Various “states” for in which the model can be" section. I think reading that document might help reduce all this back and forth, as I was trying there to understand the model proposed by Stas, which is what I think all of these PRs are trying to implement. I think that the "naming" also gets in the way of understanding, but let's take it one at a time...
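To pin down what "2 models" could mean, here is a minimal sketch: a parsed model that holds only names, and a resolved model where each placeholder has been bound to a value and a callable. The type names (`ParsedElement`, `ResolvedElement`), the `resolvePattern` / `formatResolved` functions and the ISO-date `DATE` function are all assumptions for illustration, not anything taken from the PRs.

```typescript
// Model 1: output of the parser. Names only, no values, no callables.
type ParsedElement = string | { variable: string; func: string };

// Model 2: exists only while formatting. Values and callables are bound.
type FormatFn = (value: unknown) => string;
type ResolvedElement = string | { value: unknown; format: FormatFn };

function resolvePattern(
  parsed: ParsedElement[],
  scope: Record<string, unknown>,
  functions: Record<string, FormatFn>
): ResolvedElement[] {
  return parsed.map((el): ResolvedElement => {
    if (typeof el === 'string') return el;
    const format = functions[el.func];
    if (!format) throw new Error(`Unknown function: ${el.func}`);
    return { value: scope[el.variable], format };
  });
}

function formatResolved(resolved: ResolvedElement[]): string {
  return resolved.map(el => (typeof el === 'string' ? el : el.format(el.value))).join('');
}

// Parsed once, then resolved and formatted with different values.
const parsedOnce: ParsedElement[] = ['Due: ', { variable: 'dueDate', func: 'DATE' }];
const dateFunctions: Record<string, FormatFn> = {
  DATE: value => (value instanceof Date ? value.toISOString().slice(0, 10) : String(value))
};
const firstRun = formatResolved(
  resolvePattern(parsedOnce, { dueDate: new Date(Date.UTC(2021, 8, 30)) }, dateFunctions)
);
const secondRun = formatResolved(
  resolvePattern(parsedOnce, { dueDate: new Date(Date.UTC(2019, 1, 1)) }, dateFunctions)
);
```

In this sketch the value to be formatted is never part of `parsedOnce`, which is why the same parsed model gives different results per call; whether the resolved form needs to be exposed at all is exactly the open question.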
-
I am sorry, it looks like my answer about "invaluable" upset you. So I will just "walk away" from that discussion, to not make it worse. If you really want to hear my take on it, ask, and I will try to put it as neutrally as possible. Maybe not in this GitHub discussion. And it is all my personal opinion. Personal opinions can of course be wrong. It's not science.
-
I was thinking that this is what you describe as "function" (things that are registered, like "DATE") ====
It is kind of a different state of the model. But they are conceptually different. My main question then is: if this "second state" exists only inside the format call, why expose it? It is clearly possible to implement format / formatToParts without going through this state.
-
I am perfectly happy with that, and I would like to see if we can agree on the implications. For example this means:
What is important is the signatures of the 3 methods, and describing what they do. We specify the methods exposed by an implementation of the "runtime registry" and that's good enough
100% to that. But in principle I agree that a reduced set of "core types" is the way to go.
Maybe I don't understand some of the explanations in this discussion. So if there is no reason to represent something as
There is a PR adding it to the spec: #198 This is the reason I use "fuzzy lingo" when I talk about implementation details, and I put things in quotes, so that nobody confuses a possible implementation with the spec.
-
This is a very lengthy discussion. So I may be missing part of the discussion. I like the fact that skeleton has been mentioned for dates. An explicit date is probably not a good idea. Though for our use case of grammatically correct sentences we have a modified relative date, which is a little harder to do, but it goes to show a generic problem with supporting prepositions. Here are 2 example scenarios that I'd like to see supported with this MessageFormat. These are just 2 scenarios that we frequently struggle with. Dates with a preposition
Notice that the "on" preposition is sometimes there. In some languages, the preposition "at" varies depending on whether it's 1 or not 1 for the time. What's more interesting is that some languages (i.e. Brazilian Portuguese) can format the time as 13:00 and then verbally say "1 in the afternoon" for the same time, which is confusing if the preposition is written one way and the entire phrase is spoken differently. I'm unsure if CLDR can fully support this scenario. Locations with a preposition
In a language like French, the rules are a little more complicated for choosing the right preposition. Can we handle such situations with the current proposals? I'm not saying that it needs to be solved, but the infrastructure should be able to handle them.
-
tl;dr We should split up the spec, making variables, functions, terms and elements extensions of the core.
Recent work with and discussions about the MF2 spec have led me to realise that there's a somewhat obvious level at which the spec may be made extensible: pattern elements. So far, our discussions have identified at least five possible types:

- `literal`: immediately defined values
- `variable`: values defined at runtime
- `function`: placeholders or formatting functions
- `term`: including messages within other messages
- `element`: formatting and styling elements

Our discussions of the data model have thus far been based on the premise that some explicit monolithic set of such elements is the correct answer, but we don't really agree on which:

- `literal`, `function`
- `literal`, `variable`, `function`, `term`
- `literal`, `variable`, `function`

Rather than trying to resolve these various schools of thought into one, we should agree to disagree, and build the spec so that each element is defined as an extension of the core spec. By building an example implementation of such a modular system (see `src/pattern/` for implementations), I've determined that an appropriate interface for such a pattern element formatter might look like this:

The scope of that interface also indicates how these pattern elements are different from the formatting functions available via `function`: They require more complex setup, and have a more complex API. Essentially, these two layers answer rather different questions:

- Formatting functions (`function`) should be relatively simple to implement. For example, the functions required for MF1 and Fluent compatibility are all 1-6 lines of code each. They're allowed to throw errors, and return a single value.

For example, `term` is somewhat similar to `function` in that it enables relatively simple ways to define a value that may be embedded within a message, but leaves its definition controlled by the localiser or translator rather than a programmer.

Adopting a modular approach as presented here should make it easier to focus on and agree to individual parts of the data model (as we recently did on the interface of `function`), as well as providing a way for external interfaces to communicate requirements and expectations about what they support. For instance, MessageFormat 1 compatibility requires `literal`, `variable`, `function`, while Fluent also needs `term`. Similarly, a translation provider could e.g. claim complete `function` configurability, while needing to process `element` as pass-through values.
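A minimal sketch of how pluggable pattern element formatters could fit together. Only the five member names (`type`, `initContext`, `formatAsPart`, `formatAsString`, `formatAsValue`) come from the formatter object quoted in the replies; every signature, the `MessageFormatLike` and `FormattedPart` shapes, and the `formatters` table are assumptions rather than the actual interface.

```typescript
// Assumed shapes; only the PatternFormatter member names come from the thread.
interface PatternElement {
  type: string;
}
interface FormattedPart {
  type: string;
  value: string;
}
interface MessageFormatLike {
  runtime: Record<string, (...args: unknown[]) => unknown>;
}

interface PatternFormatter {
  type: string;
  initContext(mf: MessageFormatLike): unknown;
  formatAsPart(ctx: unknown, elem: PatternElement): FormattedPart;
  formatAsString(ctx: unknown, elem: PatternElement): string;
  formatAsValue(ctx: unknown, elem: PatternElement): unknown;
}

interface LiteralElement extends PatternElement {
  type: 'literal';
  value: string;
}

// The simplest possible extension: literals need no context at all.
const literalFormatter: PatternFormatter = {
  type: 'literal',
  initContext: () => null,
  formatAsPart: (_ctx, elem) => ({ type: 'literal', value: (elem as LiteralElement).value }),
  formatAsString: (_ctx, elem) => (elem as LiteralElement).value,
  formatAsValue: (_ctx, elem) => (elem as LiteralElement).value
};

// The core only knows how to dispatch on the element's type name.
const formatters: Record<string, PatternFormatter> = Object.create(null);
formatters[literalFormatter.type] = literalFormatter;

function formatPattern(mf: MessageFormatLike, pattern: PatternElement[]): string {
  return pattern
    .map(elem => {
      const fmt = formatters[elem.type];
      if (!fmt) throw new Error(`Unsupported element type: ${elem.type}`);
      // Simplification: the context is rebuilt per element rather than cached.
      return fmt.formatAsString(fmt.initContext(mf), elem);
    })
    .join('');
}
```

An implementation claiming, say, MessageFormat 1 compatibility would register formatters for `literal`, `variable` and `function` and reject or pass through anything else.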