I propose that an extra annotation be made available in Pegex grammars, so that the result returned by an appropriate receiver class, possibly the default one, is a sensible Abstract Syntax Tree (AST).
A rule using this annotation would produce, on matching, a data structure like this:
{
  "name": "the variable name",
  "children": [
    # whatever the value and variablename productions returned
  ]
}
There are two additions here to Pegex as it stands:
the @ annotation means: do not make a "child" node, but a hash entry instead (like an XML attribute)
the - annotation means: include the annotated entity directly, rather than adding an extra level of child node
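To make the two annotations concrete, here is a minimal sketch in Python of how a receiver might assemble one node from the results of a rule's sub-rules. It is not Pegex code: the function build_node, the (marker, name, value) tuples, and the sample rule names are all hypothetical, and the reading of - as "splice the sub-rule's children into the parent" is an assumption based on the description above.

```python
def build_node(parts):
    """Assemble one node from the results of a rule's sub-rules.

    `parts` is a list of (marker, name, value) tuples (a hypothetical
    representation): marker "@" stores the value as a hash entry,
    marker "-" splices the value in directly, and an empty marker
    adds the value as an ordinary child.
    """
    node = {"children": []}
    for marker, name, value in parts:
        if marker == "@":
            # Hash entry on the node itself, like an XML attribute.
            node[name] = value
        elif marker == "-" and isinstance(value, dict):
            # No extra level: hoist the sub-node's children into this node.
            node["children"].extend(value.get("children", []))
        else:
            # Ordinary child (also covers "-" applied to a plain leaf value).
            node["children"].append(value)
    return node


example = build_node([
    ("@", "name", "x"),
    ("-", "value", {"children": ["1", "+", "2"]}),
    ("", "comment", {"children": ["note"]}),
])
```

Here example ends up with a "name" entry of "x" and a flat "children" list holding "1", "+", "2" followed by the untouched comment node.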
The details still need to be worked out, such as the exact shape of the data in JSON terms. A slight challenge is that this makes Pegex effectively return an XML document, which would probably only require each node to be a hash with up to three keys: type, attributes, children.
attributes would be a hash, obviously
type would be a string, and would be the rulename in this situation
children would be an array of nodes
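The correspondence with XML can be checked with a short sketch. The Python below assumes the three-key node shape just described and uses the standard library's xml.etree.ElementTree to serialize it; the helper to_xml and the rule names "assignment" and "number" are made up for illustration and are not part of Pegex.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a {type, attributes, children} node into an XML element."""
    # "attributes" and "children" are optional: a node has *up to* three keys.
    elem = ET.Element(node["type"], node.get("attributes", {}))
    for child in node.get("children", []):
        elem.append(to_xml(child))
    return elem


# Hypothetical AST for something like "x = 1".
ast = {
    "type": "assignment",
    "attributes": {"name": "x"},
    "children": [
        {"type": "number", "attributes": {"value": "1"}},
    ],
}

xml_text = ET.tostring(to_xml(ast), encoding="unicode")
```

Each node maps to one element: type becomes the tag, attributes become XML attributes, and children become nested elements.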
This need not interfere with Pegex receiver classes as they are currently written. The benefit of this additional feature is that, given the grammar, a single call to Pegex with the input text would return an AST, with no need to write a receiver class at all. This would make parsing completely language-independent.
This seems useful not just for parsing a language into an AST, but in any parsing situation, since I believe every parsing task can be viewed as building an AST and then using it. (I'm aware this doesn't address the choice between a tree and a stream of parsing events, but since XML itself can be processed as such a stream, that doesn't seem fatal.)
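As a rough illustration of why the tree-versus-stream question need not be fatal, a tree in the proposed shape can be replayed as SAX-style events with a few lines of Python. The generator walk_events, the event tuples, and the sample rule names are assumptions for this sketch, not an existing Pegex interface.

```python
def walk_events(node):
    """Yield start/end events for a {type, attributes, children} tree."""
    yield ("start", node["type"], node.get("attributes", {}))
    for child in node.get("children", []):
        # Depth-first, so events arrive in document order.
        yield from walk_events(child)
    yield ("end", node["type"])


# Hypothetical AST, using made-up rule names.
tree = {
    "type": "assignment",
    "attributes": {"name": "x"},
    "children": [{"type": "number", "attributes": {"value": "1"}}],
}

events = list(walk_events(tree))
```

A consumer that prefers events can iterate over walk_events(tree) without ever inspecting the tree structure itself.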
@ingydotnet points out that +(...) may help here. Also, having just reminded myself of the Pegex syntax, I see that -(...) already exists and may already do what is under discussion here.
This is based on "invisible XML" ideas by Steven Pemberton: https://homepages.cwi.nl/~steven/ixml/