The goal of the Semantic UAST is to provide a set of UAST node types with a strictly defined semantic meaning that does not depend on the programming language.
Semantic UAST types are defined in the Babelfish SDK on top of the schema-less representation.
The `@type` field in Object nodes is used to determine an exact type in
the Semantic UAST type system. Besides the Semantic UAST types, drivers
may emit language-dependent node types that are not yet covered by
Semantic UAST concepts.
UAST node types can have a namespace similar to XML namespaces.
For example, the Java AST defines an `Identifier` node type, while the Go AST defines
a similar type called `Ident`, and the Semantic UAST has its own
concept called `Identifier`.
To distinguish between these node types, a `lang:` prefix (the language name followed by a colon) is added to
each language-specific type, and the `uast:` prefix is added to types defined by the Semantic UAST.
The prefix without the `:` is called a namespace.
For our example, the types listed above are written in the following form
once namespaces are added: `java:Identifier`, `go:Ident`, `uast:Identifier`.
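
The namespace convention above can be illustrated with a small sketch. The helper names `split_type` and `is_semantic` are hypothetical and are not part of the SDK; the sketch only assumes that a type is a string with an optional `namespace:` prefix.

```python
from typing import Tuple


def split_type(node_type: str) -> Tuple[str, str]:
    """Split a type such as 'uast:Identifier' into (namespace, name)."""
    namespace, sep, name = node_type.partition(":")
    if not sep:
        # No ':' found: treat the whole string as a name without a namespace.
        return "", node_type
    return namespace, name


def is_semantic(node_type: str) -> bool:
    """Return True when the type belongs to the 'uast' namespace."""
    return split_type(node_type)[0] == "uast"
```

For instance, `split_type("go:Ident")` yields the namespace `go` and the name `Ident`, while `is_semantic("uast:Identifier")` is true.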
As described in the schema-less representation spec, object fields starting with `@`
are considered internal and may be
present on any object regardless of the type (schema).
This UAST specification defines a few more special fields:
- `@pos` - stores the positional information related to this UAST node as a `Positions` node, which in turn has `start` and `end` nodes of type `Position`. See the `Positions` type below for more details.
- `@token` - a text representation of this node in the source file. This field is only available for compatibility reasons. If available, `@pos` should be used to get the source code corresponding to the UAST node.
- `@role` - stores an array with role codes. This field can be used to interpret native AST types that are not yet covered by the Semantic UAST.
All other fields are defined by the Semantic UAST schema.
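
As a rough illustration of the split between internal and schema fields, the sketch below models a node as a plain Python dictionary. The helper `split_fields` and the role value shown are hypothetical placeholders, not names taken from the SDK.

```python
from typing import Tuple


def split_fields(node: dict) -> Tuple[dict, dict]:
    """Separate internal fields (keys starting with '@') from schema fields."""
    internal = {k: v for k, v in node.items() if k.startswith("@")}
    schema = {k: v for k, v in node.items() if not k.startswith("@")}
    return internal, schema


# A hand-written node for the identifier `foo` at the start of a file.
example_node = {
    "@type": "uast:Identifier",
    "@token": "foo",
    "@role": ["Identifier"],  # placeholder role code, for illustration only
    "@pos": {
        "@type": "uast:Positions",
        "start": {"@type": "uast:Position", "offset": 0, "line": 1, "col": 1},
        "end": {"@type": "uast:Position", "offset": 3, "line": 1, "col": 4},
    },
    "Name": "foo",
}

internal_fields, schema_fields = split_fields(example_node)
```

Here `schema_fields` contains only `Name`, the single field that the `uast:Identifier` schema defines.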
Types are defined in the SDK. In case of doubt use that source file as reference.
Object that stores all positional information for a node. This node kind
@type: uast:Positions
Field | Type | Description |
---|---|---|
`start` | `uast:Position` | Start position of the node. |
`end` | `uast:Position` | End position of the node. |
`*` | `uast:Position` | Any number of custom positional fields. |
Keys of this object can be arbitrary names for positional fields of the
UAST node. Only two fields are predefined, `start` and `end`, to allow users
to access the source snippet related to the node.
As an example of custom positional information, a ternary operator
`x ? y : z` node may store individual positions for the `?` and `:` characters
as separate `then` and `else` fields in its `Positions` node. The `Positions` node is
always stored in the parent node under the `@pos` property.
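
The ternary example can be sketched with plain dictionaries. This is only an illustration: the helper `snippet` is hypothetical, and it assumes that the `end` offset points just past the last byte of the node (an exclusive end), which the text above does not state.

```python
source = b"r = x ? y : z;"


def pos(offset: int, line: int, col: int) -> dict:
    """Build a uast:Position-shaped dictionary."""
    return {"@type": "uast:Position", "offset": offset, "line": line, "col": col}


# Positions for the `x ? y : z` expression, with custom `then` and `else`
# fields pointing at the `?` and `:` characters.
ternary_pos = {
    "@type": "uast:Positions",
    "start": pos(4, 1, 5),
    "end": pos(13, 1, 14),  # assumed exclusive end
    "then": pos(6, 1, 7),
    "else": pos(10, 1, 11),
}


def snippet(src: bytes, positions: dict) -> bytes:
    """Return the source bytes between the start and end offsets."""
    return src[positions["start"]["offset"]:positions["end"]["offset"]]


ternary_source = snippet(source, ternary_pos)
```

Slicing works on bytes rather than on a decoded string because offsets are defined as byte offsets.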
Represents a position in a source code file. Cannot have any fields except
the ones defined below. Belongs to a `Positions` parent node.
@type: uast:Position
Field | Type | Description |
---|---|---|
`offset` | `Uint` | Position as an absolute byte offset (0-based index). |
`line` | `Uint` | Line number (1-based index). |
`col` | `Uint` | Column number (1-based index). The byte offset of the position relative to a line. |
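
To make the relation between the three fields concrete, here is a minimal sketch that derives `line` and `col` from a byte offset. The function `position_at` is hypothetical, and it assumes that lines are separated by `\n`.

```python
def position_at(source: bytes, offset: int) -> dict:
    """Compute a uast:Position-shaped dictionary for a 0-based byte offset."""
    if not 0 <= offset <= len(source):
        raise ValueError("offset is outside of the source")
    # Lines are 1-based: count the newlines that precede the offset.
    line = source.count(b"\n", 0, offset) + 1
    # rfind returns -1 when there is no earlier newline, so the first line
    # starts at byte 0.
    line_start = source.rfind(b"\n", 0, offset) + 1
    return {
        "@type": "uast:Position",
        "offset": offset,
        "line": line,
        "col": offset - line_start + 1,  # 1-based, counted in bytes
    }


src = "a = 1\nñ = 2\n".encode("utf-8")
equals_sign = position_at(src, 9)  # the '=' on the second line
```

Because `ñ` occupies two bytes in UTF-8, the `=` on the second line gets `col` 4 even though it is the third character of that line.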
An identifier is a name for an entity. The name can be any valid UTF-8 string.
@type: uast:Identifier
Field | Type | Description |
---|---|---|
`Name` | `String` | An identifier name. |
A UTF-8 string literal. The `Format` parameter is a driver-specific string
format that was used for the literal in the source file.
@type: uast:String
Field | Type | Description |
---|---|---|
`Value` | `String` | An unescaped and unquoted UTF-8 string value. |
`Format` | `String` | Driver-specific format that was used for the literal in the source file. |
Qualified name consists of multiple identifiers organized in a hierarchy. Identifiers are stored starting from the root level of hierarchy to the leaf. The closest analogy is the filesystem path.
@type: uast:QualifiedIdentifier
Field | Type | Description |
---|---|---|
`Names` | `[]uast:Identifier` | Path elements, starting from the root of the hierarchy and ending at the leaf. |
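
As an illustration of the root-to-leaf ordering, the sketch below builds a qualified identifier from a dotted name and joins it back. Both helpers are hypothetical, and the `.` separator is an assumption that varies by language.

```python
def qualified(path: str, sep: str = ".") -> dict:
    """Build a uast:QualifiedIdentifier-shaped node from a separated path."""
    return {
        "@type": "uast:QualifiedIdentifier",
        # The first element is the root of the hierarchy, the last is the leaf.
        "Names": [
            {"@type": "uast:Identifier", "Name": part}
            for part in path.split(sep)
        ],
    }


def join_names(node: dict, sep: str = ".") -> str:
    """Join the identifier names of a qualified identifier back into a path."""
    return sep.join(ident["Name"] for ident in node["Names"])


list_type = qualified("java.util.List")
```

In this example `java` is the root element and `List` is the leaf.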
Comments can span any number of lines. The `Block` flag indicates that the
comment uses block syntax (`/* ... */` in Go) instead of line-comment
syntax (`//` in Go).
Comments might have a prefix and a suffix for the whole comment, and each
comment line may also be prefixed with a `Tab` to express the following pattern:

    /*
    * This is a multiline
    * block comment
    */

In this case `Prefix` and `Suffix` will be set to `"\n"`, and
`Tab` will be set to `"* "`.
@type: uast:Comment
Field | Type | Description |
---|---|---|
`Text` | `String` | An unescaped comment text (UTF-8). |
`Prefix` | `String` | A prefix added to the first line of the comment. |
`Suffix` | `String` | A suffix added to the last line of the comment. |
`Tab` | `String` | A prefix added to each line of the comment. |
`Block` | `Bool` | True if the comment uses block (multi-line) comment syntax. |
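
The way these fields fit together can be sketched by rendering the block comment shown earlier back into source text. This is an assumption about how the fields compose (opening delimiter, `Prefix`, each line of `Text` prefixed with `Tab`, `Suffix`, closing delimiter), not a description of what any driver does; the function `render_block_comment` is hypothetical and uses Go-style delimiters.

```python
def render_block_comment(comment: dict) -> str:
    """Rebuild the source text of a block comment from its fields."""
    if not comment["Block"]:
        raise ValueError("only block comments are handled in this sketch")
    # Prefix every line of the text with Tab.
    body = "\n".join(comment["Tab"] + line for line in comment["Text"].split("\n"))
    return "/*" + comment["Prefix"] + body + comment["Suffix"] + "*/"


multiline_comment = {
    "@type": "uast:Comment",
    "Text": "This is a multiline\nblock comment",
    "Prefix": "\n",
    "Suffix": "\n",
    "Tab": "* ",
    "Block": True,
}

rendered = render_block_comment(multiline_comment)
```

Under these assumptions, `rendered` reproduces the four-line comment from the pattern above.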
Block groups multiple statements and enforces sequential execution of these statements.
Eventually, a block will also include a reference to a scope if it defines one.
@type: uast:Block
Field | Type | Description |
---|---|---|
`Statements` | `[]Node` | An ordered list of statements. |
Aliases provide a way to assign a name to an entity or give it an alternative name in a specific scope. An alias acts like an immutable binding of a name to an object. The only way to reassign the name used by an alias in a specific scope is to shadow it in a new child scope.
An alias should contain a reference to the scope where the name should be defined. But since scopes are not covered by the current spec, the actual definition of this relation will be specified in the future.
Examples of aliases are names for types, constants, functions, local names for imports, local names for imported symbols, etc.
@type: uast:Alias
Field | Type | Description |
---|---|---|
`Name` | `uast:Identifier` | A name that is assigned to an entity. |
`Node` | `Node` | An entity that will be aliased by a new name. |
Imports are statements that can load external modules into a program or a library.
An import declaration is a static statement: its effect depends neither on code execution nor on the position of the node inside the UAST.
An `Import` can either:
- Register all exported symbols in the target scope (`All == true`).
- Register specific symbols in the target scope (`len(Names) != 0`).
- Act as a side-effect import (both the `All` and `Names` fields are not set).
@type: uast:Import
Field | Type | Description |
---|---|---|
`Path` | `uast:String` OR `uast:Identifier` OR `uast:QualifiedIdentifier` OR `uast:Alias` | Path of the module to import. A `uast:Alias` gives the imported module a local name. |
`All` | `Bool` | Import all definitions from the module into the scope. |
`Names` | `[]`(`uast:Alias` OR `uast:Identifier`) | Import specific names from the module. Can refer to a `uast:Alias` to rename imported entities. |
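
The three import forms listed above can be told apart with a short check. The function `import_kind` and its return labels are hypothetical, and the nodes are hand-written dictionaries rather than driver output.

```python
def import_kind(node: dict) -> str:
    """Classify an import node as 'all', 'names' or 'side-effect'."""
    if node.get("All"):
        return "all"
    if node.get("Names"):  # a missing or empty list means no specific names
        return "names"
    return "side-effect"


def ident(name: str) -> dict:
    return {"@type": "uast:Identifier", "Name": name}


import_all = {"@type": "uast:Import", "Path": ident("os"), "All": True}
import_names = {
    "@type": "uast:Import",
    "Path": ident("os"),
    "Names": [ident("path"), ident("sep")],
}
import_side_effect = {"@type": "uast:Import", "Path": ident("os")}
```

Checking `All` first is a design choice of this sketch; the text above does not say which field wins if both are set.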
A runtime import has the same structure as an import declaration, but has slightly different semantics. A runtime import may appear anywhere in the code, so it may be affected by code execution.
@type: uast:RuntimeImport
Inherits: uast:Import
A runtime re-import has the same semantics as a runtime import, but it re-executes the initialization code when the same package is imported a second time.
@type: uast:RuntimeReImport
Inherits: uast:RuntimeImport
Generic grouping node containing other nodes, used as a common ancestor that more specific types inherit from.
@type: uast:Group
Field | Type | Description |
---|---|---|
`Nodes` | `[]Node` | Grouped nodes. |
Group containing the nodes for a function definition.
@type: uast:FunctionGroup
Inherits: uast:Group
Node representing a function definition. It is usually placed inside a `uast:Alias`
node holding the function's name, which in turn is placed inside a `uast:FunctionGroup`.
@type: uast:Function
Field | Type | Description |
---|---|---|
`Type` | `uast:FunctionType` | Function type definition, including formal arguments and return type. |
`Body` | `uast:Block` | Body of the function definition. |
Node representing the type signature of a function definition.
@type: uast:FunctionType
Field | Type | Description |
---|---|---|
`Arguments` | `[]uast:Argument` | Formal arguments. |
`Returns` | `[]uast:Argument` | Return type(s). |
Argument or return value, usually for a `uast:FunctionType`.
@type: uast:Argument
Field | Type | Description |
---|---|---|
`Name` | `uast:Identifier` | Argument name. |
`Type` | `Any` | Type specification (`int`, `string`, etc). |
`Init` | `Any` | Default value, if given. |
`Variadic` | `Bool` | True if it takes a variable number of arguments in a list-like format. |
`MapVariadic` | `Bool` | True if it takes a variable number of arguments in a map-like format. |
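
The nesting of the function-related types can be sketched for a Go-like function `func add(a, b int) int`. The structure below is hand-written for illustration and is not real driver output. Representing the `int` type as a `uast:Identifier`, leaving the return value unnamed with `Name` set to `None`, and the helper names `ident`, `argument` and `function_names` are all assumptions.

```python
def ident(name: str) -> dict:
    return {"@type": "uast:Identifier", "Name": name}


def argument(name, type_node=None, init=None, variadic=False, map_variadic=False) -> dict:
    """Build a uast:Argument-shaped node; name may be None for unnamed returns."""
    return {
        "@type": "uast:Argument",
        "Name": ident(name) if name is not None else None,
        "Type": type_node,
        "Init": init,
        "Variadic": variadic,
        "MapVariadic": map_variadic,
    }


# FunctionGroup -> Alias (holds the name) -> Function -> FunctionType / Block
fn_group = {
    "@type": "uast:FunctionGroup",
    "Nodes": [{
        "@type": "uast:Alias",
        "Name": ident("add"),
        "Node": {
            "@type": "uast:Function",
            "Type": {
                "@type": "uast:FunctionType",
                "Arguments": [argument("a", ident("int")), argument("b", ident("int"))],
                "Returns": [argument(None, ident("int"))],
            },
            "Body": {"@type": "uast:Block", "Statements": []},
        },
    }],
}


def function_names(group: dict) -> list:
    """Collect the names of functions wrapped in aliases inside a group."""
    return [
        node["Name"]["Name"]
        for node in group["Nodes"]
        if node.get("@type") == "uast:Alias"
        and node["Node"].get("@type") == "uast:Function"
    ]
```

Since the function's name lives on the enclosing `uast:Alias`, `function_names` reads it from there and not from the `uast:Function` node itself.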