Syntax Definitions

Garry Pettet edited this page Oct 19, 2023 · 21 revisions

A syntax definition is an XML file that describes to SyntaxArea how to highlight a syntax (language).

Examples of definition files can be found in the definitions folder and their effects can be seen within the demo application.

Structure

Below is a simplified example of ObjoScript's language definition. The full version can be found in definitions/Objo.xml. Note that regular expressions must be escaped for use within XML. This means that symbols such as > must be represented as &gt; (likewise, < becomes &lt; and & becomes &amp;). See this excellent StackOverflow answer for a complete list.
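As a small illustration of the escaping rule, here is a sketch of a context whose regular expression contains characters that are special in XML. The "operator" context name is hypothetical and is not taken from Objo.xml. It only reuses the `highlightContext` and `entryRegEx` elements from the example below.

```xml
<!-- Hypothetical context, shown only to illustrate escaping. -->
<highlightContext name="operator">
  <!-- Intended regex: <=|>=|&& -->
  <entryRegEx>&lt;=|&gt;=|&amp;&amp;</entryRegEx>
</highlightContext>
```

When the file is parsed, the entities are decoded, so the regular expression engine should see the intended pattern.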

Example

<highlightDefinition>
  <name>Objo</name> <!-- The name of the language. Mandatory -->
  
  <!-- These patterns stipulate the start and end of a block. Multiple block start and end markers can be used. -->
  <blockStartMarker indent="1">\{\s*(?:$|#)</blockStartMarker> <!-- `indent` stipulates how much to indent. If in doubt, use 1 -->
  <blockEndMarker>^\s*\}</blockEndMarker>

  <!-- Symbols are optional. These represent tokens in the source code that have a semantic meaning. 
  They are accessible within the editor and can be navigated to. This can be seen in the demo application by clicking on the 
  symbol bar beneath the editor text. This example just has one symbol but you can have zero or many. -->
  <symbols>
    <symbol type="class">
      <entryRegEx>^\s*class [^ {]*</entryRegEx>
    </symbol>
  </symbols>

  <!-- Each definition file should have at least one `contexts` element. Nested within `contexts` are `highlightContext` nodes which 
  define the regular expressions to match tokens in the editor's text that can be styled. You can stipulate the case sensitivity at the `contexts` or `highlightContext` level. -->
  <contexts caseSensitive="yes"> <!-- ObjoScript is case-sensitive -->
    <!-- Each highlight context should have a unique name. The editor will look for a style in the current theme matching
    this name in order to style the token. Themes are encouraged to define a handful of common style names. Thanks to this, you can
    define a "fallback" name to use in case the theme currently being used doesn't define a style for "name". You can see an
    example of this in the "uppercaseIdentifier" context below. -->
    <highlightContext name="comment">
      <startRegEx>#</startRegEx> <!-- everything between `startRegEx` and `endRegEx` will be a "comment" -->
      <endRegEx>[\n\r]</endRegEx>
    </highlightContext>

    <highlightContext name="uppercaseIdentifier" fallback="identifier">
    <!-- Anything matching `entryRegEx` pattern will be styled as "uppercaseIdentifier". If the current theme doesn't define
    this style then "identifier" will be used. -->
      <entryRegEx>\b[A-Z]\w*</entryRegEx>
    </highlightContext>

    <highlightContext name="intrinsicType" fallback="identifier">
    <!-- Any string listed within a `keywords` element will be matched exactly. -->
      <keywords>
        <string>Boolean</string>
        <string>Number</string>
      </keywords>
    </highlightContext>
  </contexts>
</highlightDefinition>
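The example above sets case sensitivity only on `contexts`. Below is a hedged sketch of overriding it for a single context. It assumes that `highlightContext` accepts the same `caseSensitive` attribute name as `contexts`. The "keyword" context and its keyword strings are hypothetical.

```xml
<contexts caseSensitive="yes">
  <!-- Assumption: `highlightContext` takes the same `caseSensitive` attribute as `contexts`. -->
  <highlightContext name="keyword" caseSensitive="no">
    <keywords>
      <string>begin</string>
      <string>end</string>
    </keywords>
  </highlightContext>
</contexts>
```

If the attribute name differs in your version of SyntaxArea, check the definition files in the definitions folder for the form they use.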