The MergeSchema

Steffen Holzer edited this page May 19, 2016 · 5 revisions

The merge schema

This section describes the merge schema and its elements.

The document root <merge-schema>

The <merge-schema> element is the root of any merge schema. Since it extends the <handling> element, only its root attribute is described here.

Most namespaces define a single possible root for their documents, although some namespaces define multiple possible roots. The root attribute distinguishes between these two cases. Strictly speaking, the root attribute has to be set to false whenever there is not exactly one possible root element. The attribute's default is true.

If the attribute is set to false, the merge process treats one of the children of <merge-schema> as the new merge schema root.

The <merge-schema> element contains exactly one mandatory <definition> element, an optional <default-criterion> element, and the children allowed in a <handling> element.
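As a minimal sketch of the multiple-root case, assume a namespace whose documents may start with either of two elements. The element names and the namespace URI below are made up for illustration; only the root attribute and the element names <merge-schema>, <definition> and <handling> come from the description above.

```xml
<!-- Hypothetical namespace with two possible document roots. -->
<merge-schema root="false">
  <definition namespace="http://example.com/multi-root"/>
  <!-- With root="false", one of these children is treated as the merge schema root. -->
  <handling for="root-a"/>
  <handling for="root-b"/>
</merge-schema>
```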

Defining the namespace with <definition>

The <definition> element specifies the namespace this merge schema describes. The URI of the namespace has to be set in the mandatory namespace attribute. For the optional validation of the merge result, the attributes location and type should be set as well. location specifies the location of the namespace's definition. type specifies the type of that definition. The default, and currently the only working option, is xsd.

In some cases (e.g. when using XSDs from the Spring framework) it can be desirable to create one merge schema for multiple namespaces. In that case <additional-namespace> elements can be placed under the <definition> element. The <additional-namespace> element has the same attributes as the <definition> element, although the information of the <definition> element is always considered first.
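A <definition> that uses all three attributes and covers a second namespace might look like the following sketch. The URIs and file locations are placeholders, not real schema locations.

```xml
<!-- Primary namespace; its information is considered first. -->
<definition namespace="http://example.com/main"
            location="http://example.com/main/main.xsd"
            type="xsd">
  <!-- Second namespace covered by the same merge schema. -->
  <additional-namespace namespace="http://example.com/extension"
                        location="http://example.com/extension/extension.xsd"
                        type="xsd"/>
</definition>
```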

Defining the default matching with <default-criterion>

To avoid redundancy in the merge schema, a default matching rule can be specified. The exact function of the xpath and ordered attributes is described in the <criterion> section.

The <default-criterion> is applied whenever a <handling> element does not have its own <criterion> element.
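The fallback can be illustrated with a small sketch. The element names, the namespace URI and the choice of ./@id as the default expression are assumptions made for this example.

```xml
<merge-schema for="root-element">
  <definition namespace="http://example.com"/>
  <default-criterion xpath="./@id"/>
  <!-- No own <criterion>: the default criterion (./@id) applies. -->
  <handling for="element-a"/>
  <!-- Own <criterion>: the default criterion is not used here. -->
  <handling for="element-b">
    <criterion xpath="./@name"/>
  </handling>
</merge-schema>
```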

The element merge rule <handling>

The <handling> element describes a merge rule for a specific element in the target namespace as well as for its children, attributes and textual content. The element to which the rule applies is specified in the for attribute, which must contain the local name of the element.

In most cases an element can occur either once or arbitrarily often. Even if the namespace allows an element to occur more than once, it may be desirable to limit its occurrence. To ensure that an element is unique on its axis in the merge result, the unique attribute can be set to true. If it is set to true, the algorithm exits with errors when the specified elements in base and patch cannot be merged into a single element.

The textual content (i.e. the text nodes) of an element is treated as an attribute-like property of that element. In some cases it may be useful and valid to merge textual content from base and patch. To declare the textual content of an element mergeable, the boolean flag attachable-text can be set to "true". The default is "false", which means that the text nodes cannot be merged. Note that this flag only describes whether the textual content may be merged. Whether the merge algorithm actually merges the text nodes depends on the ConflictHandlingType in use.
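The attributes described so far can be combined on a single rule. The sketch below assumes hypothetical elements named config and description in the target namespace.

```xml
<!-- At most one <config> may remain on its axis after the merge. -->
<handling for="config" unique="true">
  <!-- The text nodes of <description> are declared mergeable. -->
  <handling for="description" attachable-text="true"/>
</handling>
```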

Since it may be desirable to define multiple parallel rules for one element (e.g. depending on whether the element is the first in a container or not), the where attribute can be used to distinguish between those rules. Similar to the xpath attribute of <criterion>, the where attribute contains a valid XPath expression, although the where expression must evaluate to a boolean value. The default value of where is therefore true().
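The first-in-container case mentioned above could be sketched as follows. The element names are made up, and the sketch assumes that the where expression is evaluated with the element itself as context node, which the text does not state explicitly.

```xml
<handling for="container">
  <!-- Rule for an <item> that has no preceding sibling. -->
  <handling for="item" where="not(./preceding-sibling::*)">
    <criterion xpath="./@id"/>
  </handling>
  <!-- Rule for every other <item>. -->
  <handling for="item" where="boolean(./preceding-sibling::*)">
    <criterion xpath="./@name"/>
  </handling>
</handling>
```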

In some cases it may be necessary to redefine a merge rule for a specific element tag (e.g. an XSD allows the type of an element to be defined independently of its tag). On the other hand, it would be highly redundant to define the same <handling> element over and over again everywhere the element can occur. To handle both issues, each <handling> element has a visibility and can be overwritten. As stated on the page "The merge process", each <handling> is visible for ./parent::*/descendant::* and therefore needs to be defined only once at top level. A <handling> element overwrites another one further up the document tree if both the for and the where attributes are identical.
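A sketch of visibility and overwriting, using invented element names: the top-level rule for entry is visible throughout the schema, and the nested rule with the same for value (and the same default where) replaces it inside special-container.

```xml
<merge-schema for="the-root">
  <definition namespace="http://example.com"/>
  <!-- Defined once at top level and visible for all descendants. -->
  <handling for="entry">
    <criterion xpath="./@id"/>
  </handling>
  <handling for="special-container">
    <!-- Same for and where as the rule above, so it overwrites that rule here. -->
    <handling for="entry">
      <criterion xpath="./@name"/>
    </handling>
  </handling>
</merge-schema>
```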

The content of each <handling> element conforms to the regular expression (<criterion>*,<handling>*,<attribute>*).

Referring to other merge schemas

Namespaces allow elements to be extended, even elements from other namespaces. To handle this structural inheritance in the merge process, the algorithm loads the base namespace with the required <handling> elements. The base <handling> to be used has to be declared with a document-wide unique name in its label attribute. This label can be referenced in the extending <handling> element via the scope-ref and namespace-ref attributes: scope-ref contains the unique name given in a label attribute, and namespace-ref contains the namespace, which is used to locate the merge schema for that namespace.
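The following two sketches show how such a reference might be wired up. Both schemas, their namespace URIs, the element names and the label baseTypeRule are hypothetical. The first schema belongs to the base namespace and labels the rule to be reused:

```xml
<!-- Merge schema for the base namespace. -->
<merge-schema for="base-root">
  <definition namespace="http://example.com/base"/>
  <handling for="base-type" label="baseTypeRule">
    <criterion xpath="./@id"/>
  </handling>
</merge-schema>
```

The second schema belongs to the extending namespace and points at that label:

```xml
<!-- Merge schema for the extending namespace. -->
<merge-schema for="ext-root">
  <definition namespace="http://example.com/ext"/>
  <handling for="extended-type"
            scope-ref="baseTypeRule"
            namespace-ref="http://example.com/base"/>
</merge-schema>
```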

Defining matchings with <criterion>

To determine whether an element in the base document can be matched with an element in the patch, the algorithm evaluates the <criterion> elements of the corresponding <handling>. The use of a <criterion> is optional (see the <default-criterion> section).

The merge schema uses XPath to identify the properties that determine a match between two elements. The XPath expression defining the path to these properties (relative to the evaluated element) is stored in the xpath attribute of the <criterion>. Two elements are matched with each other if their properties specified by the xpath attribute match. If a <handling> contains more than one <criterion>, two elements match only if the properties specified by all xpath attributes match (the evaluations of the <criterion> elements are combined via AND).

If the XPath evaluation returns a set of XML elements (e.g. "./child::*"), they are compared with each other via their String representation. If the ordering of those nodes is relevant, this can be specified via the ordered attribute of <criterion>. If ordered is set to true, two elements match if and only if the XPath expression returns the same nodes in the same order. The default is false.
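Both behaviors can be sketched in one fragment. The element and attribute names are invented for illustration.

```xml
<handling for="directory">
  <handling for="person">
    <!-- Two <person> elements match only if both properties match (AND). -->
    <criterion xpath="./@first-name"/>
    <criterion xpath="./@last-name"/>
  </handling>
  <handling for="steps">
    <!-- Two <steps> elements match only if their children are the same, in the same order. -->
    <criterion xpath="./child::*" ordered="true"/>
  </handling>
</handling>
```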

Defining merge rules for attributes with <attribute>

The <attribute> element specifies the merge behavior of an attribute. The name of the attribute is set in the for attribute of <attribute>. If the attribute's value can be attached during a merge (e.g. because it is a list), this can be specified via the attachable attribute (true if the attribute's value is attachable). If the values need to be separated by some character (e.g. a semicolon), this can be specified in the separation-string attribute. The default is the empty string. Setting an <attribute> element is optional. If none is set for a given attribute, attachable="false" is used.
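A sketch of an attachable list attribute with a semicolon separator follows. The element and attribute names are made up. The separator attribute is spelled separation-string as in the text above, whereas the example at the end of this page spells it separationString, so the spelling should be checked against the schema definition.

```xml
<handling for="module">
  <!-- The values of "paths" from base and patch may be attached, separated by ";". -->
  <attribute for="paths" attachable="true" separation-string=";"/>
  <!-- No <attribute> for other attributes: attachable="false" is used for them. -->
</handling>
```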

A small example

<merge-schema for="the-root">
  <definition namespace="http://example.com">
    <additional-namespace namespace="http://example-v1.com"/>
  </definition>
  <default-criterion xpath="true()"/>
  <handling for="element-a">
    <criterion xpath="./@id"/>
    <handling for="child-a" unique="true">
      <attribute for="list" attachable="true" separationString=","/>
    </handling>
    <handling for="child-b" scope-ref="someLabel" namespace-ref="http://otherNamespace.com">
      <criterion xpath="./child::child-b" ordered="true"/>
    </handling>
  </handling>
</merge-schema>