NexSON
NexSON is a translation of NeXML to JSON using the BadgerFish conventions. Each NexSON file represents a Phylografter study, though a file may contain either exactly one tree or all of the trees included in the study. The set of elements used in these files is (currently) limited to nexml, otus, otu, trees, tree, node, and edge; note that there is currently no support for data matrices. There is also an associated metadata vocabulary, currently used for annotating elements of type nexml, otu, tree, and node.
As of October 2013, discussion of the syntax of validation-related annotations is taking place on the annotations page. When that discussion stabilizes, the content there will be migrated to this page.
See our page on our XML to JSON syntactic mapping.
OpenTree's NexSON metadata vocabulary uses the URI prefix http://purl.org/opentree/nexson#, which is abbreviated to ot:. The vocabulary consists of a number of predicates, and a set of terms specifying a choice of values for a particular predicate. The predicates and the types of their values are listed in Table I.
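As a minimal sketch of the prefix abbreviation described above, the following Python helpers convert between the short `ot:` form and the full URI. The function names `expand_ot` and `abbreviate_ot` are hypothetical and are not part of any OpenTree tool.

```python
# Namespace and abbreviation as stated in the text above.
OT_NAMESPACE = "http://purl.org/opentree/nexson#"
OT_PREFIX = "ot:"


def expand_ot(name: str) -> str:
    """Expand an ot:-prefixed name to its full URI; return other names unchanged."""
    if name.startswith(OT_PREFIX):
        return OT_NAMESPACE + name[len(OT_PREFIX):]
    return name


def abbreviate_ot(uri: str) -> str:
    """Abbreviate a URI in the OpenTree namespace to ot: form; return others unchanged."""
    if uri.startswith(OT_NAMESPACE):
        return OT_PREFIX + uri[len(OT_NAMESPACE):]
    return uri
```

For example, `expand_ot("ot:studyId")` yields `http://purl.org/opentree/nexson#studyId`, while a name with a different prefix such as `xhtml:license` is returned untouched.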
### Table I. Predicate Vocabulary
Currently only the ot:tag and the ot:candidateTreeForSynthesis meta predicates can occur more than once in an element.
This information affects the syntax used in the new (and not yet implemented) mapping between XML and JSON.
We are not using @label in the otu (see ot:originalLabel, ot:ottTaxonName, and ot:altLabel).
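The multiplicity rule above could be enforced roughly as follows. This sketch assumes the meta annotations of one element have already been extracted as `(predicate, value)` pairs; the actual JSON layout depends on the syntactic mapping version and is deliberately not modelled here. The name `collect_meta` is hypothetical.

```python
# The only predicates that, per the note above, may occur more than once in an element.
REPEATABLE = {"ot:tag", "ot:candidateTreeForSynthesis"}


def collect_meta(pairs):
    """Group (predicate, value) pairs into a dict.

    Repeatable predicates map to a list of values, in input order.
    Any other predicate maps to a single value and raises ValueError if repeated.
    """
    result = {}
    for predicate, value in pairs:
        if predicate in REPEATABLE:
            result.setdefault(predicate, []).append(value)
        elif predicate in result:
            raise ValueError(f"{predicate} occurs more than once")
        else:
            result[predicate] = value
    return result
```

Storing repeatable predicates as lists even when they occur once keeps the shape of the result predictable for callers.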
| Element | Name | Type | Description |
|---|---|---|---|
| Nexml (study) | ot:studyPublicationReference | (long) string | A reference (bibliographic citation string) to the publication describing the associated study |
| | ot:studyPublication | URI | URI (DOI preferred, in the form http://dx.doi.org/10....) identifying the publication describing the associated study |
| | ot:studyYear | integer | Year study was published |
| | ot:curatorName | string | Name of the person who curated this study in opentree |
| | ot:dataDeposit | URI | The data publication in which the data in this nexml object may be found, e.g. a link to Treebase or a DOI-URI pointing to Dryad |
| | ot:studyId | string | Short identifier used by phylografter for the study. |
| | ot:focalClade | integer | OTT id of root of clade specified as focal in the study |
| | ot:focalCladeOTTTaxonName | string | label (name) assigned for this node, if any (else empty string) |
| | ot:tag | string | tag attached to study; may indicate deprecation; may occur multiple times |
| | ot:notIntendedForSynthesis | boolean | default = false; curator can choose this to relax validation (allow un-rooted trees, unmapped taxa, etc) |
| | ot:candidateTreeForSynthesis | string | id of the tree marked as a candidate for synthesis; may occur multiple times? |
| | ot:taxonLinkPrefixes | object | key/value pairs mapping the keys of the ot:taxonLink objects to the prefix of the URL needed to convert the identifier to a URL |
| | xhtml:license | object | Standard element, used to add CC0 waiver or other license information, e.g. {'@href': 'URL'} |
| | ot:comment | string | curator provided comment for study |
| | ot:lastmodified | datetime (string) | phylografter's last modified time; should be converted to an annotation and deprecated |
| | ot:uploaded | datetime (string) | time when study was first loaded/created in phylografter - convert to annotation? |
| otu | ot:ottId (was ottolid then ottid) | integer | taxon id from OTT |
| | ot:ottTaxonName (was in node) | string | the name of the ott taxon |
| | ot:originalLabel | string | label (name) assigned the otu in uploaded tree |
| | ot:treebaseOTUId | string | Treebase id for otu (for studies from treebase) |
| | ot:taxonLink | object | keys are tags for different taxonomic services (such as "@ubio"); the values are the identifiers for this taxon in that taxonomic service |
| | ot:altLabel | string | labile field holding an edited label that is not equal to ot:originalLabel; presumably this field will be deleted if the mapping succeeds. It is used for the relatively rare case in which a curator improves an otu label, but not enough for it to map successfully. |
| tree | ot:branchLengthMode | choice | see Table II |
| | ot:nodeLabelMode | choice | see Table II |
| | ot:inGroupClade | string | id of the node tagged as the ingroup root (note that this node may not have an assigned otu) |
| | ot:nearestTaxonMRCAName | string | name of a taxon in the OT taxonomy (calculated in curation app or elsewhere) closest to the MRCA of all mapped tip OTUs in this tree; this should be checked against the stated focal clade for the entire study |
| | ot:nearestTaxonMRCAOttId | string | OT taxonomy id of the taxon calculated in ot:nearestTaxonMRCAName above |
| | ot:specifiedRoot | string | id of the node tagged as root of the tree. This node should be the same as the node bearing the @root identifier. Neither this nor the @root tag necessarily indicates that the rooting is biologically meaningful (see 'ot:unrootedTree' below). Note: phylografter does not write a value for this, but will in the future |
| | ot:unrootedTree | boolean | since Nexson trees are necessarily rooted, this flag determines whether the root is meaningful (biologically correct) or arbitrary |
| | ot:tag | string | tag attached to the tree; may indicate deprecation or inference method; may occur multiple times |
| | ot:branchLengthTimeUnit | string | currently phylografter only writes "Myr", which reflects its internal default. Not meaningful if ot:branchLengthMode is not ot:time |
| | ot:curatedType | string | curator provided type of tree; should specify inference method as text |
| | ot:tbTreeId | string | if the tree was imported from Treebase, this is the id for the tree in treebase |
| | ot:contributor | string | name of person contributing the tree |
| | ot:uploaded | datetime (string) | time tree was uploaded (might not be the same as the study upload time) |
| | ot:branchLengthsComment | string | comment describing values used for branch lengths |
| | ot:cladeLabelsComment | string | comment describing the labels on internal nodes |
| | ot:authorContributed | boolean | true if tree contributed by the study author |
| | ot:comment | string | comment pertaining to whole tree |
| node | ot:isLeaf | boolean | a boolean set to true on terminal otu nodes. This is redundant with having no edges that refer to the node as a source. It is included to enable fast checking of whether a node is a leaf. |
| | ot:isTaxonExemplar | boolean | used only in cases where multiple nodes are mapped to a single Open Tree taxon (via their assigned OTUs). This will be true for the chosen exemplar, false for all others. |
| | @nexml2json | string | specifies the syntactic mapping. Currently supported values are "1.2.1", "1.0.0", and "0.0.0"; if missing, we assume you are using "0.0.0" |
| | ot:age | double | age of a node; presumably in the same units as branch lengths; may be invalid if tree is rerooted |
| | ot:ageMin | double | minimum age of a node; presumably same (temporal) units as branch lengths; if fossil derived, may not change after rerooting |
| | ot:ageMax | double | maximum age of a node; see ot:ageMin |
| edge | ot:bootstrapSupport | double | support for an edge as a bootstrap percentage |
| | ot:posteriorSupport | double | support for an edge as a posterior probability |
| | ot:otherSupport | double | another support value; see ot:otherSupportType |
| | ot:otherSupportType | string | specifies what the value in ot:otherSupport measures |
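Table I notes that ot:isLeaf is redundant with a node never appearing as the source of an edge. A minimal sketch of that equivalence, assuming edges are available as `(source, target)` pairs of node ids and the ot:isLeaf flags as a plain dict (both are simplifications for illustration, not the NexSON layout; the function names are hypothetical):

```python
def find_leaves(node_ids, edges):
    """Return the set of node ids that are never the source of an edge."""
    sources = {source for source, _target in edges}
    return {node_id for node_id in node_ids if node_id not in sources}


def leaf_flag_mismatches(is_leaf, edges):
    """Return sorted node ids whose ot:isLeaf flag disagrees with the edge list.

    is_leaf maps node id -> boolean value of ot:isLeaf (False if absent).
    """
    leaves = find_leaves(is_leaf.keys(), edges)
    return sorted(n for n, flag in is_leaf.items() if flag != (n in leaves))
```

A check like `leaf_flag_mismatches` is one way a validator might confirm that the cached flag and the edge structure agree.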
### Table II - Object (value) vocabulary
| Element/Predicate | Name | "meaning" |
|---|---|---|
| tree / ot:branchLengthMode | ot:substitutionCount | branch lengths represent number of substitutions |
| | ot:changesCount | branch lengths represent number of changes |
| | ot:time (was ot:years) | branch lengths represent time. Units specified with ot:branchLengthTimeUnit |
| | ot:bootstrapValues | branch lengths represent bootstrap values |
| | ot:posteriorSupport | branch lengths represent posterior support values |
| | ot:other | branch lengths represent defined values but are not among the known types; refer to ot:branchLengthDescription |
| | ot:undefined | branch lengths represent undefined values |
| tree / ot:nodeLabelMode | ot:taxonNames | node labels represent taxon names |
| | ot:bootstrapValues | node labels represent bootstrap values |
| | ot:posteriorSupport | node labels represent posterior support |
| | ot:other | node labels represent defined values but are not among the known types; refer to ot:nodeLabelDescription |
| | ot:undefined | node labels represent undefined values |
| | ot:rootNodeId | in v1.2 only. Id of the root. Says nothing about intent, just makes it faster to build the tree |
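The choice values in Table II lend themselves to a simple membership check. The sketch below assumes a tree's meta values are available as a flat dict from predicate to value, which is a simplification for illustration; `check_tree_modes` is a hypothetical name.

```python
# Allowed values copied from Table II.
BRANCH_LENGTH_MODES = {
    "ot:substitutionCount", "ot:changesCount", "ot:time",
    "ot:bootstrapValues", "ot:posteriorSupport", "ot:other", "ot:undefined",
}
NODE_LABEL_MODES = {
    "ot:taxonNames", "ot:bootstrapValues", "ot:posteriorSupport",
    "ot:other", "ot:undefined",
}


def check_tree_modes(tree_meta):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    branch_mode = tree_meta.get("ot:branchLengthMode")
    if branch_mode is not None and branch_mode not in BRANCH_LENGTH_MODES:
        problems.append(f"unknown ot:branchLengthMode: {branch_mode}")
    label_mode = tree_meta.get("ot:nodeLabelMode")
    if label_mode is not None and label_mode not in NODE_LABEL_MODES:
        problems.append(f"unknown ot:nodeLabelMode: {label_mode}")
    # Per Table I, the time unit only has meaning when branch lengths represent time.
    if "ot:branchLengthTimeUnit" in tree_meta and branch_mode != "ot:time":
        problems.append(
            "ot:branchLengthTimeUnit is not meaningful unless "
            "ot:branchLengthMode is ot:time"
        )
    return problems
```

Note that the superseded value ot:years is reported as unknown here; whether older files should be accepted is a policy decision this sketch does not make.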
### Table III - Proposed (not yet implemented) Predicates
| Element | Name | Priority | Type | Description |
|---|---|---|---|---|
| nexml | ot:studyLabel | | String | |
| | ot:studyUploaded | Medium | String (datetime) | Time stamp for when study was initially uploaded |
| | ot:studyModified | Medium | String (datetime) | Time stamp for when study was last modified |
| | ot:studyLastEditor | Medium | String | Username of last user to modify |
| tree | ot:nodeLabelMode (was ot:cladeLabelMmode) | Medium | choice | see Table II |
| | ot:nodeLabelDescription (was ot:cladeLabelsComment) | | String | |
| | ot:branchLengthDescription (was ot:branchLengthComment) | | String | |
| | ot:branchLengthTimeUnit | | String (can be choice) | the unit of time used for the branch lengths. Has no meaning if the value of ot:branchLengthMode is not ot:time |
| | ot:inferenceMethod | | String (can be choice) | the type of inference method used to infer this tree. E.g. parsimony, likelihood, bayesian, distance, etc. |
| | ot:authorContributed | High | choice | Many trees indicate this in a type field. This is boolean. |
| | ot:treebaseTreeId | High | | |
| | ot:comment | Medium | String | |
| | ot:treeModified | Medium | String (datetime) | Time stamp for when tree was modified (not necessarily the same time as the study) |
| | ot:treeLastEdited | | | |
| | ot:curatorType | Medium | String | In many cases this will be the inference method, but may be other free text |
| node | ot:cladeLabel | | String | pertains to clade rooted at node; see ot:clade_label_mode |
| | ot:isIngroup | | Boolean | a boolean set to true on the most inclusive ingroup node (ingroup root) |
| | ot:parent | Low | | |
| | ot:age | Medium | Number | assigned age |
| | ot:ageMin | Low | Number | lower bound of assigned age |
| | ot:ageMax | Low | Number | upper bound of assigned age |
| | ot:bootstrapSupport | Medium | Number | (this appears to be redundant with the branch length mode) |
| | ot:posteriorSupport | Medium | Number | (this appears to be redundant with the branch length mode) |
| | ot:otherSupport | Low | Number | (this appears to be redundant with the branch length mode) |
| | ot:otherSupportType | Low | String | specifies alternative support statistic (this appears to be redundant with the branch length mode) |
| | ot:originalRoot | High | Boolean | The first time a tree is rerooted, it should note the original rooting position by flagging the node that was the original root of the tree. |
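The intended behaviour of the proposed ot:originalRoot predicate can be illustrated with a short sketch. Since the predicate is not yet implemented, everything here is an assumption: node annotations are modelled as a dict from node id to a dict of predicate values, and `mark_original_root` is a hypothetical helper meant to be called just before a tree is rerooted.

```python
def mark_original_root(node_meta, current_root_id):
    """Flag the current root as ot:originalRoot unless some node already carries the flag.

    node_meta maps node id -> dict of predicate values and is modified in place.
    Calling this before every reroot records only the very first rooting position.
    """
    already_marked = any(meta.get("ot:originalRoot") for meta in node_meta.values())
    if not already_marked:
        node_meta.setdefault(current_root_id, {})["ot:originalRoot"] = True
    return node_meta
```

Because later calls leave an existing flag alone, a tree that is rerooted several times still points back to the rooting it was uploaded with.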