The issues with "rooted" library subtrees in standard schema (linked to issue #52) #56

Closed
VisLab opened this issue Jan 21, 2023 · 13 comments


@VisLab
Member

VisLab commented Jan 21, 2023

The only part of the score library that overlaps significantly with the standard HED schema is the Body-part tree.

Score library excerpt

 '''Finding-property'''
 * Location-property 
        ** Brain-laterality ...
        ** Brain-region ...
        ** Body-part 
            *** Body-part-eyelid
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-face
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-arm
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-leg
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-trunk
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-visceral
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
            *** Body-part-hemi
                **** <nowiki># {takesValue, valueClass=textClass}[Free text.]</nowiki>
        ** Brain-centricity {requireChild} ...

Standard library excerpt

'''Item''' 
* Biological-item 
** Anatomical-item 
*** Body 
*** Body-part
**** Head 
***** Hair 
***** Ear 
***** Face 
****** Cheek 
****** Chin 
****** Eye 
****** Eyebrow 
****** Forehead 
****** Lip 
****** Nose
****** Mouth
****** Teeth 
**** Lower-extremity
***** Ankle
***** Calf
***** Foot ....
**** Torso ...
***** Torso-back ...
**** Upper-extremity ...

Observation:
Leg in SCORE has path Finding-property/Location-property/Body-part/Body-part-leg, while leg in the standard schema is:
Item/Biological-item/Anatomical-item/Body-part/Lower-extremity.

  1. Notice that these two synonyms for Leg are actually different. In Score the tag is specifying where, while in the standard schema it is specifying what. Even if the tags have exactly the same name (e.g., Body-part) they still are different.

  2. Notice that the Score term Leg takes a value, meaning that it expects free text, probably a clinical description of the location and other observations included in the finding. The standard-library Lower-extremity has many child tags.
    A tag cannot have both a # child and other types of children, so these can't be rooted.

  3. There is also a technical problem with searching because when you root one schema node in another, you have multiple ancestors.
    Since other things could be rooted in the trees above, you don't really have a hierarchy any more, but a graph.
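
To make points 2 and 3 concrete, here is a minimal Python sketch. It is not the HED tools API: the helper names `mixes_placeholder_and_tags` and `parents_of` and the flat list/edge representation are assumptions made purely for illustration, and the child lists are just the excerpts shown above.

```python
# Hypothetical, simplified model (not the HED tools API):
# a tag's children are a list of names, with "#" as the value placeholder.
PLACEHOLDER = "#"


def mixes_placeholder_and_tags(children):
    """Return True if the children contain both '#' and ordinary child tags."""
    has_placeholder = PLACEHOLDER in children
    has_named = any(child != PLACEHOLDER for child in children)
    return has_placeholder and has_named


def parents_of(tag, edges):
    """Return the sorted parents of `tag` in a list of (parent, child) edges."""
    return sorted({parent for parent, child in edges if child == tag})


# Point 2: children of the SCORE leg term and of the standard Lower-extremity (excerpt).
score_leg_children = ["#"]
standard_leg_children = ["Ankle", "Calf", "Foot"]
merged_children = score_leg_children + standard_leg_children

# Point 3: rooting SCORE's Body-part at the standard Body-part gives the node two parents.
edges = [
    ("Location-property", "Body-part"),  # parent in SCORE
    ("Anatomical-item", "Body-part"),    # parent in the standard schema
]

print(mixes_placeholder_and_tags(merged_children))
print(parents_of("Body-part", edges))
```

Each schema is fine on its own; the conflict only appears once the two sets of children are merged under one node, and the second parent is what turns the tree into a graph.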

Ontologies don't have this problem because they don't have any hierarchical structure at all. They just use relationship links to create a potentially very complex graph. How they search or what it means is up for grabs.

As we introduce various schema attributes that express relations, we can maintain the hierarchy
for search as follows. Eventually these attributes would also be translated into ontology relationships.

Proposal:

Introduce a schema attribute equivalentTo which would only be for a library schema with a standard schema partner. It would allow the library schema to say this term is "exactly" the same as some term in the standard schema.

Example: Body-part in score could have the attribute equivalentTo=Body-part. (Note: I'm not sure this is true in this case, but suppose it is.) The search could create a table so that when someone searched for Body-part it would recognize both Leg and Lower-extremity. You are guaranteed a search graph with no cycles.

Note: this would minimally affect validation. The effects would be downstream on search.
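
A rough Python sketch of how such a search table might work, assuming each schema is kept as its own child-to-parent tree and `equivalentTo` is stored in a separate lookup table. The `PARENTS` and `EQUIVALENT` structures and the `matches` function are hypothetical names for illustration, not part of any existing HED tool, and the equivalence of the two Body-part tags is the same supposition as in the example above.

```python
# Hypothetical child -> parent maps, excerpted from the two paths quoted above.
PARENTS = {
    "standard": {
        "Lower-extremity": "Body-part",
        "Body-part": "Anatomical-item",
        "Anatomical-item": "Biological-item",
        "Biological-item": "Item",
    },
    "score": {
        "Body-part-leg": "Body-part",
        "Body-part": "Location-property",
        "Location-property": "Finding-property",
    },
}

# Assumed equivalentTo declarations, stored symmetrically as a lookup table.
_PAIRS = [(("score", "Body-part"), ("standard", "Body-part"))]
EQUIVALENT = {}
for left, right in _PAIRS:
    EQUIVALENT[left] = right
    EQUIVALENT[right] = left


def ancestors(schema, tag):
    """Yield (schema, tag) for the tag and each ancestor inside its own schema."""
    parent_map = PARENTS[schema]
    while tag is not None:
        yield (schema, tag)
        tag = parent_map.get(tag)


def matches(query, schema, tag):
    """True if `query` is the tag, one of its ancestors, or a declared equivalent.

    This sketch only checks direct equivalents; it does not continue up the
    equivalent tag's own ancestor chain.
    """
    for node in ancestors(schema, tag):
        if node == query or EQUIVALENT.get(node) == query:
            return True
    return False


query = ("standard", "Body-part")
print(matches(query, "standard", "Lower-extremity"))
print(matches(query, "score", "Body-part-leg"))
```

Because each schema stays a tree and the equivalences live in a side table, neither hierarchy gains a second parent, which is the property the proposal relies on.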

Comments welcome, and for discussion at HWG: @smakeig @tpatpa @dorahermes @dungscout96 @IanCa @happy5214

@VisLab
Member Author

VisLab commented Jan 21, 2023

@tpatpa @dorahermes with an eye to having a standard schema partner in the next release, would you consider revising the tags in the Location-property subtree:

For example:
Body-part -> Body-part-location
Body-part-eyelid -> Eyelid-location

Etc.

@smakeig
Member

smakeig commented Jan 22, 2023

As the use of /# in SCORE as 'a place for clinical notes' is not really in the spirit of HED, perhaps we could differentiate /# = [coded attribute, either numeric or single-word defined term] vs. /## = "text note" - cf. my unease about HED NOT having any defined 'Comment_start' character...

@tpatpa
Collaborator

tpatpa commented Jan 23, 2023

Thank you for clarifying the differences, specifically, the effects on search.

Observation: Leg in SCORE has path Finding-property/Location-property/Body-part/Body-part-leg, while leg in the standard schema is: Item/Biological-item/Anatomical-item/Body-part/Lower-extremity.

  1. Notice that these two synonyms for Leg are actually different. In Score the tag is specifying where, while in the standard schema it is specifying what. Even if the tags have exactly the same name (e.g., Body-part) they still are different.
  2. Notice that the Score term Leg takes a value, meaning that it expects free text, probably a clinical description of the location and other observations included in the finding. The standard-library Lower-extremity has many child tags.
    A tag cannot have both a # child and other types of children, so these can't be rooted.
  3. There is also a technical problem with searching because when you root one schema node in another, you have multiple ancestors.
    Since other things could be rooted in the trees above, you don't really have a hierarchy any more, but a graph.

The use of the equivalentTo schema attribute sounds good!
I also think the following revision makes sense.

@tpatpa @dorahermes with an eye to having a standard schema partner in the next release, would you consider revising the tags in the Location-property subtree:

For example:
Body-part -> Body-part-location
Body-part-eyelid -> Eyelid-location

Etc.

I'll make this change so we can proceed with the schema release.

@smakeig
Member

smakeig commented Jan 23, 2023 via email

@VisLab
Member Author

VisLab commented Jan 23, 2023

So this proposal only pertains to nodes that have no descendants in the standard schema?

Can you propose a development path for when either the library schema or the standard schema wants to extend?
Will they have to extend in the same way?

What if the standard schema doesn't at all like the way the library schema had developed below it, but the library schema thinks the way it did it was great? Will the subtree become unrooted, and if so, how will that play out in consistency across annotations?

Further, suppose there is another library schema (B say) that had also extended at that root, but in a different way and there were items interspersed through the two library schemas in different ways. How will the standard schema move forward with expansion in the future? This is similar to the case when a large city wants to annex surrounding areas, but there are various tiny incorporated municipalities in the way.

@VisLab
Member Author

VisLab commented Jan 23, 2023

To clarify.... is the rooted-to just for searching and display (like equivalent-to) so that the standard schema doesn't have to worry about it at all?

@happy5214
Member

Can the multiple-inheritance issue be solved by duplicating the rooted sub-hierarchy at runtime into a separate set of "virtual" tags that are marked as equivalent to the original tags?

As for the issue of the standard schema expanding, if we're requiring library schemas to declare compatibility with a standard schema version, that's just a break in compatibility they'll have to fix in a new version. The tools would be restricted to searching for the patch versions in the same minor version series of the standard schema, rather than minor versions in the major series, due to the possibility of added tags in minor versions.
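
As a rough illustration of that restriction, a tool could compare only the major and minor components of the version numbers. The function names and the version strings below are hypothetical examples, and plain `major.minor.patch` strings are assumed.

```python
def parse_version(text):
    """Split a 'major.minor.patch' string into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def same_minor_series(declared, available):
    """True if `available` has the same major and minor version as `declared`."""
    return parse_version(declared)[:2] == parse_version(available)[:2]


# Hypothetical version numbers, for illustration only.
declared_partner = "8.1.0"
print(same_minor_series(declared_partner, "8.1.2"))  # patch release, same series
print(same_minor_series(declared_partner, "8.2.0"))  # new minor release, may add tags
```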

@smakeig
Member

smakeig commented Jan 23, 2023 via email

@smakeig
Member

smakeig commented Jan 23, 2023

Discussing with Dung today - it seems to me that perhaps there are three courses of action open to us re BASE vis SCORE libs...

  1. keep both libs as is (with the exception suggested by Kay).
  2. now or in future refactor the SCORE lib to nest in the BASE lib - keeping the SCORE lowest-level terms the docs are used to, but adapting the hierarchies to fit the sense of HED in the BASE lib - with the thought that all the neurophys' will care about are the lowest level of the hierarchy, as these are the terms they will actually see in the user interface.
  3. leave HED SCORE as is (with the Kay suggestion exception), but build another HED lib schema for 'EEG events' that reproduces most of the SCORE terminology without enforcing its clinical origin and objectives - e.g., using 'leg' to mean a body-part, 'alpha burst' to be an EEG-event, etc. This would give us a HED-compatible term lexicon, formally segregated into a (base-nesting) library schema, whose completeness would be inspired and educated by SCORE.

Scott

@dorahermes
Member

Thank you! This clarifies a lot, I think 1 would work best for now (with suggestions from Kay) and 2 or 3 are options to consider in the future.

@VisLab
Member Author

VisLab commented Jan 26, 2023

The HED Working Group discussed this at the 1/26/2023 meeting. The group concurred that Option 1 is the best for now so that things could move forward.

The only change is that the subtree Finding-property/Location-property/Body-part would now be
Finding-property/Location-property/Body-part-location.

All of the items currently under Body-part would be moved under Body-part-location and be renamed so that
Body-part-xxx would become Xxx-location.

This would remove potential conflicts if the next release of score wanted to "partner" with a standard schema.
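
The renaming rule (Body-part becomes Body-part-location, and Body-part-xxx becomes Xxx-location) can be sketched in a few lines of Python. The function name `rename_location_tag` is hypothetical, the tag list is taken from the score excerpt at the top of this issue, and the names actually committed to the schema should be checked against the release itself.

```python
PREFIX = "Body-part-"


def rename_location_tag(tag):
    """Apply the proposed rename; tags outside the Body-part subtree are unchanged."""
    if tag == "Body-part":
        return "Body-part-location"
    if tag.startswith(PREFIX):
        stem = tag[len(PREFIX):]
        # The stems in the excerpt are single lowercase words, so capitalize() suffices.
        return stem.capitalize() + "-location"
    return tag


old_tags = [
    "Body-part",
    "Body-part-eyelid",
    "Body-part-face",
    "Body-part-arm",
    "Body-part-leg",
    "Body-part-trunk",
    "Body-part-visceral",
    "Body-part-hemi",
]
for old in old_tags:
    print(old, "->", rename_location_tag(old))
```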

Once these changes are made, we would go ahead and create an official release of score 1.0.0!
@dorahermes @tpatpa

@tpatpa
Collaborator

tpatpa commented Jan 26, 2023

Great, changes were made, see latest commit.

@VisLab
Member Author

VisLab commented Feb 2, 2023

From @smakeig

... courses of action open to us re BASE vis SCORE libs...

1. keep both libs as is (with the exception suggested by Kay).

2. now or in future refactor the SCORE lib to nest in the BASE lib - keeping the SCORE lowest-level terms the docs are used to, but adapting the hierarchies to fit the sense of HED in the BASE lib - with the thought that all the neurophys' will care about are the lowest level of the hierarchy, as these are the terms they will actually see in the user interface.

3. leave HED SCORE as is (with the Kay suggestion exception), but build another HED lib schema for 'EEG events' that reproduces most of the SCORE terminology without enforcing its clinical origin and objectives - e.g., using 'leg' to mean a body-part, 'alpha burst' to be an EEG-event, etc.  This would give us a HED compatible term lexicon, formally segregated into a (base-nesting) library schema, whose completeness would be inspired and educated by SCORE.

We have decided on Option 1 with future work on Options 2 and 3.

Score 1.0.0 has been released.

A new issue (#62) about the mechanics of rooted schema has been created.

@dorahermes @tpatpa @dungscout96

VisLab closed this as completed Feb 2, 2023