SEP 058 -- Aliases #124

Open
jakebeal opened this issue Apr 23, 2023 · 18 comments

@jakebeal
Contributor

Genetic parts often have more than one identifier, either for historical reasons or because there are "nickname" versions of an identifier. This SEP proposes a means of tracking the relationship between the canonical name for a part (or other object) and a set of aliases that can be used to redirect to it from other identities.

Full draft in: https://github.com/SynBioDex/SEPs/blob/master/sep_058.md

@JMante1

JMante1 commented Apr 24, 2023

Are aliases meant to be only for SBOL objects? I wonder whether aliases and instanceOf should be integrated into a single proposal. That way the same object in GenBank, Addgene, SBOL, etc. could be addressed and looked up in the same manner.

@jakebeal
Contributor Author

I believe that importing materials from outside of SBOL needs to be handled separately, since the means of accessing those objects is necessarily different, and they need to be converted when imported in any case. I would also be reluctant to mess with the semantics of instanceOf, which are well established and used by a number of tools.

@JMante1

JMante1 commented Apr 24, 2023

I take your point; however, I would like to be clearer on instanceOf versus alias. In the current proposal, Alias is for any similar SBOL objects (there may be different triples, but the sequence must be the same?), whilst instanceOf is for any similar (sequence must be the same) external SBOL objects? Is the purpose of alias mainly to allow merging in future, to act explicitly as alternative names (in a more ontological sense), or to indicate that there may be more information about the sequence elsewhere?

@jakebeal
Contributor Author

The two are very different: the instanceOf property is used to indicate that an SBOL Component is being used as a SubComponent of a larger design.

Perhaps you are thinking of ExternallyDefined, which is a type of Feature that can be used for including things that are not intended for conversion into SBOL, such as a small molecule defined by a PubChem entry. This is an alternative approach to the SEP 054 approach to imports, but both are orthogonal to the question of aliases.

@JMante1

JMante1 commented Apr 24, 2023

You are right, I was thinking of ExternallyDefined (Monday morning!). I am not sure that ExternallyDefined is orthogonal to alias, though. Aren't both ways of referring to additional instances of an object (one in SBOL and the other outside)? For the question "are there alternative names/versions of componentA?", wouldn't you want to know that it is also an iGEM component, a GenBank component, and that there is an alternative SBOL version?

@jakebeal
Contributor Author

Here is the distinction that I see:

  • If we convert an object into SBOL, the standard practice is to use prov:wasDerivedFrom and prov:wasGeneratedBy to show the relationship between the SBOL object and the external object. The two are NOT identical, and may legitimately contain fundamentally different information, but describe the same "real world" concept.
  • An Alias is a relationship between two SBOL objects that indicates that not only is the "concept" the same, but that their information would need to be kept in sync, so we are avoiding forks by designating one as the "primary" name and keeping the information only in that one place.
  • ExternallyDefined says "This is not SBOL, and should not be expected to be"
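
To make the three cases above concrete, here is a minimal sketch that writes each relationship as a plain (subject, predicate, object) tuple standing in for an RDF triple. All of the `example.org` identities are made up for illustration, and the `referent` predicate IRI is an assumption rather than something taken from the SEP draft, which defines the actual property.

```python
# Plain (subject, predicate, object) tuples standing in for RDF triples.
PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"
SBOL_DEFINITION = "http://sbols.org/v3#definition"
# Assumption: this predicate IRI is illustrative, not taken from the SEP.
ALIAS_REFERENT = "http://sbols.org/v3#referent"

TRIPLES = [
    # 1. Converted object: a separate SBOL description of an external record.
    #    The two may hold different information about the same concept.
    ("https://example.org/sbol/partA",
     PROV_WAS_DERIVED_FROM,
     "https://example.org/genbank/partA.gb"),
    # 2. Alias: a second SBOL identity that only points at the primary one,
    #    so the information lives in exactly one place.
    ("https://example.org/sbol/partA_nickname",
     ALIAS_REFERENT,
     "https://example.org/sbol/partA"),
    # 3. ExternallyDefined: a feature that points outside SBOL and is not
    #    expected to be converted.
    ("https://example.org/sbol/design1/inducer",
     SBOL_DEFINITION,
     "https://example.org/pubchem/some-small-molecule"),
]


def is_alias(identity, triples):
    """Return True if `identity` only redirects to another SBOL object."""
    return any(s == identity and p == ALIAS_REFERENT for s, p, _ in triples)
```

In this toy model only the second subject is an alias; the converted object and the ExternallyDefined feature each keep their own description.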

@jakebeal
Contributor Author

jakebeal commented May 1, 2023

Given that this proposal had an overall good reception in discussions at HARMONY last week, and that there have been no further concerns raised in the past five days, I would like to move this forward for a vote. If there are any additional concerns, please raise them now.

@Gonza10V
Contributor

Gonza10V commented May 3, 2023

With "resolve aliases" implemented, you can always get rid of any Alias after reading a Document, so is Alias mainly a curation tool, or does it have uses in further development as well? When creating a new document, is there a case where it is better to add an Alias instead of the canonical Component?
In "Example: Multiple identities", the draft says "Each of the duplicates can then be replaced by an Alias with the same identity and a referent of https://github.com/iGEM-Vectors/pSB1C5". Why not just replace each duplicate with the canonical copy, if you need to know the referent anyway? Why should it retain its identity? What is the advantage of using an Alias?

@jakebeal
Contributor Author

jakebeal commented May 3, 2023

@Gonza10V There are two key cases where it is advantageous to use an alias:

  1. As a forwarding address when a TopLevel changes identity. If I build a package and other people build dependencies on my package, I want to be able to change the identity of the objects in my package, but I don't have control over the packages that depend on me. An Alias can be used to deprecate an old identity without breaking dependencies.
  2. To allow the use of alternative "nickname" identities in references. For example, if I put together an Excel package description that refers to E0040, then all that Excel-to-SBOL needs to know is that E0040 is the identity of a TopLevel object. Actually resolving it, remapping from E0040 to BBa_E0040, and checking that it's the type of Component that is needed are all tasks that are better suited to a later stage in the workflow.
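
Both cases can be illustrated with a rough sketch of what a "resolve aliases" step might look like. This is not the SEP's definition or any library's API: the `Component` and `Alias` classes, the `resolve` helper, and the `old_E0040` identity are hypothetical stand-ins, and following chains of aliases (with a cycle guard) is a choice made for this sketch, since the thread does not say whether chained aliases are permitted.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Stand-in for a canonical TopLevel object."""
    identity: str


@dataclass
class Alias:
    """Stand-in for the proposed Alias: an identity plus a referent."""
    identity: str
    referent: str  # identity of the object this alias redirects to


def resolve(identity, objects):
    """Follow Alias objects until a non-alias object is reached."""
    seen = set()
    while True:
        if identity in seen:
            raise ValueError(f"alias cycle detected at {identity!r}")
        seen.add(identity)
        obj = objects[identity]  # raises KeyError for an unknown identity
        if not isinstance(obj, Alias):
            return obj
        identity = obj.referent


objects = {
    obj.identity: obj
    for obj in [
        Component("BBa_E0040"),
        # Case 2: a "nickname" that an upstream tool can reference directly.
        Alias("E0040", referent="BBa_E0040"),
        # Case 1: a deprecated identity kept as a forwarding address
        # (hypothetical name, used only for illustration).
        Alias("old_E0040", referent="E0040"),
    ]
}

canonical = resolve("old_E0040", objects)
```

An upstream tool only needs the alias identity to exist in `objects`; mapping it to the canonical object and checking its type can happen later in the workflow.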

@Gonza10V
Contributor

Gonza10V commented May 3, 2023

@jakebeal Per discussion in the SBOL Editors meeting, there are some concerns about implementation, as it would create more work for library maintainers like @tcmitchell and @goksel. The libraries still don't cover 3.1, and we would then have a 3.2 specification. Does Alias need to be implemented in the specification, or can it be implemented as a utility (probably in sbol-utilities)? If it can be implemented as a utility, does it still need to be a class, or can it be a function? Can you use "resolve aliases" to avoid validation errors, or is there another workaround to avoid them?

@jakebeal
Contributor Author

jakebeal commented May 3, 2023

The Alias functionality definitely needs to be implemented as a class, so that it can be included in SBOL data files. You cannot consistently handle either of the use cases I listed above with a utility or function.

We could, of course, implement Alias outside of the specification as an extension class, and will likely need to do so if it can't be included in the specification. However, that path will also involve bending or breaking validation rules, which will cause its own problems for other libraries.

@cjmyers
Contributor

cjmyers commented May 4, 2023

I agree with @jakebeal that this must be implemented in SBOL as a class. Validation rules are going to need to be updated.

Also, in SBOL, the standard practice is the specification leads the implementation. We do not typically implement things unless they are in the specification, so if something is a good idea, we add it to the specification first, and the libraries catch up.

@PrashantVaidyanathan
Contributor

I'm going to preface this by saying that having an Alias as a class is great and will be very useful for SBOL in general.

That being said, I think we will need to set expectations as to the expected timeline for incorporating these changes in the libraries and tools (and this is more of a generic comment and not targeted at this specific SEP - although it is highly relevant here). If we want to vote this in now because it is a good idea but expect this to be incorporated at a later date (let's say a year from now), that's great.

If the expectation is that these must be incorporated in the libraries immediately, I think that will be difficult unless we find interested members of the community who would like to contribute to the libraries and/or the library developers have the bandwidth to update their respective libraries. I'm currently working on trying to increase our community engagement - but this is going to take some time.

@jakebeal
Contributor Author

I'm fine with setting a schedule for an expected adoption time. I just want this to be an intentional choice, rather than having this sit in limbo. Among other things, that will allow libraries to move at different paces in implementation.

@Gonza10V Gonza10V self-assigned this May 16, 2023
@PrashantVaidyanathan
Contributor

That's perfectly reasonable. We are going to send this out for voting soon. We will schedule this immediately after the editor voting (you'll see an email for this very soon).

@jakebeal
Contributor Author

@PrashantVaidyanathan : I don't believe I've seen a schedule for the vote yet. Have I missed something?

@PrashantVaidyanathan
Contributor

We will be scheduling this vote next week - so it's in the pipeline. We were waiting for the SBOL editor voting to end first to avoid confusion with the voting process.

@Gonza10V
Contributor

Now in the queue for voting, expected to start on Friday 28.
