Referencing Terms Within a Classification Scheme #9

Open
c-alpha opened this issue Feb 9, 2022 · 6 comments

@c-alpha

c-alpha commented Feb 9, 2022

When accessing the test server via the query API, the responses contain elements with @href values pointing at Classification Schemes.

Example:

<Genre href="urn:tva:metadata:cs:ContentCS:2011.3.6.8.18"/>
<Genre href="urn:tva:metadata:cs:ContentCS:2011.3.8.7.6"/>
<Genre href="urn:tva:metadata:cs:ContentCS:2011.3.2.14.6"/>
<Genre href="urn:tva:metadata:cs:FormatCS:2011.2.1.4"/>

It seems that the 4-digit numbers ("2011" in this example) are part of the name of the Classification Scheme, while the sequence of numbers following (e.g. "3.6.8.18") seems to be intended to designate the termId within the named Classification Scheme. This is ambiguous, and hence seems difficult to parse reliably.
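To make the ambiguity concrete, here is a minimal Python sketch that lists every way the dot-joined tail of such a reference could be divided into a scheme part and a termId. The function name `candidate_splits` is hypothetical and only serves the illustration; it is not part of any DVB-I or TV-Anytime tooling.

```python
from typing import List, Tuple


def candidate_splits(href: str) -> List[Tuple[str, str]]:
    """List every (scheme, termId) reading of a dot-joined reference."""
    # Everything up to the last colon is unambiguous; the tail is not.
    prefix, _, tail = href.rpartition(":")
    parts = tail.split(".")
    splits = []
    # Each dot in the tail is a possible boundary between scheme and term.
    for i in range(1, len(parts)):
        scheme = prefix + ":" + ".".join(parts[:i])
        term = ".".join(parts[i:])
        splits.append((scheme, term))
    return splits


for scheme, term in candidate_splits("urn:tva:metadata:cs:ContentCS:2011.3.6.8.18"):
    print(scheme, term)
```

Without outside knowledge (for example, that the version is always a single 4-digit year), a parser has no way to pick among these readings.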

The EBU historically recommends using the fragment separator # (hash, as in http URLs). The relevant EBU recommendation is in clause 3.5 of https://tech.ebu.ch/docs/tech/tech3336.pdf.

Also, more formally, and more recently, RFC 8141 (which defines the urn scheme) specifies the f-component (which is preceded by a #) of a URN as:

2.3.3. f-component

The f-component is intended to be interpreted by the client as a
specification for a location within, or region of, the named
resource. It distinguishes the constituent parts of a resource named
by a URN. For a URN that resolves to one or more locators that can
be dereferenced to a representation, or where the URN resolver
directly returns a representation of the resource, the semantics of
an f-component are defined by the media type of the representation.

Using this mechanism from the RFC that defines URNs, the above example would become this:

<Genre href="urn:tva:metadata:cs:ContentCS:2011#3.6.8.18"/>
<Genre href="urn:tva:metadata:cs:ContentCS:2011#3.8.7.6"/>
<Genre href="urn:tva:metadata:cs:ContentCS:2011#3.2.14.6"/>
<Genre href="urn:tva:metadata:cs:FormatCS:2011#2.1.4"/>
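With the # form, a client could separate scheme and term without knowing anything about the scheme's naming. A minimal Python sketch, assuming the termId is carried as the f-component; the function name `split_fragment_ref` is hypothetical:

```python
from typing import Tuple


def split_fragment_ref(href: str) -> Tuple[str, str]:
    """Split 'schemeURN#termId' into (scheme URN, termId)."""
    # partition() splits at the first '#', which starts the f-component.
    scheme, sep, term = href.partition("#")
    if not sep or not term:
        raise ValueError("no f-component in " + repr(href))
    return scheme, term


print(split_fragment_ref("urn:tva:metadata:cs:ContentCS:2011#3.6.8.18"))
```

The split no longer depends on how many dots or colons the scheme name itself contains.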

Caveat: RFC 8141 extends URN syntax to explicitly allow the characters /, ?, and #, which were reserved for future use by RFC 2141 (the original URN specification, which has been obsoleted by RFC 8141 since April 2017). Depending on which of the two RFCs is referenced in our DVB spec, it may seem more or less appropriate to use the "modern" way with #f-component. Or there may be an incentive to update the normative reference in the DVB spec to point to RFC 8141... 😉

@paulhiggs
Collaborator

paulhiggs commented Feb 9, 2022

DVB-I has used a colon (:) to separate the classification scheme URI from the classification scheme term. Shifting to a "#" at this time would be difficult, but at least it should be a colon after the CS URI.

Note also that TV Anytime (ETSI TS 102 822-3-1 v1.11.2) says in Annex A.1 "Examples of the use of the different forms of pointers to classification schemes and associated aliases are provided in clause 7.4.4.5.2 of the ISO/IEC 15938-5 [2]. It is to be noted that in the case of URNs, the separator to be used is the ":", while it is the "#" in the case of URLs."
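As a rough illustration of the colon convention, the following Python sketch splits a reference at its last colon. It assumes that the termId itself never contains a colon, and the function name `split_colon_ref` is hypothetical. The colon-separated href in the example is the form this comment describes, not output observed from the test server.

```python
from typing import Tuple


def split_colon_ref(href: str) -> Tuple[str, str]:
    """Split 'schemeURN:termId' at the last colon (assumes no ':' in termId)."""
    scheme, sep, term = href.rpartition(":")
    if not sep or not scheme or not term:
        raise ValueError("no colon separator in " + repr(href))
    return scheme, term


# Colon after the CS URI: the year stays with the scheme.
print(split_colon_ref("urn:tva:metadata:cs:ContentCS:2011:3.6.8.18"))

# Dot after the year (as in the reported responses): the same rule
# silently moves the year into the term instead of raising an error.
print(split_colon_ref("urn:tva:metadata:cs:ContentCS:2011.3.6.8.18"))
```

The second call shows why the dot-joined form is a problem in practice: a client applying the colon rule gets a plausible-looking but wrong scheme and term.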

@c-alpha
Author

c-alpha commented Feb 10, 2022

> DVB-I has used a colon (:) to separate the classification scheme URI from the classification scheme term. Shifting to a "#" at this time would be difficult, but at least it should be a colon after the CS URI.

Fair enough.

Do you think there should be a Bugzilla against the spec, too, so a potential, future switch could/would be discussed?

> Note also that TV Anytime (ETSI TS 102 822-3-1 v1.11.2) says in Annex A.1 "Examples of the use of the different forms of pointers to classification schemes and associated aliases are provided in clause 7.4.4.5.2 of the ISO/IEC 15938-5 [2]. It is to be noted that in the case of URNs, the separator to be used is the ":", while it is the "#" in the case of URLs."

I checked the 2016 version of TS 102 822-3-1 (v1.9.2), and it already contains the text you quote. That is, this TV-A spec text predates RFC 8141 (which was published in 2017). Given the history of TV-A, I would conclude that, IMHO, one cannot infer from the text you quote that TV-A would not have adopted the # from RFC 8141, had they been aware of it. Just my two cents anyway.

@paulhiggs
Collaborator

> Do you think there should be a Bugzilla against the spec, too, so a potential, future switch could/would be discussed?

Yes, that is the only way for it to be discussed and have a spec update occur.

> I checked the 2016 version of TS 102 822-3-1 (v1.9.2), and that has the text you quote already. I.e. this TV-A spec text predates RFC 8141 (which was published in 2017). Given the history of TV-A, I would conclude that IMHO one can not infer from the text you quote, that TV-A would not have adopted the # from RFC 8141, had they been aware of it. Just my two cents anyway.

I will open a separate Bugzilla issue against TV Anytime to consider RFC 8141.

@c-alpha
Author

c-alpha commented Feb 11, 2022

> [...]
> I will open a separate Bugzilla issue against TV Anytime to consider RFC8141

It's all taken care of then; marvellous, thanks!

Happy for you to close this one (#9) whenever you deem it solved.

@paulhiggs
Collaborator

Has the API been corrected to put a colon (:) between the classification scheme URI part (which includes the year as a "version") and the term (which is dotted decimal)?
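One way to check responses for this could look like the Python sketch below. It is only a sketch under the assumptions stated in this thread: the scheme URI ends in a 4-digit year and the term is dotted decimal. The pattern `CS_TERM_RE` and the function `uses_colon_separator` are hypothetical names, not part of the test server or its API.

```python
import re

# Assumed shape: "<scheme URN ending in :YYYY>:<dotted-decimal term>".
CS_TERM_RE = re.compile(
    r"^(?P<scheme>urn:[^#]+:\d{4}):(?P<term>\d+(?:\.\d+)*)$"
)


def uses_colon_separator(href: str) -> bool:
    """Return True if href has a colon between the year and the term."""
    return CS_TERM_RE.match(href) is not None


hrefs = [
    "urn:tva:metadata:cs:ContentCS:2011.3.6.8.18",  # dot after the year
    "urn:tva:metadata:cs:ContentCS:2011:3.6.8.18",  # colon after the year
]
for href in hrefs:
    print(href, uses_colon_separator(href))
```

Running this over the @href values in a query response would flag any reference that still joins the year and the term with a dot.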

@juhajoki
Collaborator

@sofia-tsa to be checked
