diff --git a/index.html b/index.html index a764762..fd776ef 100644 --- a/index.html +++ b/index.html @@ -24,6 +24,14 @@

Editor's drafts for main branch of introduce-ai

CBOR EDN: Literals and ABNF   plain text   diff with main

Preview for branch appendix-normative

diff --git a/introduce-ai/draft-ietf-cbor-edn-literals.html b/introduce-ai/draft-ietf-cbor-edn-literals.html new file mode 100644 index 0000000..0af5451 --- /dev/null +++ b/introduce-ai/draft-ietf-cbor-edn-literals.html @@ -0,0 +1,2817 @@

Internet-Draft   CBOR EDN: Literals and ABNF   February 2024
Bormann   Expires 4 August 2024
Workgroup: Network Working Group
Internet-Draft: draft-ietf-cbor-edn-literals-latest
Published:
Intended Status: Informational
Expires: 4 August 2024
Author: C. Bormann, Universität Bremen TZI

CBOR Extended Diagnostic Notation (EDN): Application-Oriented Literals, ABNF, and Media Type

+
+

Abstract

+

The Concise Binary Object Representation, CBOR (STD 94, RFC 8949), defines a "diagnostic notation" in order to be able to converse about CBOR data items without having to resort to binary data.

This document specifies how to add application-oriented extensions to the diagnostic notation. It then defines two such extensions for text representations of epoch-based date/times and of IP addresses and prefixes (RFC 9164).

A few further additions close some gaps in usability. To facilitate tool interoperation, this document specifies a formal ABNF definition for extended diagnostic notation (EDN) that accommodates application-oriented literals.

+
+
+

About This Document

This note is to be removed before publishing as an RFC.

The latest revision of this draft can be found at https://cbor-wg.github.io/edn-literal/. Status information for this document may be found at https://datatracker.ietf.org/doc/draft-ietf-cbor-edn-literals/.

Discussion of this document takes place on the cbor Working Group mailing list (mailto:cbor@ietf.org), which is archived at https://mailarchive.ietf.org/arch/browse/cbor/. Subscribe at https://www.ietf.org/mailman/listinfo/cbor/.

Source for this draft and an issue tracker can be found at https://github.com/cbor-wg/edn-literal.

+
+
+
+

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 August 2024.

+
+
+ +
+
+

+
+
+
+

1. Introduction

For the Concise Binary Object Representation, CBOR, Section 8 of RFC 8949 [STD94] in conjunction with Appendix G of [RFC8610] defines a "diagnostic notation" in order to be able to converse about CBOR data items without having to resort to binary data. Diagnostic notation syntax is based on JSON, with extensions for representing CBOR constructs such as binary data and tags. (Standardizing this together with the actual interchange format does not serve to create another interchange format, but enables the use of a shared diagnostic notation in tools for and in documents about CBOR.)

This document specifies how to add application-oriented extensions to the diagnostic notation. It then defines two such extensions for text representations of epoch-based date/times and of IP addresses and prefixes [RFC9164].

A few further additions close some gaps in usability. To facilitate tool interoperation, this document specifies a formal ABNF definition for extended diagnostic notation (EDN) that accommodates application-oriented literals. (See Appendix A.1 for an overall ABNF grammar as well as the ABNF definitions in Appendix A.2 for grammars for both the byte string presentations predefined in [STD94] and the application-extensions.)

In addition, this document finally registers a media type identifier and a content-format for CBOR diagnostic notation. This does not elevate its status as an interchange format, but recognizes that interaction between tools is often smoother if media types can be used.

+
+
+

1.1. Terminology

Section 8 of RFC 8949 [STD94] defines the original CBOR diagnostic notation, and Appendix G of [RFC8610] supplies a number of extensions to the diagnostic notation that result in the Extended Diagnostic Notation (EDN). The diagnostic notation extensions include popular features such as embedded CBOR (encoded CBOR data items in byte strings) and comments. A simple diagnostic notation extension that enables representing CBOR sequences was added in Section 4.2 of [RFC8742]. As diagnostic notation is not used in the kind of interchange situations where backward compatibility would pose a significant obstacle, there is little point in not using these extensions.

Therefore, when we refer to "diagnostic notation", we mean to include the original notation from Section 8 of RFC 8949 [STD94] as well as the extensions from Appendix G of [RFC8610], Section 4.2 of [RFC8742], and the present document. However, we stick to the abbreviation "EDN" as it has become quite popular and is more sharply distinguishable from other meanings than "DN" would be.

In a similar vein, the term "ABNF" in this document refers to the language defined in [STD68] as extended in [RFC7405], where the "characters" of Section 2.3 of RFC 5234 [STD68] are Unicode scalar values. The term "CDDL" refers to the data definition language defined in [RFC8610] and its registered extensions (such as those in [RFC9165]), as well as [I-D.ietf-cbor-update-8610-grammar].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

+
+
+
+
+

1.2. (Non-)Objectives of this Document

Section 8 of RFC 8949 [STD94] states the objective of defining a human-readable diagnostic notation with CBOR. In particular, it states:

+
+

All actual interchange always happens in the binary format.

+
+

One important application of EDN is the notation of CBOR data for humans: in specifications, on whiteboards, and for entering test data. A number of features, such as comments in string literals, are mainly useful for people-to-people communication via EDN. Programs also often output EDN for diagnostic purposes, such as in error messages or to enable comparison (including generation of diffs via tools) with test data.

For comparison with test data, it is often useful if different implementations generate the same (or similar) output for the same CBOR data items. This is comparable to the objectives of deterministic serialization for CBOR data items themselves (Section 4.2 of RFC 8949 [STD94]). However, there are even more representation variants in EDN than in binary CBOR, and there is little point in specifically endorsing a single variant as "deterministic" when other variants may be more useful for human understanding, e.g., the << >> notation as opposed to h''. An EDN generator may have quite a few options that control which presentation variant is most desirable for the application it is being used for.

Because of this, a deterministic representation is not defined for EDN, and there is no expectation of "roundtripping" from EDN to CBOR and back, i.e., of an ability to convert EDN to binary CBOR and back to EDN while achieving exactly the same result as the original input EDN; the original EDN may have been created by humans or by a different EDN generator.

+

However, there is a certain expectation that EDN generators can be configured to some basic output format, which:

  • looks like JSON where that is possible;

  • inserts encoding indicators only where the binary form differs from preferred encoding;

  • uses hexadecimal representation (h'') for byte strings, not b64'' or embedded CBOR (<<>>);

  • does not generate elaborate blank space (newlines, indentation) for pretty-printing, but does use common blank spaces such as after , and :.

Additional features such as ensuring deterministic map ordering (Section 4.2 of RFC 8949 [STD94]) on output, or even deviating from the basic configuration in some systematic way, can further assist in comparing test data. Information obtained from a CDDL model can help in choosing application-oriented literals or specific string representations such as embedded CBOR or b64'' in the appropriate places.

+
+
+
+
+
+
+

2. Application-Oriented Extension Literals

This document extends the syntax used in diagnostic notation for byte string literals to also be available for application-oriented extensions.

As per Section 8 of RFC 8949 [STD94], the diagnostic notation can notate byte strings in a number of [RFC4648] base encodings, where the encoded text is enclosed in single quotes, prefixed by an identifier (»h« for base16, »b32« for base32, »h32« for base32hex, »b64« for base64 or base64url).

This syntax can be thought of as establishing a namespace, with the names "h", "b32", "h32", and "b64" taken, but other names being unallocated. The present specification defines additional names for this namespace, which we call application-extension identifiers. For the quoted string, the same rules apply as for byte strings. In particular, the escaping rules that were adapted from JSON strings are applied equivalently for application-oriented extensions, e.g., within the quoted string \\ stands for a single backslash and \' stands for a single quote.

An application-extension identifier is a name consisting of a lower-case ASCII letter (a-z) and zero or more additional ASCII characters that are either lower-case letters or digits (a-z0-9).

Application-extension identifiers are registered in a registry (Section 4.1).

Prefixing a single-quoted string, an application-extension identifier is used to build an application-oriented extension literal, which stands for a CBOR data item the value of which is derived from the text given in the single-quoted string using a procedure defined in the specification for an application-extension identifier.

An application-extension (such as dt) MAY also define the meaning of a variant of the application-extension identifier where each lower-case character is replaced by its upper-case counterpart (such as DT), for building an application-oriented extension literal using that all-uppercase variant as the prefix of a single-quoted string.

As a convention for such definitions, using the all-uppercase variant implies making use of a tag appropriate for this application-oriented extension (such as tag number 1 for DT).
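The prefix rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the specification: the helper name `classify_prefix` and its return shape are invented for this example.

```python
import re

# Lower-case form: [a-z][a-z0-9]*. The all-uppercase variant replaces each
# lower-case letter by its upper-case counterpart; digits stay unchanged.
_LOWER = re.compile(r"[a-z][a-z0-9]*")
_UPPER = re.compile(r"[A-Z][A-Z0-9]*")


def classify_prefix(prefix):
    """Return (identifier, is_uppercase_variant) for a literal prefix.

    Hypothetical helper: raises ValueError for mixed-case or otherwise
    malformed prefixes.
    """
    if _LOWER.fullmatch(prefix):
        return prefix, False
    if _UPPER.fullmatch(prefix):
        # By convention, the all-uppercase variant implies a tag.
        return prefix.lower(), True
    raise ValueError(f"not an application-extension prefix: {prefix!r}")
```

A parser could use the second element of the result to decide whether the value produced by the extension is to be wrapped in the tag that the extension's specification names.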

+

Examples for application-oriented extensions to CBOR diagnostic notation can be found in the following sections.

+
+
+

2.1. The "dt" Extension

The application-extension identifier "dt" is used to notate a date/time literal that can be used as an Epoch-Based Date/Time as per Section 3.4.2 of RFC 8949 [STD94].

The text of the literal is a Standard Date/Time String as per Section 3.4.1 of RFC 8949 [STD94].

The value of the literal is a number representing the result of a conversion of the given Standard Date/Time String to an Epoch-Based Date/Time. If fractional seconds are given in the text (production time-secfrac in Figure 4), the value is a floating-point number; the value is an integer number otherwise. In the all-upper-case variant of the app-prefix, the value is enclosed in a tag with tag number 1.

As an example, the CBOR diagnostic notation

+
+
+dt'1969-07-21T02:56:16Z',
+dt'1969-07-21T02:56:16.5Z',
+DT'1969-07-21T02:56:16Z'
+
+
+

is equivalent to

+
+
+-14159024,
+-14159023.5,
+1(-14159024)
+
+
+

See Appendix A.2.3 for an ABNF definition for the content of dt literals.
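The conversion described in this section can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `dt_value` is an assumption, the regular expression is a simplified reading of the date/time syntax rather than the ABNF of Appendix A.2.3, and leap seconds (a seconds field of 60) are not handled.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified pattern for a Standard Date/Time String (assumption of this sketch).
_DATE_TIME = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt](\d{2}):(\d{2}):(\d{2})(\.\d+)?"
    r"([Zz]|[+-]\d{2}:\d{2})"
)
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt_value(text):
    """Return the epoch-based value for the text of a dt'' literal."""
    m = _DATE_TIME.fullmatch(text)
    if m is None:
        raise ValueError(f"not a date/time string: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    frac, zone = m.group(7), m.group(8)
    if zone in ("Z", "z"):
        offset = timedelta(0)
    else:
        sign = -1 if zone[0] == "-" else 1
        offset = sign * timedelta(hours=int(zone[1:3]), minutes=int(zone[4:6]))
    # Interpret the fields as local time, then subtract the offset to get UTC.
    local = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    whole = (local - offset - _EPOCH) // timedelta(seconds=1)
    # Fractional seconds present: floating-point value; otherwise an integer.
    return whole + float(frac) if frac else whole
```

For the all-upper-case DT variant, a caller would wrap the returned number in tag number 1.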

+
+
+
+
+

2.2. The "ip" Extension

The application-extension identifier "ip" is used to notate an IP address literal that can be used as an IP address as per Section 3 of [RFC9164].

The text of the literal is an IPv4address or IPv6address as per Section 3.2.2 of [RFC3986].

With the lower-case app-string ip, the value of the literal is a byte string representing the binary IP address. With the upper-case app-string IP, the literal is such a byte string tagged with tag number 54, if an IPv6address is used, or tag number 52, if an IPv4address is used.

As an additional case, the upper-case app-string IP'' can be used with a prefix such as 2001:db8::/56 or 192.0.2.0/24, with the equivalent tag as its value. (Note that [RFC9164] representations of address prefixes need to implement the truncation of the address byte string as described in Section 4.2 of [RFC9164]; see example below.) For completeness, the lower-case variant ip'2001:db8::/56' or ip'192.0.2.0/24' stands for an unwrapped [56,h'20010db8'] or [24,h'c00002']; however, in this case the information on whether an address is IPv4 or IPv6 often needs to come from the context.

Note that there is no direct representation of an address combined with a prefix length; this can be represented as 52([ip'192.0.2.42',24]), if needed.

Examples: the CBOR diagnostic notation

+
+
+ip'192.0.2.42',
+IP'192.0.2.42',
+IP'192.0.2.0/24',
+ip'2001:db8::42',
+IP'2001:db8::42',
+IP'2001:db8::/64'
+
+
+

is equivalent to

+
+
+h'c000022a',
+52(h'c000022a'),
+52([24,h'c00002']),
+h'20010db8000000000000000000000042',
+54(h'20010db8000000000000000000000042'),
+54([64,h'20010db8'])
+
+
+

See Appendix A.2.4 for an ABNF definition for the content of ip literals.
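The mapping from ip'' and IP'' literals to data items can be sketched with Python's standard `ipaddress` module. This is an illustration only: the function name `ip_value` and the use of a `(tag_number, content)` tuple to stand for a tagged item are assumptions of this sketch, and `ipaddress` accepts some inputs (for example netmask notation) that the [RFC3986] productions do not.

```python
import ipaddress


def ip_value(text, tagged=False):
    """Return the value for the text of an ip'' (or, if tagged, IP'') literal.

    Byte strings are returned as bytes, prefixes as [length, bytes], and
    tagged items as a (tag_number, content) tuple.
    """
    if "/" in text:
        # strict=True rejects prefixes with non-zero bits after the prefix length.
        net = ipaddress.ip_network(text, strict=True)
        # Truncate the address byte string by removing trailing zero bytes.
        value = [net.prefixlen, net.network_address.packed.rstrip(b"\x00")]
        version = net.version
    else:
        addr = ipaddress.ip_address(text)
        value = addr.packed
        version = addr.version
    if tagged:
        return (54 if version == 6 else 52, value)
    return value
```

With this sketch, `ip_value('192.0.2.0/24', tagged=True)` yields the tuple form of 52([24,h'c00002']) from the example above.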

+
+
+
+
+
+
+

3. Stand-in Representations in Binary CBOR

In some cases, an EDN consumer cannot construct actual CBOR items that represent the CBOR data intended for eventual interchange. This document defines stand-in representations for two such cases: unknown application-extension identifiers (Section 3.1) and information deliberately elided from an EDN document (Section 3.2).

3.1. Handling unknown application-extension identifiers

When ingesting CBOR diagnostic notation, any application-oriented extension literals are usually decoded and transformed into the corresponding data item during ingestion. If an application-extension is not known or not implemented by the ingesting process, this is usually an error and processing has to stop.

However, in certain cases, it can be desirable to exceptionally carry an uninterpreted application-oriented extension literal in an ingested data item, so that its decoding can be postponed to a specific later stage of ingestion.

This specification defines a CBOR Tag for this purpose: the Diagnostic Notation Unresolved Application-Extension Tag, tag number CPA999 (Section 4.5). The content of this tag is an array of two text strings: the application-extension identifier, and the (escape-processed) content of the single-quoted string. For example, dt'1969-07-21T02:56:16Z' can be provisionally represented as /CPA/ 999(["dt", "1969-07-21T02:56:16Z"]).
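The fallback described here can be sketched as follows. All names in this Python illustration (`Tag`, `ingest_app_literal`, the `handlers` mapping, the `allow_unresolved` switch) are assumptions made for the example, and 999 is only the suggested tag number; the final value is assigned by IANA.

```python
from dataclasses import dataclass
from typing import Any

UNRESOLVED_TAG = 999  # suggested assignment for CPA999, not yet final


@dataclass
class Tag:
    """Minimal stand-in for a tagged CBOR data item (hypothetical type)."""
    number: int
    content: Any


def ingest_app_literal(prefix, text, handlers, allow_unresolved=False):
    """Resolve an application-oriented literal or return its stand-in.

    `handlers` maps known prefixes to functions that take the
    escape-processed text of the single-quoted string.
    """
    handler = handlers.get(prefix)
    if handler is not None:
        return handler(text)
    if not allow_unresolved:
        # Default behaviour: an unknown application-extension is an error.
        raise ValueError(f"unknown application-extension: {prefix!r}")
    return Tag(UNRESOLVED_TAG, [prefix, text])
```

A later ingestion stage that does know the extension can look for `Tag(999, [prefix, text])` items and replace them with the resolved value.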

+

RFC-Editor: This document uses the CPA (code point allocation) convention described in [I-D.bormann-cbor-draft-numbers]. For each usage of the term "CPA", please remove the prefix "CPA" from the indicated value and replace the residue with the value assigned by IANA; perform an analogous substitution for all other occurrences of the prefix "CPA" in the document. Finally, please remove this note.

+
+
+
+
+

3.2. Handling information deliberately elided from an EDN document

When using EDN for exposition in a document or on a whiteboard, it is often useful to be able to leave out parts of an EDN document that are not of interest at that point of the exposition.

To facilitate this, this specification supports the use of an ellipsis (notated as three or more dots in a row, as in ...) to indicate parts of an EDN document that have been elided (and therefore cannot be reconstructed).

Upon ingesting EDN as a representation of a CBOR data item for further processing, the occurrence of an ellipsis usually is an error and processing has to stop.

However, it is useful to be able to process EDN documents with ellipses in the automation scripts for the documents using them. This specification defines a CBOR Tag that can be used in the ingestion for this purpose: the Diagnostic Notation Ellipsis Tag, tag number CPA888 (Section 4.5). The content of this tag is either

  1. null (indicating a data item entirely replaced by an ellipsis), or

  2. an array, the elements of which alternate between fragments of a string and the actual elisions, represented as ellipses carrying a null as content.

Elisions can stand in for entire subtrees, e.g. in:

+
+
+[1, 2, ..., 3]
+,
+{ "a": 1,
+  "b": ...,
+  ...: ...
+}
+
+
+

A single ellipsis (or key/value pair of ellipses) can imply eliding multiple elements in an array (members in a map); if more detailed control is required, a data definition language such as CDDL can be employed. (Note that the stand-in form defined here does not allow multiple key/value pairs with an ellipsis as a key: the CBOR data item would not be valid.)

Subtree elisions can be represented in a CBOR data item by using /CPA/888(null) as the stand-in:

+
+
+[1, 2, 888(null), 3]
+,
+{ "a": 1,
+  "b": 888(null),
+  888(null): 888(null)
+}
+
+
+

Elisions also can be used as part of a (text or byte) string:

+
+
+{ "contract": "Herewith I buy" ... "gned: Alice & Bob",
+  "signature": h'4711...0815',
+}
+
+
+

The example "contract" uses string concatenation as per Appendix G.4 of [RFC8610], extended to allow ellipses, while the example "signature" uses special syntax that allows the use of ellipses between the bytes notated inside h'' literals.

String elisions can be represented in a CBOR data item by a stand-in that wraps an array of string fragments alternating with ellipsis indicators:

+
+
+{ "contract": /CPA/888(["Herewith I buy", 888(null),
+                        "gned: Alice & Bob"]),
+  "signature": 888([h'4711', 888(null), h'0815']),
+}
+
+
+
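The two stand-in forms shown above (a bare ellipsis and a string with elided parts) can be built with a small Python sketch. The names `Tag`, `elided`, and `elided_string` are hypothetical, Python's own `...` (Ellipsis) object is used here merely as a convenient marker for an elision, and 888 is the suggested rather than the final tag number.

```python
from dataclasses import dataclass
from typing import Any

ELLIPSIS_TAG = 888  # suggested assignment for CPA888, not yet final


@dataclass
class Tag:
    """Minimal stand-in for a tagged CBOR data item (hypothetical type)."""
    number: int
    content: Any


def elided():
    """Stand-in for a data item entirely replaced by an ellipsis: 888(null)."""
    return Tag(ELLIPSIS_TAG, None)


def elided_string(fragments):
    """Stand-in for a string with elisions.

    `fragments` is a list of str or bytes fragments in which the Python
    Ellipsis object marks each elision.
    """
    items = [elided() if f is Ellipsis else f for f in fragments]
    return Tag(ELLIPSIS_TAG, items)
```

For instance, `elided_string([bytes.fromhex("4711"), ..., bytes.fromhex("0815")])` corresponds to the "signature" value 888([h'4711', 888(null), h'0815']).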

Note that the use of elisions is different from "commenting out" EDN text, e.g.

+
+
+{ "contract": "Herewith I buy" /.../ "gned: Alice & Bob",
+  "signature": h'4711/.../0815',
+  # ...: ...
+}
+
+
+

The consumer of this EDN will ignore the comments and therefore will have no idea after ingestion that some information has been elided; validation steps may then simply fail instead of being informed about the elisions.

+
+
+
+
+
+
+

4. IANA Considerations

RFC Editor: please replace RFC-XXXX with the RFC number of this RFC, [IANA.cbor-diagnostic-notation] with a reference to the new registry group, and remove this note.

4.1. CBOR Diagnostic Notation Application-extension Identifiers Registry

IANA is requested to create an "Application-Extension Identifiers" registry in a new "CBOR Diagnostic Notation" registry group [IANA.cbor-diagnostic-notation], with the policy "expert review" (Section 4.5 of RFC 8126 [BCP26]).

The experts are instructed to be frugal in the allocation of application-extension identifiers that are suggestive of generally applicable semantics, keeping them in reserve for application-extensions that are likely to enjoy wide use and can make good use of their conciseness. The expert is also instructed to direct the registrant to provide a specification (Section 4.6 of RFC 8126 [BCP26]), but can make exceptions, for instance when a specification is not available at the time of registration but is likely forthcoming. If the expert becomes aware of application-extension identifiers that are deployed and in use, they may also initiate a registration on their own if they deem such a registration can avert potential future collisions.

Each entry in the registry must include:

+
+
Application-Extension Identifier:
   a lower case ASCII [STD80] string that starts with a letter and can contain letters and digits after that ([a-z][a-z0-9]*). No other entry in the registry can have the same application-extension identifier.

Description:
   a brief description

Change Controller:
   (see Section 2.3 of RFC 8126 [BCP26])

Reference:
   a reference document that provides a description of the application-extension identifier

The initial content of the registry is shown in Table 1; all initial entries have the Change Controller "IETF".

+
Table 1: Initial Content of Application-extension Identifier Registry

Application-extension Identifier   Description         Reference
h                                  Reserved            RFC8949
b32                                Reserved            RFC8949
h32                                Reserved            RFC8949
b64                                Reserved            RFC8949
dt                                 Date/Time           RFC-XXXX
ip                                 IP Address/Prefix   RFC-XXXX
+
+
+
+
+
+

4.2. Encoding Indicators

IANA is requested to create an "Encoding Indicators" registry in the newly created "CBOR Diagnostic Notation" registry group [IANA.cbor-diagnostic-notation], with the policy "specification required" (Section 4.6 of RFC 8126 [BCP26]).

The experts are instructed to be frugal in the allocation of encoding indicators that are suggestive of generally applicable semantics, keeping them in reserve for encoding indicator registrations that are likely to enjoy wide use and can make good use of their conciseness. If the expert becomes aware of encoding indicators that are deployed and in use, they may also solicit a specification and initiate a registration on their own if they deem such a registration can avert potential future collisions.

Each entry in the registry must include:

+
+
Encoding Indicator:
   an ASCII [STD80] string that starts with an underscore and can contain zero or more underscores, letters and digits after that (_[_A-Za-z0-9]*). No other entry in the registry can have the same Encoding Indicator.

Description:
   a brief description. This description may employ an abbreviation of the form ai=nn, where nn is the numeric value of the field additional information, the low-order 5 bits of the initial byte (see Section 3 of RFC 8949 [STD94]).

Change Controller:
   (see Section 2.3 of RFC 8126 [BCP26])

Reference:
   a reference document that provides a description of the encoding indicator

The initial content of the registry is shown in Table 2; all initial entries have the Change Controller "IETF".

+
Table 2: Initial Content of Encoding Indicator Registry

Encoding Indicator   Description                          Reference
_                    Indefinite Length Encoding (ai=31)   RFC8949, RFC-XXXX
_i                   ai=0 to ai=23                        RFC-XXXX
_0                   ai=24                                RFC8949, RFC-XXXX
_1                   ai=25                                RFC8949, RFC-XXXX
_2                   ai=26                                RFC8949, RFC-XXXX
_3                   ai=27                                RFC8949, RFC-XXXX
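The ai=nn abbreviation used in the registry can be illustrated with a short Python sketch that extracts the additional information from an initial byte and looks up the initially registered encoding indicator. The function names are assumptions of this example, and the sketch deliberately ignores the major type (for instance, ai=31 only denotes indefinite length encoding for major types 2 to 5).

```python
# Initially registered indicators for ai values above 23.
_AI_TO_INDICATOR = {24: "_0", 25: "_1", 26: "_2", 27: "_3", 31: "_"}


def additional_information(initial_byte):
    """Return the low-order 5 bits of the initial byte (the ai field)."""
    return initial_byte & 0x1F


def encoding_indicator(initial_byte):
    """Return the encoding indicator matching the ai of an initial byte."""
    ai = additional_information(initial_byte)
    if ai <= 23:
        return "_i"  # ai=0 to ai=23
    try:
        return _AI_TO_INDICATOR[ai]
    except KeyError:
        # ai=28 to ai=30 have no entry in the initial registry content.
        raise ValueError(f"no encoding indicator for ai={ai}") from None
```

For example, the initial byte 0x18 (an unsigned integer with a one-byte argument) has ai=24 and therefore maps to _0.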
+
+
+
+
+
+

4.3. Media Type

IANA is requested to add the following Media-Type to the "Media Types" registry [IANA.media-types].

Table 3: New Media Type application/cbor-diagnostic

Name              Template                      Reference
cbor-diagnostic   application/cbor-diagnostic   RFC-XXXX, Section 4.3
+
+
+
Type name:
   application

Subtype name:
   cbor-diagnostic

Required parameters:
   N/A

Optional parameters:
   N/A

Encoding considerations:
   binary (UTF-8)

Security considerations:
   Section 5 of RFC XXXX

Interoperability considerations:
   none

Published specification:
   Section 4.3 of RFC XXXX

Applications that use this media type:
   Tools interchanging a human-readable form of CBOR

Fragment identifier considerations:
   The syntax and semantics of fragment identifiers are as specified for "application/cbor". (At publication of RFC XXXX, there is no fragment identification syntax defined for "application/cbor".)

Additional information:

   Deprecated alias names for this type:
      N/A

   Magic number(s):
      N/A

   File extension(s):
      .diag

   Macintosh file type code(s):
      N/A

Person & email address to contact for further information:
   CBOR WG mailing list (cbor@ietf.org), or IETF Applications and Real-Time Area (art@ietf.org)

Intended usage:
   LIMITED USE

Restrictions on usage:
   CBOR diagnostic notation represents CBOR data items, which are the format intended for actual interchange. The media type application/cbor-diagnostic is intended to be used within documents about CBOR data items, in diagnostics for human consumption, and in other representations of CBOR data items that are necessarily text-based such as in configuration files or other data edited by humans, often under source-code control.

Author/Change controller:
   IETF

Provisional registration:
   no
+
+
+
+
+
+
+
+

4.4. Content-Format

IANA is requested to register a Content-Format number in the "CoAP Content-Formats" sub-registry, within the "Constrained RESTful Environments (CoRE) Parameters" Registry [IANA.core-parameters], as follows:

Table 4: New Content-Format

Content-Type                  Content Coding   ID     Reference
application/cbor-diagnostic   -                TBD1   RFC-XXXX

TBD1 is to be assigned from the space 256..999.

+
+
+
+
+

4.5. Stand-in Tags

RFC-Editor: This document uses the CPA (code point allocation) convention described in [I-D.bormann-cbor-draft-numbers]. For each usage of the term "CPA", please remove the prefix "CPA" from the indicated value and replace the residue with the value assigned by IANA; perform an analogous substitution for all other occurrences of the prefix "CPA" in the document. Finally, please remove this note.

In the "CBOR Tags" registry [IANA.cbor-tags], IANA is requested to assign the tags in Table 5 from the "specification required" space (suggested assignments: 888 and 999), with the present document as the specification reference.

Table 5: Values for Tags

Tag      Data Item       Semantics                                                Reference
CPA888   null or array   Diagnostic Notation Ellipsis                             RFC-XXXX
CPA999   array           Diagnostic Notation Unresolved Application-Extension     RFC-XXXX
+
+
+
+
+
+
+
+

5. Security considerations

+

The security considerations of [STD94] and [RFC8610] apply.

+
+
+
+

+6. References +

+
+
+

+6.1. Normative References +

+
+
[BCP26]
+
+
Best Current Practice 26, <https://www.rfc-editor.org/info/bcp26>.
At the time of writing, this BCP comprises the following: +
+
+ Cotton, M., Leiba, B., and T. Narten, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 8126, DOI 10.17487/RFC8126, , <https://www.rfc-editor.org/info/rfc8126>.
+
+
+
[C]
+
+International Organization for Standardization, "Information technology — Programming languages — C", Fourth Edition, ISO/IEC 9899:2018, , <https://www.iso.org/standard/74528.html>. The text of the standard is also available via https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf +
+
+
[Cplusplus]
+
+International Organization for Standardization, "Programming languages — C++", Sixth Edition, ISO/IEC 14882:2020, , <https://www.iso.org/standard/79358.html>. The text of the standard is also available via https://isocpp.org/files/papers/N4860.pdf +
+
+
[IANA.cbor-tags]
+
+IANA, "Concise Binary Object Representation (CBOR) Tags", <http://www.iana.org/assignments/cbor-tags>.
+
+
[IANA.core-parameters]
+
+IANA, "Constrained RESTful Environments (CoRE) Parameters", <http://www.iana.org/assignments/core-parameters>.
+
+
[IANA.media-types]
+
+IANA, "Media Types", <http://www.iana.org/assignments/media-types>.
+
+
[IEEE754]
+
+IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, <https://ieeexplore.ieee.org/document/8766229>.
+
+
[RFC2119]
+
+Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, , <https://www.rfc-editor.org/rfc/rfc2119>.
+
+
[RFC3339]
+
+Klyne, G. and C. Newman, "Date and Time on the Internet: Timestamps", RFC 3339, DOI 10.17487/RFC3339, , <https://www.rfc-editor.org/rfc/rfc3339>.
+
+
[RFC3986]
+
+Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform Resource Identifier (URI): Generic Syntax", STD 66, RFC 3986, DOI 10.17487/RFC3986, , <https://www.rfc-editor.org/rfc/rfc3986>.
+
+
[RFC7405]
+
+Kyzivat, P., "Case-Sensitive String Support in ABNF", RFC 7405, DOI 10.17487/RFC7405, , <https://www.rfc-editor.org/rfc/rfc7405>.
+
+
[RFC8174]
+
+Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, , <https://www.rfc-editor.org/rfc/rfc8174>.
+
+
[RFC8610]
+
+Birkholz, H., Vigano, C., and C. Bormann, "Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, , <https://www.rfc-editor.org/rfc/rfc8610>.
+
+
[RFC8742]
+
+Bormann, C., "Concise Binary Object Representation (CBOR) Sequences", RFC 8742, DOI 10.17487/RFC8742, , <https://www.rfc-editor.org/rfc/rfc8742>.
+
+
[RFC9164]
+
+Richardson, M. and C. Bormann, "Concise Binary Object Representation (CBOR) Tags for IPv4 and IPv6 Addresses and Prefixes", RFC 9164, DOI 10.17487/RFC9164, , <https://www.rfc-editor.org/rfc/rfc9164>.
+
+
[STD68]
+
+
Internet Standard 68, <https://www.rfc-editor.org/info/std68>.
At the time of writing, this STD comprises the following: +
+
+ Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, , <https://www.rfc-editor.org/info/rfc5234>.
+
+
+
[STD80]
+
+
Internet Standard 80, <https://www.rfc-editor.org/info/std80>.
At the time of writing, this STD comprises the following: +
+
+ Cerf, V., "ASCII format for network interchange", STD 80, RFC 20, DOI 10.17487/RFC0020, , <https://www.rfc-editor.org/info/rfc20>.
+
+
+
[STD94]
+
+
Internet Standard 94, <https://www.rfc-editor.org/info/std94>.
At the time of writing, this STD comprises the following: +
+
+ Bormann, C. and P. Hoffman, "Concise Binary Object Representation (CBOR)", STD 94, RFC 8949, DOI 10.17487/RFC8949, , <https://www.rfc-editor.org/info/rfc8949>.
+
+
+
+
+
+
+
+

+6.2. Informative References +

+
+
[I-D.ietf-cbor-update-8610-grammar]
+
+Bormann, C., "Updates to the CDDL grammar of RFC 8610", Work in Progress, Internet-Draft, draft-ietf-cbor-update-8610-grammar-04, , <https://datatracker.ietf.org/doc/html/draft-ietf-cbor-update-8610-grammar-04>.
+
+
[RFC4648]
+
+Josefsson, S., "The Base16, Base32, and Base64 Data Encodings", RFC 4648, DOI 10.17487/RFC4648, , <https://www.rfc-editor.org/rfc/rfc4648>.
+
+
[RFC9165]
+
+Bormann, C., "Additional Control Operators for the Concise Data Definition Language (CDDL)", RFC 9165, DOI 10.17487/RFC9165, , <https://www.rfc-editor.org/rfc/rfc9165>.
+
+
[STD90]
+
+
Internet Standard 90, <https://www.rfc-editor.org/info/std90>.
At the time of writing, this STD comprises the following: +
+
+ Bray, T., Ed., "The JavaScript Object Notation (JSON) Data Interchange Format", STD 90, RFC 8259, DOI 10.17487/RFC8259, , <https://www.rfc-editor.org/info/rfc8259>.
+
+
+
+
+
+
+
+
+

+Appendix A. ABNF Definitions +

+

This appendix collects grammars in ABNF form ([STD68] as extended in +[RFC7405]) that serve to define the syntax of EDN and some +application-oriented literals.

+

Implementation note: The ABNF definitions in this appendix are +intended to be useful in a PEG parser interpretation (see Appendix A of [RFC8610] for an introduction to PEG).

+
+
+

+A.1. Overall ABNF Definition for Extended Diagnostic Notation +

+

This appendix provides an overall ABNF definition for the syntax of +CBOR extended diagnostic notation.

+

To complete the parsing of an app-string with prefix, say, p, the +processed sqstr inside it is further parsed using the ABNF definition specified +for the production app-string-p in Appendix A.2.

+

For simplicity, the internal parsing for the built-in EDN prefixes is +specified in the same way. +ABNF definitions for h'' and b64'' are provided in Appendix A.2.1 and +Appendix A.2.2. +However, the prefixes b32'' and h32'' are not in wide use and an +ABNF definition in this document could therefore not be based on +implementation experience.

+
+
+
+
+seq             = S [item S *("," S item S) OC] S
+one-item        = S item S
+item            = map / array / tagged
+                / number / simple
+                / string / streamstring
+
+string1         = (tstr / bstr) spec
+string1e        = string1 / ellipsis
+ellipsis        = 3*"." ; "..." or more dots
+string          = string1e *(S string1e)
+
+number          = (basenumber / decnumber / infin) spec
+sign            = "+" / "-"
+decnumber       = [sign] (1*DIGIT ["." *DIGIT] / "." 1*DIGIT)
+                         ["e" [sign] 1*DIGIT]
+basenumber      = [sign] "0" ("x" 1*HEXDIG
+                              [["." *HEXDIG] "p" [sign] 1*DIGIT]
+                            / "x" "." 1*HEXDIG "p" [sign] 1*DIGIT
+                            / "o" 1*ODIGIT
+                            / "b" 1*BDIGIT)
+infin           = %s"Infinity"
+                / %s"-Infinity"
+                / %s"NaN"
+simple          = %s"false"
+                / %s"true"
+                / %s"null"
+                / %s"undefined"
+                / %s"simple(" S item S ")"
+uint            = "0" / DIGIT1 *DIGIT
+tagged          = uint spec "(" S item S ")"
+
+app-prefix      = lcalpha *lcalnum ; including h and b64
+                / ucalpha *ucalnum ; tagged variant, if defined
+app-string      = app-prefix sqstr
+sqstr           = "'" *single-quoted "'"
+bstr            = app-string / sqstr / embedded
+                  ; app-string could be any type
+tstr            = DQUOTE *double-quoted DQUOTE
+embedded        = "<<" seq ">>"
+
+array           = "[" spec S [item S *("," S item S) OC] "]"
+map             = "{" spec S [kp S *("," S kp S) OC] "}"
+kp              = item S ":" S item
+
+; We allow %x09 HT in prose, but not in strings
+blank           = %x09 / %x0A / %x0D / %x20
+non-slash       = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF
+non-lf          = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
+S               = *blank *(comment *blank)
+comment         = "/" *non-slash "/"
+                / "#" *non-lf %x0A
+
+; optional trailing comma (ignored)
+OC              = ["," S]
+
+; check semantically that strings are either all text or all bytes
+; note that there must be at least one string to distinguish
+streamstring    = "(_" S string S *("," S string S) OC ")"
+spec            = ["_" *wordchar]
+
+double-quoted   = unescaped
+                / "'"
+                / "\" DQUOTE
+                / "\" escapable
+
+single-quoted   = unescaped
+                / DQUOTE
+                / "\" "'"
+                / "\" escapable
+
+escapable       = %s"b" ; BS backspace U+0008
+                / %s"f" ; FF form feed U+000C
+                / %s"n" ; LF line feed U+000A
+                / %s"r" ; CR carriage return U+000D
+                / %s"t" ; HT horizontal tab U+0009
+                / "/"   ; / slash (solidus) U+002F (JSON!)
+                / "\"   ; \ backslash (reverse solidus) U+005C
+                / (%s"u" hexchar) ;  uXXXX      U+XXXX
+
+hexchar         = "{" (1*"0" [ hexscalar ] / hexscalar) "}"
+                / non-surrogate
+                / (high-surrogate "\" %s"u" low-surrogate)
+non-surrogate   = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG)
+                / ("D" ODIGIT 2HEXDIG )
+high-surrogate  = "D" ("8"/"9"/"A"/"B") 2HEXDIG
+low-surrogate   = "D" ("C"/"D"/"E"/"F") 2HEXDIG
+hexscalar       = "10" 4HEXDIG / HEXDIG1 4HEXDIG
+                / non-surrogate / 1*3HEXDIG
+
+; Note that no other C0 characters are allowed, including %x09 HT
+unescaped       = %x0A ; new line
+                / %x0D ; carriage return -- ignored on input
+                / %x20-21
+                     ; omit 0x22 "
+                / %x23-26
+                     ; omit 0x27 '
+                / %x28-5B
+                     ; omit 0x5C \
+                / %x5D-D7FF ; skip surrogate code points
+                / %xE000-10FFFF
+
+DQUOTE          = %x22    ; " double quote
+DIGIT           = %x30-39 ; 0-9
+DIGIT1          = %x31-39 ; 1-9
+ODIGIT          = %x30-37 ; 0-7
+BDIGIT          = %x30-31 ; 0-1
+HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
+HEXDIG1         = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F"
+; Note: double-quoted strings as in "A" are case-insensitive in ABNF
+lcalpha         = %x61-7A ; a-z
+lcalnum         = lcalpha / DIGIT
+ucalpha         = %x41-5A ; A-Z
+ucalnum         = ucalpha / DIGIT
+wordchar        = "_" / lcalnum / ucalpha ; [_a-z0-9A-Z]
+
+
+
Figure 1
+
+

While an ABNF grammar defines the set of character strings that are +considered to be valid EDN by this ABNF, the mapping of these +character strings into the generic data model of CBOR is not always +obvious.

+

The following additional items should help in the interpretation:

+
    +
  • +

    decnumber stands for an integer in the usual decimal notation, unless at +least one of the optional parts starting with "." and "e" is +present, in which case it stands for a floating point value in the +usual decimal notation. Note that the grammar now allows 3. for +3.0 and .3 for 0.3 (also for hexadecimal floating point +below); implementers are advised that some platform numeric parsers +accept only a subset of the floating point syntax in this document +and may require some preprocessing to use here.

    +
  • +
  • +

    basenumber stands for an integer in the usual base 16/hexadecimal +("0x"), base 8/octal ("0o"), or base 2/binary ("0b") notation, unless the +optional part containing a "p" is present, in which case it stands +for a floating point number in the usual hexadecimal notation (which +uses a mantissa in hexadecimal and an exponent in decimal notation, +see Section 5.12.3 of [IEEE754], Section 6.4.4.2 of [C], or Section +5.13.4 of [Cplusplus]; floating-suffix/floating-point-suffix from +the latter two is not used here).

    +
  • +
  • +

    spec stands for an encoding indicator.

    +

    +(In the following, an abbreviation of the form ai=nn gives nn as +the numeric value of the field additional information, the low-order 5 +bits of the initial byte: see Section 3 of RFC 8949 [STD94].)

    +

    +As per Section 8.1 of RFC 8949 [STD94]:

    +
      +
    • +

      an underscore _ on its own stands +for indefinite length encoding (ai=31, only available behind the +opening brace/bracket for map and array: strings have a special +syntax streamstring for indefinite length encoding except for the +special cases ''_ and ""_), and

      +
    • +
    • +

      _0 to _3 stand for ai=24 to ai=27, respectively.

      +
    • +
    +

    +Surprisingly, Section 8.1 of RFC 8949 [STD94] does not address ai=0 to +ai=23; the assumption seems to be that preferred serialization +(Section 4.1 of RFC 8949 [STD94]) will be used when converting CBOR +diagnostic notation to an encoded CBOR data item, so leaving out the +encoding indicator for a data item with a preferred serialization +will implicitly use ai=0 to ai=23 if that is possible. +The present specification allows this to be made explicit:

    +
      +
    • +

      _i ("immediate") stands for encoding with ai=0 to ai=23.

      +
    • +
    +

    +While no pressing use for further values for encoding indicators +comes to mind, this is an extension point for EDN; Section 4.2 defines +a registry for additional values.

    +
  • +
  • +

    string and the rules preceding it in the same block realize both +the representation of strings that are split up into multiple chunks +(Section G.4 of RFC 8949 [STD94]) and the use of ellipses to represent elisions +(Section 3.2). The semantic processing of these rules is relatively +complex:

    +
      +
    • +

      A single ... is a general ellipsis, which can stand for any data +item.

      +
    • +
    • +

      An ellipsis can be surrounded (on one or both sides) by string +chunks; the result is a CBOR tag with tag number CPA888 that contains an +array of the joined-together spans of such chunks, with each ellipsis +represented by 888(null).

      +
    • +
    • +

      A simple sequence of string chunks is simply joined together. +In both cases of joining strings, the rules of Section G.4 of RFC 8949 [STD94] need to be followed; in particular, if a text string +results from the joining operation, that result needs to be valid +UTF-8.

      +
    • +
    • +

      Some of the strings may be app-strings. +If the type of the app-string is an actual string, joining of +chunked strings occurs as with directly notated strings; otherwise +the occurrence of more than one app-string or an app-string +together with a directly notated string cannot be processed.

      +
    • +
    +
  • +
+
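The distinction between integer and floating point values drawn above for decnumber and basenumber can be illustrated with a small sketch. It is not part of the grammar: the regular expressions are hand transcriptions of the two rules in Figure 1, the name parse_number is hypothetical, and the sketch assumes that Python's float() and float.fromhex() accept every form the rules allow. The infin rule and encoding indicators are deliberately left out.

```python
import re

# Hand transcriptions of decnumber and basenumber from Figure 1.
# Assumption: re.IGNORECASE mirrors ABNF's case-insensitive literals
# such as "e", "x", "p", and the hex digits "A" to "F".
DEC = re.compile(
    r"[+-]?(?:[0-9]+(\.[0-9]*)?|(\.[0-9]+))(e[+-]?[0-9]+)?", re.IGNORECASE)
HEXFLOAT = re.compile(
    r"[+-]?0x(?:[0-9a-f]+(?:\.[0-9a-f]*)?|\.[0-9a-f]+)p[+-]?[0-9]+",
    re.IGNORECASE)
BASEINT = re.compile(r"[+-]?0(?:x[0-9a-f]+|o[0-7]+|b[01]+)", re.IGNORECASE)


def parse_number(text):
    """Return an int or a float for an EDN number without encoding indicator."""
    m = DEC.fullmatch(text)
    if m:
        # A part starting with "." or "e" turns the value into a float.
        if any(group is not None for group in m.groups()):
            return float(text)
        return int(text)
    if HEXFLOAT.fullmatch(text):
        # The presence of "p" marks a hexadecimal floating point number.
        return float.fromhex(text)
    if BASEINT.fullmatch(text):
        # Base 0 lets int() pick the base from the 0x / 0o / 0b prefix.
        return int(text, 0)
    raise ValueError(f"not a decnumber or basenumber: {text!r}")
```

With this sketch, parse_number("3.") and parse_number("0x1.8p1") both yield the float 3.0, while parse_number("0x1F") yields the integer 31. A platform whose numeric parser rejects forms such as 3. or .3 would need the preprocessing mentioned above.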
+
+
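As a compact summary of the encoding indicators discussed above, the following sketch maps the text of a spec to the set of ai values it selects. The function name ai_values and the use of Python range objects are illustrative choices, not something this document defines; values added later through the registry of Section 4.2 are simply rejected here.

```python
def ai_values(spec):
    """Map an encoding indicator to the additional-information values it selects.

    Returns None when no indicator is given, since the choice is then left
    to the encoder (presumably preferred serialization).
    """
    if spec == "":
        return None
    if spec == "_":
        return range(31, 32)            # indefinite length encoding, ai=31
    if spec in ("_0", "_1", "_2", "_3"):
        ai = 24 + int(spec[1])          # _0 to _3 stand for ai=24 to ai=27
        return range(ai, ai + 1)
    if spec == "_i":
        return range(0, 24)             # "immediate": ai=0 to ai=23
    raise ValueError(f"unknown encoding indicator: {spec!r}")
```

For example, ai_values("_2") contains only 26, and 23 is a member of ai_values("_i").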
+
+

+A.2. ABNF Definitions for app-string Content +

+

This appendix provides ABNF definitions for application-oriented extension +literals defined in [STD94] and in this specification. +These grammars describe the decoded content of the sqstr components that +combine with the application-extension identifiers to form +application-oriented extension literals. +Each of these may make use of rules defined in Figure 1.

+
+
+

+A.2.1. h: ABNF Definition of Hexadecimal representation of a byte string +

+

The syntax of the content of byte strings represented in hex, +such as h'', h'0815', or h'/head/ 63 /contents/ 66 6f 6f' +(another representation of << "foo" >>), is described by the ABNF in Figure 2. +This syntax accommodates both lower case and upper case hex digits, as +well as blank space (including comments) around each hex digit.

+
+
+
+
+app-string-h    = S *(HEXDIG S HEXDIG S / ellipsis S)
+                  ["#" *non-lf]
+ellipsis        = 3*"."
+HEXDIG          = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
+DIGIT           = %x30-39 ; 0-9
+blank           = %x09 / %x0A / %x0D / %x20
+non-slash       = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF
+non-lf          = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF
+S               = *blank *(comment *blank )
+comment         = "/" *non-slash "/"
+                / "#" *non-lf %x0A
+
+
+
Figure 2: +ABNF Definition of Hexadecimal Representation of a Byte String +
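A decoder for the content of an h literal might look like the following sketch. It is an approximation rather than an exact implementation of Figure 2: comments are removed with a regular expression that is slightly more permissive than the non-slash and non-lf rules, an ellipsis is rejected because it cannot be turned into bytes, and the name decode_h is hypothetical.

```python
import re

# "/ ... /" comments and "# ... end of line" comments; the (?:\n|$) part
# also covers a final "#" comment that is not terminated by a line feed.
_COMMENT = re.compile(r"/[^/]*/|#[^\n]*(?:\n|$)")
_BLANK = re.compile(r"[\t\n\r ]+")
_HEX_PAIRS = re.compile(r"(?:[0-9A-Fa-f]{2})*")


def decode_h(content):
    """Turn the (escape-processed) content of an h'' literal into bytes."""
    stripped = _BLANK.sub("", _COMMENT.sub("", content))
    if "..." in stripped:
        raise ValueError("content contains an ellipsis (elided data)")
    if _HEX_PAIRS.fullmatch(stripped) is None:
        raise ValueError("content is not a sequence of hex digit pairs")
    return bytes.fromhex(stripped)
```

Applied to the example above, decode_h("/head/ 63 /contents/ 66 6f 6f") returns the four bytes 63 66 6f 6f, which is the encoding of the text string "foo".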
+
+
+
+
+
+

+A.2.2. b64: ABNF Definition of Base64 representation of a byte string +

+

The syntax of the content of byte strings represented in base64 is +described by the ABNF in Figure 3.

+

This syntax allows both the classic (Section 4 of [RFC4648]) and the +URL-safe (Section 5 of [RFC4648]) alphabet to be used. +It accommodates, but does not require, base64 padding. +Note that inclusion of classic base64 makes it impossible to have +in-line comments in b64, as "/" is valid base64-classic.

+
+
+
+
+app-string-b64  = B *(4(b64dig B))
+                  [b64dig B b64dig B ["=" B "=" / b64dig B ["="]] B]
+                  ["#" *inon-lf]
+b64dig          = ALPHA / DIGIT / "-" / "_" / "+" / "/"
+B               = *iblank *(icomment *iblank)
+iblank          = %x0A / %x20  ; Not HT or CR (gone)
+icomment        = "#" *inon-lf %x0A
+inon-lf         = %x20-D7FF / %xE000-10FFFF
+ALPHA           = %x41-5a / %x61-7a
+DIGIT           = %x30-39
+
+
+
Figure 3: +ABNF definition of Base64 Representation of a Byte String +
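The combination of two alphabets, optional padding, and "#" comments can be handled along the lines of the following sketch. It is deliberately looser than Figure 3 in one respect: any run of trailing "=" characters is dropped instead of checking the exact padding forms the grammar allows. The name decode_b64 is hypothetical.

```python
import base64
import re

# Only "#" comments exist in b64 content; "/" is a base64 digit here.
_COMMENT = re.compile(r"#[^\n]*(?:\n|$)")
_CLASSIC = re.compile(r"[A-Za-z0-9+/]*")


def decode_b64(content):
    """Turn the (escape-processed) content of a b64'' literal into bytes."""
    text = _COMMENT.sub("", content)
    text = text.replace("\n", "").replace(" ", "")
    text = text.rstrip("=")                          # padding is optional
    text = text.translate(str.maketrans("-_", "+/"))  # URL-safe to classic
    if _CLASSIC.fullmatch(text) is None or len(text) % 4 == 1:
        raise ValueError("content is not valid base64")
    # Restore the padding that the standard library decoder expects.
    return base64.b64decode(text + "=" * (-len(text) % 4))
```

Both decode_b64("Zm8") and decode_b64("Zm8=") return the bytes of "fo", and the URL-safe digits in decode_b64("_-8") are treated like their classic counterparts "/" and "+".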
+
+
+
+
+
+

+A.2.3. dt: ABNF Definition of RFC 3339 Representation of a Date/Time +

+

The syntax of the content of dt literals can be described by the +ABNF for date-time from [RFC3339] as summarized in Section 3 of [RFC9165]:

+
+
+
+
+app-string-dt   = date-time
+
+date-fullyear   = 4DIGIT
+date-month      = 2DIGIT  ; 01-12
+date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
+                          ; month/year
+time-hour       = 2DIGIT  ; 00-23
+time-minute     = 2DIGIT  ; 00-59
+time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap sec
+                          ; rules
+time-secfrac    = "." 1*DIGIT
+time-numoffset  = ("+" / "-") time-hour ":" time-minute
+time-offset     = "Z" / time-numoffset
+
+partial-time    = time-hour ":" time-minute ":" time-second
+                  [time-secfrac]
+full-date       = date-fullyear "-" date-month "-" date-mday
+full-time       = partial-time time-offset
+
+date-time       = full-date "T" full-time
+DIGIT           =  %x30-39 ; 0-9
+
+
+
Figure 4: +ABNF Definition of RFC3339 Representation of a Date/Time +
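The conversion that Section 2.1 attaches to this syntax (a floating-point value if fractional seconds are given, an integer otherwise) can be sketched as follows. The regular expression is a hand transcription of date-time from Figure 4, and the name dt_value is hypothetical. The sketch covers the lower-case dt variant only and does not handle a leap second (time-second 60), which Python's datetime rejects.

```python
import re
from datetime import datetime, timedelta, timezone

_DATE_TIME = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})T([0-9]{2}):([0-9]{2}):([0-9]{2})"
    r"(\.[0-9]+)?(Z|[+-][0-9]{2}:[0-9]{2})",
    re.IGNORECASE)  # ABNF literals such as "T" and "Z" are case-insensitive
_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt_value(text):
    """Convert the content of a dt'' literal to an epoch-based number."""
    m = _DATE_TIME.fullmatch(text)
    if m is None:
        raise ValueError(f"not an RFC 3339 date-time: {text!r}")
    fields = [int(g) for g in m.group(1, 2, 3, 4, 5, 6)]
    frac, offset = m.group(7), m.group(8)
    if offset.upper() == "Z":
        shift = timedelta(0)
    else:
        sign = 1 if offset[0] == "+" else -1
        shift = sign * timedelta(hours=int(offset[1:3]),
                                 minutes=int(offset[4:6]))
    # Local time minus its offset gives UTC.
    moment = datetime(*fields, tzinfo=timezone.utc) - shift
    seconds = (moment - _EPOCH) // timedelta(seconds=1)
    # Fractional seconds make the value a float; otherwise it stays an int.
    return seconds + float(frac) if frac else seconds
```

For the examples of Section 2.1, dt_value("1969-07-21T02:56:16Z") gives -14159024 and dt_value("1969-07-21T02:56:16.5Z") gives -14159023.5.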
+
+
+
+
+
+

+A.2.4. ip: ABNF Definition of Textual Representation of an IP Address +

+

The syntax of the content of ip literals can be described by the +ABNF for IPv4address and IPv6address in Section 3.2.2 of [RFC3986], +as included in slightly updated form in Figure 5.

+
+
+
+
+app-string-ip = IPaddress ["/" uint]
+
+IPaddress     = IPv4address
+              / IPv6address
+
+; ABNF from RFC 3986, re-arranged for PEG compatibility:
+
+IPv6address   =                            6( h16 ":" ) ls32
+              /                       "::" 5( h16 ":" ) ls32
+              / [ h16               ] "::" 4( h16 ":" ) ls32
+              / [ h16 *1( ":" h16 ) ] "::" 3( h16 ":" ) ls32
+              / [ h16 *2( ":" h16 ) ] "::" 2( h16 ":" ) ls32
+              / [ h16 *3( ":" h16 ) ] "::"    h16 ":"   ls32
+              / [ h16 *4( ":" h16 ) ] "::"              ls32
+              / [ h16 *5( ":" h16 ) ] "::"              h16
+              / [ h16 *6( ":" h16 ) ] "::"
+
+h16           = 1*4HEXDIG
+ls32          = ( h16 ":" h16 ) / IPv4address
+IPv4address   = dec-octet "." dec-octet "." dec-octet "." dec-octet
+dec-octet     = "25" %x30-35         ; 250-255
+              / "2" %x30-34 DIGIT    ; 200-249
+              / "1" 2DIGIT           ; 100-199
+              / %x31-39 DIGIT        ; 10-99
+              / DIGIT                ; 0-9
+
+HEXDIG        = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
+DIGIT         = %x30-39 ; 0-9
+DIGIT1        = %x31-39 ; 1-9
+uint          = "0" / DIGIT1 *DIGIT
+
+
+
Figure 5: +ABNF Definition of Textual Representation of an IP Address +
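For the lower-case ip variant, the values described in Section 2.2 can be produced with Python's standard ipaddress module, as in the sketch below. This relies on two assumptions: that the module's parser is close enough to the ABNF in Figure 5 (it is somewhat more permissive, for instance accepting a netmask after the slash), and that a prefix with non-zero bits beyond the prefix length should be rejected. The name ip_value is hypothetical, and the tagged upper-case variant is not covered.

```python
import ipaddress


def ip_value(content):
    """Value of a lower-case ip'' literal.

    Returns bytes for an address, or [prefix length, bytes] for a prefix,
    with trailing zero bytes of the prefix removed as in the examples
    of Section 2.2.
    """
    if "/" in content:
        # strict=True raises ValueError if bits beyond the prefix are set.
        network = ipaddress.ip_network(content, strict=True)
        packed = network.network_address.packed
        return [network.prefixlen, packed.rstrip(b"\x00")]
    return ipaddress.ip_address(content).packed
```

With this sketch, ip_value("192.0.2.42") yields the bytes c000022a, and ip_value("2001:db8::/56") yields the length 56 together with the bytes 20010db8.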
+
+
+
+
+
+
+
+
+
+

+Appendix B. EDN and CDDL +

+

EDN was designed as a language to provide a human-readable +representation of an instance, i.e., a single CBOR data item or CBOR +sequence. +CDDL was designed as a language to describe an (often large) set of +such instances (which itself constitutes a language), in the form of a +data definition or grammar (sometimes also called a schema).

+

The two languages share some similarities, not least because they +have mutually inspired each other. +But they have very different roots:

+ +

For engineers who use both EDN and CDDL, it is easy to write +"CDDLisms" or "EDNisms" into their drafts that are meant to be in the +other language. +(This is one more of the many motivations to always validate formal +language instances with tools.)

+

Important differences include:

+ +
+
+
+
+

+Acknowledgements +

+

The concept of application-oriented extensions to diagnostic notation, +as well as the definition for the "dt" extension, were inspired by the +CoRAL work by Klaus Hartke.

+
+
+
+
+

+Author's Address +

+
+
Carsten Bormann
+
Universität Bremen TZI
+
Postfach 330440
+
+D-28359 Bremen +
+
Germany
+
+Phone: ++49-421-218-63921 +
+ +
+
+
+ + + diff --git a/introduce-ai/draft-ietf-cbor-edn-literals.txt b/introduce-ai/draft-ietf-cbor-edn-literals.txt new file mode 100644 index 0000000..743c109 --- /dev/null +++ b/introduce-ai/draft-ietf-cbor-edn-literals.txt @@ -0,0 +1,1624 @@ + + + + +Network Working Group C. Bormann +Internet-Draft Universität Bremen TZI +Intended status: Informational 1 February 2024 +Expires: 4 August 2024 + + +CBOR Extended Diagnostic Notation (EDN): Application-Oriented Literals, + ABNF, and Media Type + draft-ietf-cbor-edn-literals-latest + +Abstract + + The Concise Binary Object Representation, CBOR (STD 94, RFC 8949), + defines a "diagnostic notation" in order to be able to converse about + CBOR data items without having to resort to binary data. + + This document specifies how to add application-oriented extensions to + the diagnostic notation. It then defines two such extensions for + text representations of epoch-based date/times and of IP addresses + and prefixes (RFC 9164). + + A few further additions close some gaps in usability. To facilitate + tool interoperation, this document specifies a formal ABNF definition + for extended diagnostic notation (EDN) that accommodates application- + oriented literals. + +About This Document + + This note is to be removed before publishing as an RFC. + + The latest revision of this draft can be found at https://cbor- + wg.github.io/edn-literal/. Status information for this document may + be found at https://datatracker.ietf.org/doc/draft-ietf-cbor-edn- + literals/. + + Discussion of this document takes place on the cbor Working Group + mailing list (mailto:cbor@ietf.org), which is archived at + https://mailarchive.ietf.org/arch/browse/cbor/. Subscribe at + https://www.ietf.org/mailman/listinfo/cbor/. + + Source for this draft and an issue tracker can be found at + https://github.com/cbor-wg/edn-literal. + +Status of This Memo + + This Internet-Draft is submitted in full conformance with the + provisions of BCP 78 and BCP 79. 
+ + + + +Bormann Expires 4 August 2024 [Page 1] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF). Note that other groups may also distribute + working documents as Internet-Drafts. The list of current Internet- + Drafts is at https://datatracker.ietf.org/drafts/current/. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + This Internet-Draft will expire on 4 August 2024. + +Copyright Notice + + Copyright (c) 2024 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents (https://trustee.ietf.org/ + license-info) in effect on the date of publication of this document. + Please review these documents carefully, as they describe your rights + and restrictions with respect to this document. Code Components + extracted from this document must include Revised BSD License text as + described in Section 4.e of the Trust Legal Provisions and are + provided without warranty as described in the Revised BSD License. + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 + 1.1. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4 + 1.2. (Non-)Objectives of this Document . . . . . . . . . . . . 4 + 2. Application-Oriented Extension Literals . . . . . . . . . . . 6 + 2.1. The "dt" Extension . . . . . . . . . . . . . . . . . . . 7 + 2.2. The "ip" Extension . . . . . . . . . . . . . . . . . . . 7 + 3. Stand-in Representations in Binary CBOR . . . . . . . . . . . 8 + 3.1. Handling unknown application-extension identifiers . . . 9 + 3.2. 
Handling information deliberately elided from an EDN + document . . . . . . . . . . . . . . . . . . . . . . . . 9 + 4. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 11 + 4.1. CBOR Diagnostic Notation Application-extension Identifiers + Registry . . . . . . . . . . . . . . . . . . . . . . . . 11 + 4.2. Encoding Indicators . . . . . . . . . . . . . . . . . . . 13 + 4.3. Media Type . . . . . . . . . . . . . . . . . . . . . . . 14 + 4.4. Content-Format . . . . . . . . . . . . . . . . . . . . . 15 + 4.5. Stand-in Tags . . . . . . . . . . . . . . . . . . . . . . 16 + 5. Security considerations . . . . . . . . . . . . . . . . . . . 16 + 6. References . . . . . . . . . . . . . . . . . . . . . . . . . 16 + 6.1. Normative References . . . . . . . . . . . . . . . . . . 16 + + + +Bormann Expires 4 August 2024 [Page 2] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + 6.2. Informative References . . . . . . . . . . . . . . . . . 19 + Appendix A. ABNF Definitions . . . . . . . . . . . . . . . . . . 19 + A.1. Overall ABNF Definition for Extended Diagnostic + Notation . . . . . . . . . . . . . . . . . . . . . . . . 20 + A.2. ABNF Definitions for app-string Content . . . . . . . . . 24 + A.2.1. h: ABNF Definition of Hexadecimal representation of a + byte string . . . . . . . . . . . . . . . . . . . . . 24 + A.2.2. b64: ABNF Definition of Base64 representation of a byte + string . . . . . . . . . . . . . . . . . . . . . . . 25 + A.2.3. dt: ABNF Definition of RFC 3339 Representation of a + Date/Time . . . . . . . . . . . . . . . . . . . . . . 25 + A.2.4. ip: ABNF Definition of Textual Representation of an IP + Address . . . . . . . . . . . . . . . . . . . . . . . 26 + Appendix B. EDN and CDDL . . . . . . . . . . . . . . . . . . . . 27 + Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 29 + Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 29 + +1. 
Introduction + + For the Concise Binary Object Representation, CBOR, Section 8 of RFC + 8949 [STD94] in conjunction with Appendix G of [RFC8610] defines a + "diagnostic notation" in order to be able to converse about CBOR data + items without having to resort to binary data. Diagnostic notation + syntax is based on JSON, with extensions for representing CBOR + constructs such as binary data and tags. (Standardizing this + together with the actual interchange format does not serve to create + another interchange format, but enables the use of a shared + diagnostic notation in tools for and in documents about CBOR.) + + This document specifies how to add application-oriented extensions to + the diagnostic notation. It then defines two such extensions for + text representations of epoch-based date/times and of IP addresses + and prefixes [RFC9164]. + + A few further additions close some gaps in usability. To facilitate + tool interoperation, this document specifies a formal ABNF definition + for extended diagnostic notation (EDN) that accommodates application- + oriented literals. (See Appendix A.1 for an overall ABNF grammar as + well as the ABNF definitions in Appendix A.2 for grammars for both + the byte string presentations predefined in [STD94] and the + application-extensions). + + In addition, this document finally registers a media type identifier + and a content-format for CBOR diagnostic notation. This does not + elevate its status as an interchange format, but recognizes that + interaction between tools is often smoother if media types can be + used. + + + + +Bormann Expires 4 August 2024 [Page 3] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + +1.1. Terminology + + Section 8 of RFC 8949 [STD94] defines the original CBOR diagnostic + notation, and Appendix G of [RFC8610] supplies a number of extensions + to the diagnostic notation that result in the Extended Diagnostic + Notation (EDN). 
The diagnostic notation extensions include popular + features such as embedded CBOR (encoded CBOR data items in byte + strings) and comments. A simple diagnostic notation extension that + enables representing CBOR sequences was added in Section 4.2 of + [RFC8742]. As diagnostic notation is not used in the kind of + interchange situations where backward compatibility would pose a + significant obstacle, there is little point in not using these + extensions. + + Therefore, when we refer to "_diagnostic notation_", we mean to + include the original notation from Section 8 of RFC 8949 [STD94] as + well as the extensions from Appendix G of [RFC8610], Section 4.2 of + [RFC8742], and the present document. However, we stick to the + abbreviation "_EDN_" as it has become quite popular and is more + sharply distinguishable from other meanings than "DN" would be. + + In a similar vein, the term "ABNF" in this document refers to the + language defined in [STD68] as extended in [RFC7405], where the + "characters" of Section 2.3 of RFC 5234 [STD68] are Unicode scalar + values. The term "CDDL" refers to the data definition language + defined in [RFC8610] and its registered extensions (such as those in + [RFC9165]), as well as [I-D.ietf-cbor-update-8610-grammar]. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +1.2. (Non-)Objectives of this Document + + Section 8 of RFC 8949 [STD94] states the objective of defining a + human-readable diagnostic notation with CBOR. In particular, it + states: + + | All actual interchange always happens in the binary format. 
+ + + + + + + + + + +Bormann Expires 4 August 2024 [Page 4] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + One important application of EDN is the notation of CBOR data for + humans: in specifications, on whiteboards, and for entering test + data. A number of features, such as comments in string literals, are + mainly useful for people-to-people communication via EDN. Programs + also often output EDN for diagnostic purposes, such as in error + messages or to enable comparison (including generation of diffs via + tools) with test data. + + For comparison with test data, it is often useful if different + implementations generate the same (or similar) output for the same + CBOR data items. This is comparable to the objectives of + deterministic serialization for CBOR data items themselves + (Section 4.2 of RFC 8949 [STD94]). However, there are even more + representation variants in EDN than in binary CBOR, and there is + little point in specifically endorsing a single variant as + "deterministic" when other variants may be more useful for human + understanding, e.g., the << >> notation as opposed to h''; an EDN + generator may have quite a few options that control what presentation + variant is most desirable for the application that it is being used + for. + + Because of this, a deterministic representation is not defined for + EDN, and there is no expectation for "roundtripping" from EDN to CBOR + and back, i.e., for an ability to convert EDN to binary CBOR and back + to EDN while achieving exactly the same result as the original input + EDN — the original EDN possibly was created by humans or by a + different EDN generator. 
+ + However, there is a certain expectation that EDN generators can be + configured to some basic output format, which: + + * looks like JSON where that is possible; + + * inserts encoding indicators only where the binary form differs + from preferred encoding; + + * uses hexadecimal representation (h'') for byte strings, not b64'' + or embedded CBOR (<<>>); + + * does not generate elaborate blank space (newlines, indentation) + for pretty-printing, but does use common blank spaces such as + after , and :. + + + + + + + + + +Bormann Expires 4 August 2024 [Page 5] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + Additional features such as ensuring deterministic map ordering + (Section 4.2 of RFC 8949 [STD94]) on output, or even deviating from + the basic configuration in some systematic way, can further assist in + comparing test data. Information obtained from a CDDL model can help + in choosing application-oriented literals or specific string + representations such as embedded CBOR or b64'' in the appropriate + places. + +2. Application-Oriented Extension Literals + + This document extends the syntax used in diagnostic notation for byte + string literals to also be available for application-oriented + extensions. + + As per Section 8 of RFC 8949 [STD94], the diagnostic notation can + notate byte strings in a number of [RFC4648] base encodings, where + the encoded text is enclosed in single quotes, prefixed by an + identifier (»h« for base16, »b32« for base32, »h32« for base32hex, + »b64« for base64 or base64url). + + This syntax can be thought to establish a name space, with the names + "h", "b32", "h32", and "b64" taken, but other names being + unallocated. The present specification defines additional names for + this namespace, which we call _application-extension identifiers_. + For the quoted string, the same rules apply as for byte strings. 
In + particular, the escaping rules that were adapted from JSON strings + are applied equivalently for application-oriented extensions, e.g., + within the quoted string \\ stands for a single backslash and \' + stands for a single quote. + + An application-extension identifier is a name consisting of a lower- + case ASCII letter (a-z) and zero or more additional ASCII characters + that are either lower-case letters or digits (a-z0-9). + + Application-extension identifiers are registered in a registry + (Section 4.1). + + Prefixing a single-quoted string, an application-extension identifier + is used to build an application-oriented extension literal, which + stands for a CBOR data item the value of which is derived from the + text given in the single-quoted string using a procedure defined in + the specification for an application-extension identifier. + + An application-extension (such as dt) MAY also define the meaning of + a variant of the application-extension identifier where each lower- + case character is replaced by its upper-case counterpart (such as + DT), for building an application-oriented extension literal using + that all-uppercase variant as the prefix of a single-quoted string. + + + +Bormann Expires 4 August 2024 [Page 6] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + As a convention for such definitions, using the all-uppercase variant + implies making use of a tag appropriate for this application-oriented + extension (such as tag number 1 for DT). + + Examples for application-oriented extensions to CBOR diagnostic + notation can be found in the following sections. + +2.1. The "dt" Extension + + The application-extension identifier "dt" is used to notate a date/ + time literal that can be used as an Epoch-Based Date/Time as per + Section 3.4.2 of RFC 8949 [STD94]. + + The text of the literal is a Standard Date/Time String as per + Section 3.4.1 of RFC 8949 [STD94]. 
+ + The value of the literal is a number representing the result of a + conversion of the given Standard Date/Time String to an Epoch-Based + Date/Time. If fractional seconds are given in the text (production + time-secfrac in Figure 4), the value is a floating-point number; the + value is an integer number otherwise. In the all-upper-case variant + of the app-prefix, the value is enclosed in a tag number 1. + + As an example, the CBOR diagnostic notation + + dt'1969-07-21T02:56:16Z', + dt'1969-07-21T02:56:16.5Z', + DT'1969-07-21T02:56:16Z' + + is equivalent to + + -14159024, + -14159023.5, + 1(-14159024) + + See Appendix A.2.3 for an ABNF definition for the content of dt + literals. + +2.2. The "ip" Extension + + The application-extension identifier "ip" is used to notate an IP + address literal that can be used as an IP address as per Section 3 of + [RFC9164]. + + The text of the literal is an IPv4address or IPv6address as per + Section 3.2.2 of [RFC3986]. + + + + + +Bormann Expires 4 August 2024 [Page 7] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + With the lower-case app-string ip, the value of the literal is a byte + string representing the binary IP address. With the upper-case app- + string IP, the literal is such a byte string tagged with tag number + 54, if an IPv6address is used, or tag number 52, if an IPv4address is + used. + + As an additional case, the upper-case app-string IP'' can be used + with a prefix such as 2001:db8::/56 or 192.0.2.0/24, with the + equivalent tag as its value. (Note that [RFC9164] representations of + address prefixes need to implement the truncation of the address byte + string as described in Section 4.2 of [RFC9164]; see example below.) + For completeness, the lower-case variant ip'2001:db8::/56' or + ip'192.0.2.0/24' stands for an unwrapped [56,h'20010db8'] or + [24,h'c00002']; however, in this case the information on whether an + address is IPv4 or IPv6 often needs to come from the context. 
+ + Note that there is no direct representation of an address combined + with a prefix length; this can be represented as + 52([ip'192.0.2.42',24]), if needed. + + Examples: the CBOR diagnostic notation + + ip'192.0.2.42', + IP'192.0.2.42', + IP'192.0.2.0/24', + ip'2001:db8::42', + IP'2001:db8::42', + IP'2001:db8::/64' + + is equivalent to + + h'c000022a', + 52(h'c000022a'), + 52([24,h'c00002']), + h'20010db8000000000000000000000042', + 54(h'20010db8000000000000000000000042'), + 54([64,h'20010db8']) + + See Appendix A.2.4 for an ABNF definition for the content of ip + literals. + +3. Stand-in Representations in Binary CBOR + + In some cases, an EDN consumer cannot construct actual CBOR items + that represent the CBOR data intended for eventual interchange. This + document defines stand-in representation for two such cases: + + + + + +Bormann Expires 4 August 2024 [Page 8] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + * The EDN consumer does not know (or does not implement) an + application-extension identifier used in the EDN document + (Section 3.1) but wants to preserve the information for a later + processor. + + * The generator of some EDN intended for human consumption (such as + in a specification document) may not want to include parts of the + final data item, destructively replacing complete subtrees or + possibly just parts of a lengthy string by _elisions_ + (Section 3.2). + +3.1. Handling unknown application-extension identifiers + + When ingesting CBOR diagnostic notation, any application-oriented + extension literals are usually decoded and transformed into the + corresponding data item during ingestion. If an application- + extension is not known or not implemented by the ingesting process, + this is usually an error and processing has to stop. 
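The remainder of this section defines a stand-in tag for the exceptional case in which an unknown literal is to be preserved. As a non-normative sketch, both behaviours can be expressed in Python as follows. The names Tag, ingest_app_literal, handlers, and keep_unknown are hypothetical, and the number 999 is only the suggested value behind the CPA999 placeholder, not an assigned tag number.

```python
from collections import namedtuple

# Minimal stand-in for a CBOR tag (hypothetical helper type for this sketch).
Tag = namedtuple("Tag", ["number", "content"])

UNRESOLVED_TAG = 999  # stands for CPA999; the final number is up to IANA


def ingest_app_literal(identifier, content, handlers, keep_unknown=False):
    """Resolve an application-oriented extension literal during ingestion.

    identifier: the application-extension identifier, e.g. "dt"
    content:    the escape-processed content of the single-quoted string
    handlers:   dict mapping known identifiers to conversion functions
    """
    handler = handlers.get(identifier)
    if handler is not None:
        return handler(content)
    if keep_unknown:
        # Carry the literal uninterpreted for a later processing stage.
        return Tag(UNRESOLVED_TAG, [identifier, content])
    # Default behaviour: an unknown identifier is an error.
    raise ValueError(f"unknown application-extension identifier: {identifier!r}")
```

A consumer that does not implement dt could then call ingest_app_literal("dt", "1969-07-21T02:56:16Z", {}, keep_unknown=True) and pass the resulting stand-in on to a later processor.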
+ + However, in certain cases it can be desirable to carry an + uninterpreted application-oriented extension literal in an ingested + data item as an exception, so that its decoding can be postponed to a + specific later stage of ingestion. + + This specification defines a CBOR Tag for this purpose: the + Diagnostic Notation Unresolved Application-Extension Tag, tag number + CPA999 (Section 4.5). The content of this tag is an array of two + text strings: the application-extension identifier and the + (escape-processed) content of the single-quoted string. For example, + dt'1969-07-21T02:56:16Z' can be provisionally represented as + /CPA/999(["dt", "1969-07-21T02:56:16Z"]). + + + // RFC-Editor: This document uses the CPA (code point allocation) + // convention described in [I-D.bormann-cbor-draft-numbers]. For + // each usage of the term "CPA", please remove the prefix "CPA" from + // the indicated value and replace the residue with the value + // assigned by IANA; perform an analogous substitution for all other + // occurrences of the prefix "CPA" in the document. Finally, please + // remove this note. + +3.2. Handling information deliberately elided from an EDN document + + When using EDN for exposition in a document or on a whiteboard, it is + often useful to be able to leave out parts of an EDN document that + are not of interest at that point of the exposition. + + To facilitate this, this specification supports the use of an + _ellipsis_ (notated as three or more dots in a row, as in ...) to + indicate parts of an EDN document that have been elided (and + therefore cannot be reconstructed). + + Upon ingesting EDN as a representation of a CBOR data item for + further processing, the occurrence of an ellipsis usually is an error + and processing has to stop. 
+ + However, it is useful to be able to process EDN documents with + ellipses in the automation scripts for the documents using them. + This specification defines a CBOR Tag that can be used in the + ingestion for this purpose: The Diagnostic Notation Ellipsis Tag, tag + number CPA888 (Section 4.5). The content of this tag either is + + 1. null (indicating a data item entirely replaced by an ellipsis), + or it is + + 2. an array, the elements of which are alternating between fragments + of a string and the actual elisions, represented as ellipses + carrying a null as content. + + Elisions can stand in for entire subtrees, e.g. in: + + [1, 2, ..., 3] + , + { "a": 1, + "b": ..., + ...: ... + } + + A single ellipsis (or key/value pair of ellipses) can imply eliding + multiple elements in an array (members in a map); if more detailed + control is required, a data definition language such as CDDL can be + employed. (Note that the stand-in form defined here does not allow + multiple key/value pairs with an ellipsis as a key: the CBOR data + item would not be valid.) + + Subtree elisions can be represented in a CBOR data item by using + /CPA/888(null) as the stand-in: + + [1, 2, 888(null), 3] + , + { "a": 1, + "b": 888(null), + 888(null): 888(null) + } + + + + +Bormann Expires 4 August 2024 [Page 10] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + Elisions also can be used as part of a (text or byte) string: + + { "contract": "Herewith I buy" ... "gned: Alice & Bob", + "signature": h'4711...0815', + } + + The example "contract" uses string concatenation as per Appendix G.4 + of [RFC8610], extending that by allowing ellipses; while the example + "signature" uses special syntax that allows the use of ellipses + between the bytes notated _inside_ h'' literals. 
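The two forms of tag content described in this section (null for a fully elided item, or an array alternating string fragments and ellipses) can be sketched in Python as follows. This is a non-normative illustration: the names Tag and elision_stand_in are hypothetical, Python's built-in Ellipsis object (...) is used to mark an elision in the input, 888 is only the suggested value behind the CPA888 placeholder, and collapsing directly adjacent ellipses into one is an assumption of this sketch.

```python
from collections import namedtuple

# Minimal stand-in for a CBOR tag (hypothetical helper type for this sketch).
Tag = namedtuple("Tag", ["number", "content"])

ELLIPSIS_TAG = 888  # stands for CPA888; the final number is up to IANA


def elision_stand_in(chunks):
    """Build the stand-in for a sequence of string chunks and elisions.

    chunks is a list of str or bytes values, with ... marking an elision.
    """
    strings = [c for c in chunks if c is not ...]
    if not chunks or any(type(s) not in (str, bytes) for s in strings):
        raise TypeError("chunks must be str, bytes, or ...")
    if len({type(s) for s in strings}) > 1:
        raise TypeError("chunks must be either all text or all bytes")
    marker = Tag(ELLIPSIS_TAG, None)
    out = []
    for chunk in chunks:
        if chunk is ...:
            # Assumption of this sketch: adjacent ellipses collapse into one.
            if not out or out[-1] != marker:
                out.append(marker)
        elif out and not isinstance(out[-1], Tag):
            out[-1] = out[-1] + chunk  # join neighbouring string chunks
        else:
            out.append(chunk)
    if not strings:
        return marker      # the whole data item is elided
    if len(out) == 1:
        return out[0]      # no elision at all: just the joined string
    return Tag(ELLIPSIS_TAG, out)
```

Calling elision_stand_in(["Herewith I buy", ..., "gned: Alice & Bob"]) produces a tag whose content is the first fragment, an ellipsis stand-in carrying null, and the last fragment, in that order.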
+ + String elisions can be represented in a CBOR data item by a stand-in + that wraps an array of string fragments alternating with ellipsis + indicators: + + { "contract": /CPA/888(["Herewith I buy", 888(null), + "gned: Alice & Bob"]), + "signature": 888([h'4711', 888(null), h'0815']), + } + + Note that the use of elisions is different from "commenting out" EDN + text, e.g. + + { "contract": "Herewith I buy" /.../ "gned: Alice & Bob", + "signature": h'4711/.../0815', + # ...: ... + } + + The consumer of this EDN will ignore the comments and therefore will + have no idea after ingestion that some information has been elided; + validation steps may then simply fail instead of being informed about + the elisions. + +4. IANA Considerations + + + // RFC Editor: please replace RFC-XXXX with the RFC number of this + // RFC, [IANA.cbor-diagnostic-notation] with a reference to the new + // registry group, and remove this note. + +4.1. CBOR Diagnostic Notation Application-extension Identifiers + Registry + + IANA is requested to create an "Application-Extension Identifiers" + registry in a new "CBOR Diagnostic Notation" registry group + [IANA.cbor-diagnostic-notation], with the policy "expert review" + (Section 4.5 of RFC 8126 [BCP26]). + + + + +Bormann Expires 4 August 2024 [Page 11] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + The experts are instructed to be frugal in the allocation of + application-extension identifiers that are suggestive of generally + applicable semantics, keeping them in reserve for application- + extensions that are likely to enjoy wide use and can make good use of + their conciseness. The expert is also instructed to direct the + registrant to provide a specification (Section 4.6 of RFC 8126 + [BCP26]), but can make exceptions, for instance when a specification + is not available at the time of registration but is likely + forthcoming. 
If the expert becomes aware of application-extension + identifiers that are deployed and in use, they may also initiate a + registration on their own if they deem such a registration can avert + potential future collisions. + + Each entry in the registry must include: + + Application-Extension Identifier: + a lower case ASCII [STD80] string that starts with a letter and + can contain letters and digits after that ([a-z][a-z0-9]*). No + other entry in the registry can have the same application- + extension identifier. + + Description: + a brief description + + Change Controller: + (see Section 2.3 of RFC 8126 [BCP26]) + + Reference: + a reference document that provides a description of the + application-extension identifier + + The initial content of the registry is shown in Table 1; all initial + entries have the Change Controller "IETF". + + + + + + + + + + + + + + + + + + +Bormann Expires 4 August 2024 [Page 12] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + +==================================+===================+===========+ + | Application-extension Identifier | Description | Reference | + +==================================+===================+===========+ + | h | Reserved | RFC8949 | + +----------------------------------+-------------------+-----------+ + | b32 | Reserved | RFC8949 | + +----------------------------------+-------------------+-----------+ + | h32 | Reserved | RFC8949 | + +----------------------------------+-------------------+-----------+ + | b64 | Reserved | RFC8949 | + +----------------------------------+-------------------+-----------+ + | dt | Date/Time | RFC-XXXX | + +----------------------------------+-------------------+-----------+ + | ip | IP Address/Prefix | RFC-XXXX | + +----------------------------------+-------------------+-----------+ + + Table 1: Initial Content of Application-extension Identifier + Registry + +4.2. 
Encoding Indicators + + IANA is requested to create an "Encoding Indicators" registry in the + newly created "CBOR Diagnostic Notation" registry group + [IANA.cbor-diagnostic-notation], with the policy "specification + required" (Section 4.6 of RFC 8126 [BCP26]). + + The experts are instructed to be frugal in the allocation of encoding + indicators that are suggestive of generally applicable semantics, + keeping them in reserve for encoding indicator registrations that are + likely to enjoy wide use and can make good use of their conciseness. + If the expert becomes aware of encoding indicators that are deployed + and in use, they may also solicit a specification and initiate a + registration on their own if they deem such a registration can avert + potential future collisions. + + Each entry in the registry must include: + + Encoding Indicator: + an ASCII [STD80] string that starts with an underscore and can + contain zero or more underscores, letters and digits after that + (_[_A-Za-z0-9]*). No other entry in the registry can have the + same Encoding Indicator. + + Description: + a brief description. This description may employ an abbreviation + of the form ai=nn, where nn is the numeric value of the field + _additional information_, the low-order 5 bits of the initial byte + (see Section 3 of RFC 8949 [STD94]). + + Change Controller: + (see Section 2.3 of RFC 8126 [BCP26]) + + Reference: + a reference document that provides a description of the + encoding indicator + + The initial content of the registry is shown in Table 2; all initial + entries have the Change Controller "IETF". 
+ + +====================+===================+===========+ + | Encoding Indicator | Description | Reference | + +====================+===================+===========+ + | _ | Indefinite Length | RFC8949, | + | | Encoding (ai=31) | RFC-XXXX | + +--------------------+-------------------+-----------+ + | _i | ai=0 to ai=23 | RFC-XXXX | + +--------------------+-------------------+-----------+ + | _0 | ai=24 | RFC8949, | + | | | RFC-XXXX | + +--------------------+-------------------+-----------+ + | _1 | ai=25 | RFC8949, | + | | | RFC-XXXX | + +--------------------+-------------------+-----------+ + | _2 | ai=26 | RFC8949, | + | | | RFC-XXXX | + +--------------------+-------------------+-----------+ + | _3 | ai=27 | RFC8949, | + | | | RFC-XXXX | + +--------------------+-------------------+-----------+ + + Table 2: Initial Content of Encoding Indicator + Registry + +4.3. Media Type + + IANA is requested to add the following Media-Type to the "Media + Types" registry [IANA.media-types]. + + +=================+=============================+=============+ + | Name | Template | Reference | + +=================+=============================+=============+ + | cbor-diagnostic | application/cbor-diagnostic | RFC-XXXX, | + | | | Section 4.3 | + +-----------------+-----------------------------+-------------+ + + Table 3: New Media Type application/cbor-diagnostic + + + + +Bormann Expires 4 August 2024 [Page 14] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + Type name: application + Subtype name: cbor-diagnostic + Required parameters: N/A + Optional parameters: N/A + Encoding considerations: binary (UTF-8) + Security considerations: Section 5 of RFC XXXX + Interoperability considerations: none + Published specification: Section 4.3 of RFC XXXX + Applications that use this media type: Tools interchanging a human- + readable form of CBOR + Fragment identifier considerations: The syntax and semantics of + fragment identifiers is as specified for "application/cbor". 
(At + publication of RFC XXXX, there is no fragment identification + syntax defined for "application/cbor".) + Additional information: + Deprecated alias names for this type: N/A + + Magic number(s): N/A + + File extension(s): .diag + + Macintosh file type code(s): N/A + Person & email address to contact for further information: CBOR WG + mailing list (cbor@ietf.org), or IETF Applications and Real-Time + Area (art@ietf.org) + Intended usage: LIMITED USE + Restrictions on usage: CBOR diagnostic notation represents CBOR data + items, which are the format intended for actual interchange. The + media type application/cbor-diagnostic is intended to be used + within documents about CBOR data items, in diagnostics for human + consumption, and in other representations of CBOR data items that + are necessarily text-based such as in configuration files or other + data edited by humans, often under source-code control. + Author/Change controller: IETF + Provisional registration: no + +4.4. Content-Format + + IANA is requested to register a Content-Format number in the "CoAP + Content-Formats" sub-registry, within the "Constrained RESTful + Environments (CoRE) Parameters" Registry [IANA.core-parameters], as + follows: + + + + + + + + + +Bormann Expires 4 August 2024 [Page 15] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + +=============================+================+======+===========+ + | Content-Type | Content Coding | ID | Reference | + +=============================+================+======+===========+ + | application/cbor-diagnostic | - | TBD1 | RFC-XXXX | + +-----------------------------+----------------+------+-----------+ + + Table 4: New Content-Format + + TBD1 is to be assigned from the space 256..999. + +4.5. Stand-in Tags + + + // RFC-Editor: This document uses the CPA (code point allocation) + // convention described in [I-D.bormann-cbor-draft-numbers]. 
For + // each usage of the term "CPA", please remove the prefix "CPA" from + // the indicated value and replace the residue with the value + // assigned by IANA; perform an analogous substitution for all other + // occurrences of the prefix "CPA" in the document. Finally, please + // remove this note. + + In the "CBOR Tags" registry [IANA.cbor-tags], IANA is requested to + assign the tags in Table 5 from the "specification required" space + (suggested assignments: 888 and 999), with the present document as + the specification reference. + + +========+===========+==================================+===========+ + | Tag | Data | Semantics | Reference | + | | Item | | | + +========+===========+==================================+===========+ + | CPA888 | null or | Diagnostic Notation Ellipsis | RFC-XXXX | + | | array | | | + +--------+-----------+----------------------------------+-----------+ + | CPA999 | array | Diagnostic Notation | RFC-XXXX | + | | | Unresolved Application-Extension | | + +--------+-----------+----------------------------------+-----------+ + + Table 5: Values for Tags + +5. Security considerations + + The security considerations of [STD94] and [RFC8610] apply. + +6. References + +6.1. Normative References + + + + + +Bormann Expires 4 August 2024 [Page 16] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + [BCP26] Best Current Practice 26, + . + At the time of writing, this BCP comprises the following: + + Cotton, M., Leiba, B., and T. Narten, "Guidelines for + Writing an IANA Considerations Section in RFCs", BCP 26, + RFC 8126, DOI 10.17487/RFC8126, June 2017, + . + + [C] International Organization for Standardization, + "Information technology — Programming languages — C", + Fourth Edition, ISO/IEC 9899:2018, June 2018, + . 
The text of + the standard is also available via + https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf + + [Cplusplus] + International Organization for Standardization, + "Programming languages — C++", Sixth Edition, ISO/ + IEC 14882:2020, December 2020, + . The text of + the standard is also available via + https://isocpp.org/files/papers/N4860.pdf + + [IANA.cbor-tags] + IANA, "Concise Binary Object Representation (CBOR) Tags", + . + + [IANA.core-parameters] + IANA, "Constrained RESTful Environments (CoRE) + Parameters", + . + + [IANA.media-types] + IANA, "Media Types", + . + + [IEEE754] IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE + Std 754-2019, DOI 10.1109/IEEESTD.2019.8766229, + . + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + . + + + + + + +Bormann Expires 4 August 2024 [Page 17] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + [RFC3339] Klyne, G. and C. Newman, "Date and Time on the Internet: + Timestamps", RFC 3339, DOI 10.17487/RFC3339, July 2002, + . + + [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, DOI 10.17487/RFC3986, January 2005, + . + + [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", + RFC 7405, DOI 10.17487/RFC7405, December 2014, + . + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, . + + [RFC8610] Birkholz, H., Vigano, C., and C. Bormann, "Concise Data + Definition Language (CDDL): A Notational Convention to + Express Concise Binary Object Representation (CBOR) and + JSON Data Structures", RFC 8610, DOI 10.17487/RFC8610, + June 2019, . + + [RFC8742] Bormann, C., "Concise Binary Object Representation (CBOR) + Sequences", RFC 8742, DOI 10.17487/RFC8742, February 2020, + . + + [RFC9164] Richardson, M. and C. 
Bormann, "Concise Binary Object + Representation (CBOR) Tags for IPv4 and IPv6 Addresses and + Prefixes", RFC 9164, DOI 10.17487/RFC9164, December 2021, + . + + [STD68] Internet Standard 68, + . + At the time of writing, this STD comprises the following: + + Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax + Specifications: ABNF", STD 68, RFC 5234, + DOI 10.17487/RFC5234, January 2008, + . + + [STD80] Internet Standard 80, + . + At the time of writing, this STD comprises the following: + + Cerf, V., "ASCII format for network interchange", STD 80, + RFC 20, DOI 10.17487/RFC0020, October 1969, + . + + + +Bormann Expires 4 August 2024 [Page 18] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + [STD94] Internet Standard 94, + . + At the time of writing, this STD comprises the following: + + Bormann, C. and P. Hoffman, "Concise Binary Object + Representation (CBOR)", STD 94, RFC 8949, + DOI 10.17487/RFC8949, December 2020, + . + +6.2. Informative References + + [I-D.ietf-cbor-update-8610-grammar] + Bormann, C., "Updates to the CDDL grammar of RFC 8610", + Work in Progress, Internet-Draft, draft-ietf-cbor-update- + 8610-grammar-04, 2 March 2024, + . + + [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data + Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006, + . + + [RFC9165] Bormann, C., "Additional Control Operators for the Concise + Data Definition Language (CDDL)", RFC 9165, + DOI 10.17487/RFC9165, December 2021, + . + + [STD90] Internet Standard 90, + . + At the time of writing, this STD comprises the following: + + Bray, T., Ed., "The JavaScript Object Notation (JSON) Data + Interchange Format", STD 90, RFC 8259, + DOI 10.17487/RFC8259, December 2017, + . + +Appendix A. ABNF Definitions + + This appendix collects grammars in ABNF form ([STD68] as extended in + [RFC7405]) that serve to define the syntax of EDN and some + application-oriented literals. 
+ + Implementation note: The ABNF definitions in this appendix are + intended to be useful in a PEG parser interpretation (see Appendix A + of [RFC8610] for an introduction into PEG). + + + + + + +Bormann Expires 4 August 2024 [Page 19] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + +A.1. Overall ABNF Definition for Extended Diagnostic Notation + + This appendix provides an overall ABNF definition for the syntax of + CBOR extended diagnostic notation. + + To complete the parsing of an app-string with prefix, say, p, the + processed sqstr inside it is further parsed using the ABNF definition + specified for the production app-string-p in Appendix A.2. + + For simplicity, the internal parsing for the built-in EDN prefixes is + specified in the same way. ABNF definitions for h'' and b64'' are + provided in Appendix A.2.1 and Appendix A.2.2. However, the prefixes + b32'' and h32'' are not in wide use and an ABNF definition in this + document could therefore not be based on implementation experience. + + seq = S [item S *("," S item S) OC] S + one-item = S item S + item = map / array / tagged + / number / simple + / string / streamstring + + string1 = (tstr / bstr) spec + string1e = string1 / ellipsis + ellipsis = 3*"." ; "..." or more dots + string = string1e *(S string1e) + + number = (basenumber / decnumber / infin) spec + sign = "+" / "-" + decnumber = [sign] (1*DIGIT ["." *DIGIT] / "." 1*DIGIT) + ["e" [sign] 1*DIGIT] + basenumber = [sign] "0" ("x" 1*HEXDIG + [["." *HEXDIG] "p" [sign] 1*DIGIT] + / "x" "." 
1*HEXDIG "p" [sign] 1*DIGIT + / "o" 1*ODIGIT + / "b" 1*BDIGIT) + infin = %s"Infinity" + / %s"-Infinity" + / %s"NaN" + simple = %s"false" + / %s"true" + / %s"null" + / %s"undefined" + / %s"simple(" S item S ")" + uint = "0" / DIGIT1 *DIGIT + tagged = uint spec "(" S item S ")" + + app-prefix = lcalpha *lcalnum ; including h and b64 + / ucalpha *ucalnum ; tagged variant, if defined + + + +Bormann Expires 4 August 2024 [Page 20] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + app-string = app-prefix sqstr + sqstr = "'" *single-quoted "'" + bstr = app-string / sqstr / embedded + ; app-string could be any type + tstr = DQUOTE *double-quoted DQUOTE + embedded = "<<" seq ">>" + + array = "[" spec S [item S *("," S item S) OC] "]" + map = "{" spec S [kp S *("," S kp S) OC] "}" + kp = item S ":" S item + + ; We allow %x09 HT in prose, but not in strings + blank = %x09 / %x0A / %x0D / %x20 + non-slash = blank / %x21-2e / %x30-D7FF / %xE000-10FFFF + non-lf = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF + S = *blank *(comment *blank) + comment = "/" *non-slash "/" + / "#" *non-lf %x0A + + ; optional trailing comma (ignored) + OC = ["," S] + + ; check semantically that strings are either all text or all bytes + ; note that there must be at least one string to distinguish + streamstring = "(_" S string S *("," S string S) OC ")" + spec = ["_" *wordchar] + + double-quoted = unescaped + / "'" + / "\" DQUOTE + / "\" escapable + + single-quoted = unescaped + / DQUOTE + / "\" "'" + / "\" escapable + + escapable = %s"b" ; BS backspace U+0008 + / %s"f" ; FF form feed U+000C + / %s"n" ; LF line feed U+000A + / %s"r" ; CR carriage return U+000D + / %s"t" ; HT horizontal tab U+0009 + / "/" ; / slash (solidus) U+002F (JSON!) 
+ / "\" ; \ backslash (reverse solidus) U+005C + / (%s"u" hexchar) ; uXXXX U+XXXX + + hexchar = "{" (1*"0" [ hexscalar ] / hexscalar) "}" + / non-surrogate + + + +Bormann Expires 4 August 2024 [Page 21] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + / (high-surrogate "\" %s"u" low-surrogate) + non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) + / ("D" ODIGIT 2HEXDIG ) + high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG + low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG + hexscalar = "10" 4HEXDIG / HEXDIG1 4HEXDIG + / non-surrogate / 1*3HEXDIG + + ; Note that no other C0 characters are allowed, including %x09 HT + unescaped = %x0A ; new line + / %x0D ; carriage return -- ignored on input + / %x20-21 + ; omit 0x22 " + / %x23-26 + ; omit 0x27 ' + / %x28-5B + ; omit 0x5C \ + / %x5D-D7FF ; skip surrogate code points + / %xE000-10FFFF + + DQUOTE = %x22 ; " double quote + DIGIT = %x30-39 ; 0-9 + DIGIT1 = %x31-39 ; 1-9 + ODIGIT = %x30-37 ; 0-7 + BDIGIT = %x30-31 ; 0-1 + HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F" + HEXDIG1 = DIGIT1 / "A" / "B" / "C" / "D" / "E" / "F" + ; Note: double-quoted strings as in "A" are case-insensitive in ABNF + lcalpha = %x61-7A ; a-z + lcalnum = lcalpha / DIGIT + ucalpha = %x41-5A ; A-Z + ucalnum = ucalpha / DIGIT + wordchar = "_" / lcalnum / ucalpha ; [_a-z0-9A-Z] + + Figure 1 + + While an ABNF grammar defines the set of character strings that are + considered to be valid EDN by this ABNF, the mapping of these + character strings into the generic data model of CBOR is not always + obvious. + + The following additional items should help in the interpretation: + + * decnumber stands for an integer in the usual decimal notation, + unless at least one of the optional parts starting with "." and + "e" are present, in which case it stands for a floating point + value in the usual decimal notation. Note that the grammar now + allows 3. 
for 3.0 and .3 for 0.3 (also for hexadecimal floating + + + +Bormann Expires 4 August 2024 [Page 22] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + point below); implementers are advised that some platform numeric + parsers accept only a subset of the floating point syntax in this + document and may require some preprocessing to use here. + + * basenumber stands for an integer in the usual base 16/hexadecimal + ("0x"), base 8/octal ("0o"), or base 2/binary ("0b") notation, + unless the optional part containing a "p" is present, in which + case it stands for a floating point number in the usual + hexadecimal notation (which uses a mantissa in hexadecimal and an + exponent in decimal notation, see Section 5.12.3 of [IEEE754], + Section 6.4.4.2 of [C], or Section 5.13.4 of [Cplusplus]; + floating-suffix/floating-point-suffix from the latter two is not + used here). + + * spec stands for an encoding indicator. + + (In the following, an abbreviation of the form ai=nn gives nn as + the numeric value of the field _additional information_, the low- + order 5 bits of the initial byte: see Section 3 of RFC 8949 + [STD94].) + + As per Section 8.1 of RFC 8949 [STD94]: + + - an underscore _ on its own stands for indefinite length + encoding (ai=31, only available behind the opening brace/ + bracket for map and array: strings have a special syntax + streamstring for indefinite length encoding except for the + special cases ''_ and ""_), and + + - _0 to _3 stand for ai=24 to ai=27, respectively. + + Surprisingly, Section 8.1 of RFC 8949 [STD94] does not address + ai=0 to ai=23 — the assumption seems to be that preferred + serialization (Section 4.1 of RFC 8949 [STD94]) will be used when + converting CBOR diagnostic notation to an encoded CBOR data item, + so leaving out the encoding indicator for a data item with a + preferred serialization will implicitly use ai=0 to ai=23 if that + is possible. 
The present specification allows making this + explicit: + + - _i ("immediate") stands for encoding with ai=0 to ai=23. + + While no pressing use for further values for encoding indicators + comes to mind, this is an extension point for EDN; Section 4.2 + defines a registry for additional values. + + * string and the rules preceding it in the same block realize both + the representation of strings that are split up into multiple + chunks (Appendix G.4 of [RFC8610]) and the use of ellipses + to represent elisions (Section 3.2). The semantic processing of + these rules is relatively complex: + + - A single ... is a general ellipsis, which can stand for any + data item. + + - An ellipsis can be surrounded (on one or both sides) by string + chunks; the result is a CBOR tag with tag number CPA888 that + contains an array of the joined-together spans of such chunks + plus the ellipses, each represented by 888(null). + + - A simple sequence of string chunks is simply joined together. + In both cases of joining strings, the rules of Appendix G.4 of + [RFC8610] need to be followed; in particular, if a text + string results from the joining operation, that result needs to + be valid UTF-8. + + - Some of the strings may be app-strings. If the type of the + app-string is an actual string, joining of chunked strings + occurs as with directly notated strings; otherwise the + occurrence of more than one app-string or an app-string + together with a directly notated string cannot be processed. + +A.2. ABNF Definitions for app-string Content + + This appendix provides ABNF definitions for application-oriented + extension literals defined in [STD94] and in this specification. + These grammars describe the _decoded_ content of the sqstr components + that combine with the application-extension identifiers to form + application-oriented extension literals. 
Each of these may make use + of rules defined in Figure 1. + +A.2.1. h: ABNF Definition of Hexadecimal representation of a byte + string + + The syntax of the content of byte strings represented in hex, such as + h'', h'0815', or h'/head/ 63 /contents/ 66 6f 6f' (another + representation of << "foo" >>), is described by the ABNF in Figure 2. + This syntax accommodates both lower case and upper case hex digits, + as well as blank space (including comments) around each hex digit. + + app-string-h = S *(HEXDIG S HEXDIG S / ellipsis S) + ["#" *non-lf] + ellipsis = 3*"." + HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F" + DIGIT = %x30-39 ; 0-9 + blank = %x09 / %x0A / %x0D / %x20 + non-slash = blank / %x21-2e / %x30-10FFFF + non-lf = %x09 / %x0D / %x20-D7FF / %xE000-10FFFF + S = *blank *(comment *blank ) + comment = "/" *non-slash "/" + / "#" *non-lf %x0A + + Figure 2: ABNF Definition of Hexadecimal Representation of a Byte + String + +A.2.2. b64: ABNF Definition of Base64 representation of a byte string + + The syntax of the content of byte strings represented in base64 is + described by the ABNF in Figure 3. + + This syntax allows both the classic (Section 4 of [RFC4648]) and the + URL-safe (Section 5 of [RFC4648]) alphabet to be used. It + accommodates, but does not require, base64 padding. Note that + inclusion of classic base64 makes it impossible to have in-line + comments in b64, as "/" is valid base64-classic. + + app-string-b64 = B *(4(b64dig B)) + [b64dig B b64dig B ["=" B "=" / b64dig B ["="]] B] + ["#" *inon-lf] + b64dig = ALPHA / DIGIT / "-" / "_" / "+" / "/" + B = *iblank *(icomment *iblank) + iblank = %x0A / %x20 ; Not HT or CR (gone) + icomment = "#" *inon-lf %x0A + inon-lf = %x20-D7FF / %xE000-10FFFF + ALPHA = %x41-5a / %x61-7a + DIGIT = %x30-39 + + Figure 3: ABNF definition of Base64 Representation of a Byte String + +A.2.3. 
dt: ABNF Definition of RFC 3339 Representation of a Date/Time + + The syntax of the content of dt literals can be described by the ABNF + for date-time from [RFC3339] as summarized in Section 3 of [RFC9165]: + + + + + + + + +Bormann Expires 4 August 2024 [Page 25] + +Internet-Draft CBOR EDN: Literals and ABNF February 2024 + + + app-string-dt = date-time + + date-fullyear = 4DIGIT + date-month = 2DIGIT ; 01-12 + date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on + ; month/year + time-hour = 2DIGIT ; 00-23 + time-minute = 2DIGIT ; 00-59 + time-second = 2DIGIT ; 00-58, 00-59, 00-60 based on leap sec + ; rules + time-secfrac = "." 1*DIGIT + time-numoffset = ("+" / "-") time-hour ":" time-minute + time-offset = "Z" / time-numoffset + + partial-time = time-hour ":" time-minute ":" time-second + [time-secfrac] + full-date = date-fullyear "-" date-month "-" date-mday + full-time = partial-time time-offset + + date-time = full-date "T" full-time + DIGIT = %x30-39 ; 0-9 + + Figure 4: ABNF Definition of RFC3339 Representation of a Date/Time + +A.2.4. ip: ABNF Definition of Textual Representation of an IP Address + + The syntax of the content of ip literals can be described by the ABNF + for IPv4address and IPv6address in Section 3.2.2 of [RFC3986], as + included in slightly updated form in Figure 5. 
+
+   app-string-ip = IPaddress ["/" uint]
+
+   IPaddress   = IPv4address
+               / IPv6address
+
+   ; ABNF from RFC 3986, re-arranged for PEG compatibility:
+
+   IPv6address =                            6( h16 ":" ) ls32
+               /                       "::" 5( h16 ":" ) ls32
+               / [ h16               ] "::" 4( h16 ":" ) ls32
+               / [ h16 *1( ":" h16 ) ] "::" 3( h16 ":" ) ls32
+               / [ h16 *2( ":" h16 ) ] "::" 2( h16 ":" ) ls32
+               / [ h16 *3( ":" h16 ) ] "::"    h16 ":"   ls32
+               / [ h16 *4( ":" h16 ) ] "::"              ls32
+               / [ h16 *5( ":" h16 ) ] "::"              h16
+               / [ h16 *6( ":" h16 ) ] "::"
+
+   h16         = 1*4HEXDIG
+   ls32        = ( h16 ":" h16 ) / IPv4address
+   IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
+   dec-octet   = "25" %x30-35         ; 250-255
+               / "2" %x30-34 DIGIT    ; 200-249
+               / "1" 2DIGIT           ; 100-199
+               / %x31-39 DIGIT        ; 10-99
+               / DIGIT                ; 0-9
+
+   HEXDIG      = DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
+   DIGIT       = %x30-39 ; 0-9
+   DIGIT1      = %x31-39 ; 1-9
+   uint        = "0" / DIGIT1 *DIGIT
+
+    Figure 5: ABNF Definition of Textual Representation of an IP Address
+
+Appendix B.  EDN and CDDL
+
+   EDN was designed as a language to provide a human-readable
+   representation of an instance, i.e., a single CBOR data item or CBOR
+   sequence. CDDL was designed as a language to describe an (often
+   large) set of such instances (which itself constitutes a language),
+   in the form of a _data definition_ or _grammar_ (sometimes also
+   called a _schema_).
+
+   The two languages share some similarities, not least because they
+   have mutually inspired each other. But they have very different
+   roots:
+
+   *  EDN syntax is an extension to JSON syntax [STD90]. (Any
+      (interoperable) JSON text is also valid EDN.)
+
+   *  CDDL syntax is inspired by ABNF's syntax [STD68].
+
+   For engineers who use both EDN and CDDL, it is easy to write
+   "CDDLisms" or "EDNisms" into their drafts that are meant to be in the
+   other language. (This is one more of the many motivations to always
+   validate formal language instances with tools.)
+
+   Important differences include:
+
+   *  Comment syntax. CDDL inherits ABNF's semicolon-delimited
+      end-of-line comments, while EDN finds nothing in JSON that could be
+      inherited here. Inspired by JavaScript, EDN simplifies
+      JavaScript's copy of the original C comment syntax to be delimited
+      by single slashes (where line ends are not of interest); it also
+      adds end-of-line comments starting with #.
+
+      EDN:
+         { / alg / 1: -7 / ECDSA 256 / }
+         ,
+         { 1: # alg
+           -7 # ECDSA 256
+         }
+      CDDL: ? 1 => int / tstr, ; algorithm identifier
+
+   *  Syntax for tags. CDDL's tag syntax is part of the system for
+      referring to CBOR's fundamentals (the major type 6, in this case)
+      and (with [I-D.ietf-cbor-update-8610-grammar]) allows specifying
+      the actual tag number separately, while EDN's tag syntax is a
+      simple decimal number and a pair of parentheses.
+
+      EDN: 98(['', {}, /rest elided here: …/])
+
+      CDDL: COSE_Sign_Tagged = #6.98(COSE_Sign)
+
+   *  Separator character. Like JSON, EDN requires commas as separators
+      between array elements and map members (EDN also allows, but does
+      not require, a trailing comma before the closing bracket/brace,
+      enabling an easier-to-maintain "terminator" style of their use).
+      CDDL's comma separators in these contexts (CDDL groups) are
+      entirely optional (and actually are terminators, which together
+      with their optionality allows them to be used like separators as
+      well, or even not at all).
+
+   *  Embedded CBOR. EDN has a special syntax to describe the content
+      of byte strings that are encoded CBOR data items. CDDL can
+      specify these with a control operator, which looks very different.
+
+      EDN: 98([/h'a10126'/ << {/alg/ 1: -7 /ECDSA 256/ } >>, /…/])
+
+      CDDL: serialized_map = bytes .cbor header_map
+
+Acknowledgements
+
+   The concept of application-oriented extensions to diagnostic
+   notation, as well as the definition for the "dt" extension, were
+   inspired by the CoRAL work by Klaus Hartke.
+
+Author's Address
+
+   Carsten Bormann
+   Universität Bremen TZI
+   Postfach 330440
+   D-28359 Bremen
+   Germany
+   Phone: +49-421-218-63921
+   Email: cabo@tzi.org
diff --git a/introduce-ai/index.html b/introduce-ai/index.html
new file mode 100644
index 0000000..0cc81f0
--- /dev/null
+++ b/introduce-ai/index.html
@@ -0,0 +1,45 @@
+
+
+
+ cbor-wg/edn-literal introduce-ai preview
+
+
+
+
+

Editor's drafts for introduce-ai branch of cbor-wg/edn-literal

+ + + + + + +
CBOR EDN: Literals and ABNF | plain text | same as main
+ + +