-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spec: add channel UUID #922
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -111,12 +111,14 @@ All MCAP records are serialized as follows: | |||||
|
||||||
Record type is a single byte opcode, and record content length is a uint64 value. | ||||||
|
||||||
Records may be extended by adding new fields at the end of existing fields. Readers should ignore any unknown fields. | ||||||
Future changes to this specification may extend records by adding new fields at the end of existing fields. Readers should ignore any unknown fields. | ||||||
|
||||||
> The Footer and Message records will not be extended, since their formats do not allow for backward-compatible size changes. | ||||||
|
||||||
Each record definition below contains a `Type` column. See the [Serialization](#serialization) section on how to serialize each type. | ||||||
|
||||||
Record content may end before all known fields have been serialized. Readers should treat a missing field as having that field type's [zero value](#zero-values). | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if we are going to make this statement, we should probably be explicit that records must still be fully formed, even if they are missing fields (i.e. records must match their stated length). otherwise the file should be considered corrupt. |
||||||
|
||||||
### Header (op=0x01) | ||||||
|
||||||
| Bytes | Name | Type | Description | | ||||||
|
@@ -155,13 +157,17 @@ A Channel record defines an encoded stream of messages on a topic. | |||||
|
||||||
Channel records are uniquely identified within a file by their channel ID. A Channel record must occur at least once in the file prior to any message referring to its channel ID. Any two channel records sharing a common ID must be identical. | ||||||
|
||||||
| Bytes | Name | Type | Description | | ||||||
| ----- | ---------------- | --------------------- | ----------------------------------------------------------------------------------------------------------- | | ||||||
| 2 | id | uint16 | A unique identifier for this channel within the file. | | ||||||
| 2 | schema_id | uint16 | The schema for messages on this channel. A schema_id of 0 indicates there is no schema for this channel. | | ||||||
| 4 + N | topic | String | The channel topic. | | ||||||
| 4 + N | message_encoding | String | Encoding for messages on this channel. The [well-known message encodings][message_encodings] are preferred. | | ||||||
| 4 + N | metadata | `Map<string, string>` | Metadata about this channel | | ||||||
The `uuid` field may be used to associate channels between files. For example, an application that merges MCAP files | ||||||
may merge channels with the same non-nil `uuid` into the same output channel. | ||||||
|
||||||
| Bytes | Name | Type | Description | | ||||||
| ----- | ---------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ||||||
| 2 | id | uint16 | A unique identifier for this channel within the file. | | ||||||
| 2 | schema_id | uint16 | The schema for messages on this channel. A schema_id of 0 indicates there is no schema for this channel. | | ||||||
| 4 + N | topic | String | The channel topic. | | ||||||
| 4 + N | message_encoding | String | Encoding for messages on this channel. The [well-known message encodings][message_encodings] are preferred. | | ||||||
| 4 + N | metadata | `Map<string, string>` | Metadata about this channel | | ||||||
| 16 | uuid | uuid | A globally unique identifier for this channel. A [nil UUID](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.7) indicates no globally unique identifier is available. | | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. seems like this could be solved by using channel metadata. is there something that makes this special enough to be an explicitly named field? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Being a first class citizen means that tooling can rely on this field. Metadata is better read as "user-data" that is unspecified. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we have information about such a use case? What if I use something other than UUID to identify channels? I think we should have hard coded fields where there is a demonstrated need for cross-tool interoperability. If this is only to support needs of individual companies then using a named metadata field would be sufficient. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am not convinced that UUID is the appropriate term or requirement. I would describe it is an identifier that is stable across multiple separate mcap files (possibly via rolling recording). I think this PR needs to be split up into separate proposals that have clear problem statements. Right now its not in a state that makes a case on what problems it is solving. |
||||||
|
||||||
Channel records may be duplicated in the summary section. | ||||||
|
||||||
|
@@ -319,22 +325,33 @@ A Summary Offset record contains the location of records within the summary sect | |||||
|
||||||
## Serialization | ||||||
|
||||||
### Zero Values | ||||||
|
||||||
Record field types have a defined _zero value_. This is the value that readers should use if that field | ||||||
is missing from the serialized MCAP record. | ||||||
|
||||||
### Fixed-width types | ||||||
|
||||||
Multi-byte integers (`uint16`, `uint32`, `uint64`) are serialized using [little-endian byte order](https://en.wikipedia.org/wiki/Endianness). | ||||||
|
||||||
The [zero value](#zero-values) of all integer types is 0. | ||||||
|
||||||
### String | ||||||
|
||||||
Strings are serialized using a `uint32` byte length followed by the string data, which should be valid [UTF-8](https://en.wikipedia.org/wiki/UTF-8). | ||||||
|
||||||
<byte length><utf-8 bytes> | ||||||
|
||||||
The [zero value](#zero-values) of a string is the empty string. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
### Bytes | ||||||
|
||||||
Bytes is sequence of bytes with no additional requirements. | ||||||
|
||||||
<bytes> | ||||||
|
||||||
The [zero value](#zero-values) of a byte sequence is the empty sequence. | ||||||
|
||||||
### Tuple<first_type, second_type> | ||||||
|
||||||
Tuple represents a pair of values. The first value has type first_type and the second has type second_type. | ||||||
|
@@ -353,6 +370,8 @@ Example `Tuple<uint16, string>`: | |||||
|
||||||
<uint16><uint32><utf-8 bytes> | ||||||
|
||||||
The [zero value](#zero-values) of a tuple is the first value type's zero value followed by the second value type's zero value. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do tuples ever occur outside of an array? I don't think this case is needed. |
||||||
|
||||||
### Array<array_type> | ||||||
|
||||||
Arrays are serialized using a `uint32` byte length followed by the serialized array elements. | ||||||
|
@@ -365,6 +384,8 @@ An array of uint64 is specified as `Array<uint64>` and serialized as: | |||||
|
||||||
> Since arrays use a `uint32` byte length prefix, the maximum size of the serialized array elements cannot exceed 4,294,967,295 bytes. | ||||||
|
||||||
The [zero value](#zero-values) of an array is the empty array. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
|
||||||
### Timestamp | ||||||
|
||||||
`uint64` nanoseconds since a user-understood epoch (i.e unix epoch, robot boot time, etc.) | ||||||
|
@@ -383,6 +404,14 @@ A `Map<string, string>` would be serialized as: | |||||
|
||||||
A serialization which has duplicate keys may cause indeterminate decoding. | ||||||
|
||||||
The [zero value](#zero-values) of a map is an empty map. | ||||||
|
||||||
### UUID | ||||||
|
||||||
An UUID is a 128-bit identifier described in [RFC-4122](https://datatracker.ietf.org/doc/html/rfc4122). | ||||||
|
||||||
The [zero value](#zero-values) of an UUID is the [nil UUID](https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.7). | ||||||
|
||||||
## Diagrams | ||||||
|
||||||
The following diagrams demonstrate various valid MCAP files. | ||||||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we are introducing this language it would be good to do in a separate PR, and ensure we have conformance tests for readers well ahead of introducing any new fields.