File data meta gzip generic #8
Open: shadders wants to merge 5 commits into `master` from `file-data-meta-gzip-generic`
Commits (5)
da1b315 add file, data, meta and gzip protocol specs (shadders)
1041bae add file, data, meta and gzip protocol specs - update protocol_ids.csv (shadders)
54f0fbb Guideline to come. Modify GZIP and add generic transformation and ma… (shadders)
37c79b0 fix my own name (ryanxcharles)
72d19a8 Merge pull request #9 from ryanxcharles/patch-2 (shadders)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
Thanks to Ryan X Charles for the [original spec](https://github.com/bitcoin-sv-specs/op_return/blob/5f051997061ea45873e51a04b494387ad40df05e/protocols/03-file.md) | ||
|
||
# Files in OP_RETURN | ||
|
||
Protocol identifier: `0x66696c65` | ||
|
||
## Specification | ||
|
||
The protocol identifier is a `PUSHDATA` data element `0x04 0x66696c65`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `file`. | ||
|
||
This should be followed by two further `PUSHDATA` data elements: `<filename> <data>` | ||
|
||
### Filename | ||
|
||
Contains any data element. However, it is recommended that the filename be a UTF-8 character string. Furthermore, it is recommended that it be divided into name and extension like normal files: `<name>.<extension>`. Operating systems already deal with filenames and parse meaning from them, so existing operating system code can be reused for this. | ||
|
||
|
||
### Data | ||
Any length of data in any format. However, it is recommended that producers format the file correctly according to the extension. For instance, if your filename is `mydocument.pdf` then the file data should be a valid PDF file. | ||
|
||
### Example | ||
|
||
An example of a file is as follows: | ||
|
||
``` | ||
OP_RETURN 0x04 0x66696c65 0x09 0x68656c6c6f2e747874 0x06 0x68656c6c6f0a | ||
``` | ||
|
||
This is the file `hello.txt`, consisting of the word "hello" followed by a newline. | ||
|
||
|
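The example above can be reproduced with a short sketch. This is not part of the proposed spec: the helper names `push`, `encode_file` and `decode_file` are hypothetical, and the sketch assumes every element fits a direct push with a 1-byte length prefix (1 to 75 bytes) and that the filename is UTF-8, as the spec recommends.

```python
from typing import List, Tuple


def push(data: bytes) -> bytes:
    """Serialise one data element with a 1-byte length prefix (assumption: 1 to 75 bytes)."""
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles direct pushes of 1 to 75 bytes")
    return bytes([len(data)]) + data


def encode_file(filename: str, data: bytes) -> bytes:
    """Build the bytes that follow OP_RETURN: 'file' <filename> <data>."""
    return push(b"file") + push(filename.encode("utf-8")) + push(data)


def decode_file(payload: bytes) -> Tuple[str, bytes]:
    """Parse a 'file' payload back into (filename, data)."""
    elements: List[bytes] = []
    pos = 0
    while pos < len(payload):
        length = payload[pos]
        if not 1 <= length <= 75:
            raise ValueError("unsupported length prefix in this sketch")
        if pos + 1 + length > len(payload):
            raise ValueError("truncated data element")
        elements.append(payload[pos + 1 : pos + 1 + length])
        pos += 1 + length
    if len(elements) != 3 or elements[0] != b"file":
        raise ValueError("not a 'file' protocol payload")
    return elements[1].decode("utf-8"), elements[2]


payload = encode_file("hello.txt", b"hello\n")
print(payload.hex())
print(decode_file(payload))
```

Larger files would need the longer push forms, which this sketch deliberately leaves out.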
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# data in OP_RETURN | ||
|
||
Protocol identifier: `0x64617461` | ||
|
||
## Specification | ||
|
||
The protocol identifier is a `PUSHDATA` data element `0x04 0x64617461`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `data`. | ||
|
||
This should be followed by two further `PUSHDATA` data elements: `<type> <data>` | ||
|
||
### Type | ||
|
||
It is recommended that the `type` field contain a UTF-8 character string describing the data type. A list of well-recognised types may be the subject of a later specification. However, allowed values are not currently part of this specification. | ||
|
||
Example types might include: `bin`, `json`, `jpeg`, `xml`, `html`, `pdf`, `txt`. IANA content types may be supported as well. | ||
|
||
|
||
### Data | ||
Any length of data in any format. However, it is recommended that the data comply with the format specified in the `type` field. | ||
|
||
### Example | ||
|
||
An example of a data item is as follows: | ||
|
||
``` | ||
OP_RETURN 0x04 0x64617461 0x04 0x6a736f6e 0x0f 0x7b226b6579223a2276616c7565227d | ||
``` | ||
|
||
This is data of type `json` with a data value of `{"key":"value"}`. | ||
|
||
|
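As a rough illustration of how a consumer might use the `type` field, the sketch below decodes the example payload and parses the body only when the type is `json`. The function names are hypothetical, the 1-byte length prefix handling is a simplifying assumption, and the choice to return other types as raw bytes is not something the spec prescribes.

```python
import json
from typing import List, Tuple


def read_elements(payload: bytes) -> List[bytes]:
    """Split a payload into data elements (assumption: 1-byte length prefixes only)."""
    elements = []
    pos = 0
    while pos < len(payload):
        length = payload[pos]
        if not 1 <= length <= 75 or pos + 1 + length > len(payload):
            raise ValueError("unsupported or truncated data element")
        elements.append(payload[pos + 1 : pos + 1 + length])
        pos += 1 + length
    return elements


def decode_data(payload: bytes) -> Tuple[str, object]:
    """Return (type, value) for a 'data' payload; JSON bodies are parsed, others stay bytes."""
    elements = read_elements(payload)
    if len(elements) != 3 or elements[0] != b"data":
        raise ValueError("not a 'data' protocol payload")
    type_name = elements[1].decode("utf-8")
    body = elements[2]
    if type_name == "json":
        return type_name, json.loads(body.decode("utf-8"))
    return type_name, body


# The bytes following OP_RETURN in the example above.
example = bytes.fromhex(
    "04" "64617461" "04" "6a736f6e" "0f" "7b226b6579223a2276616c7565227d"
)
print(decode_data(example))
```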
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Markup protocols | ||
|
||
Markup protocols are a family of protocols intended to encapsulate other protocols whilst marking them up with additional data. The general pattern is demonstrated by the `'meta'` protocol which wraps any other protocol with JSON key value pairs as metadata. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# meta in OP_RETURN | ||
|
||
Protocol identifier: `0x6d657461` | ||
|
||
## Description | ||
|
||
The `meta` protocol is a mechanism for adding metadata to any other protocol by encapsulating it. | ||
|
||
## Specification | ||
|
||
The protocol identifier is a `PUSHDATA` data element `0x04 0x6d657461`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `meta`. | ||
|
||
This should be followed by a further `PUSHDATA` data element: `<metadate>` which is a a JSON encoded set of the key value pairs. The key value pairs remain unspecified. | ||
|
||
This should be followed by another protocol id and its subsequent data elements in their entirety. | ||
|
||
### Example | ||
|
||
We may, for example, wish to add authorship and encoding data to a text data item. Using the `data` protocol we could do it like this: | ||
|
||
`'meta' '{"author": "alice", "encoding":"utf8"}' 'data' 'txt' 'I am a fish'` | ||
|
||
Which would encode to: | ||
`0x04 0x6d657461 0x26 0x7b22617574686f72223a2022616c696365222c2022656e636f64696e67223a2275746638227d 0x04 0x64617461 0x03 0x747874 0x0b 0x4920616d20612066697368` | ||
|
||
|
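A `meta` handler could be sketched roughly as follows. It works on data elements that have already been parsed out of the script, and the name `unwrap_meta` is hypothetical. Decoding the metadata as UTF-8 is an assumption, since the spec does not state the encoding.

```python
import json
from typing import Dict, List, Tuple


def unwrap_meta(elements: List[bytes]) -> Tuple[Dict, List[bytes]]:
    """Split parsed data elements into (metadata, wrapped protocol elements)."""
    if len(elements) < 3 or elements[0] != b"meta":
        raise ValueError("expected 'meta' <metadata> <protocol id> ...")
    # Assumption: the JSON text is UTF-8 encoded.
    metadata = json.loads(elements[1].decode("utf-8"))
    if not isinstance(metadata, dict):
        raise ValueError("metadata should be a JSON object of key value pairs")
    # Everything after the metadata belongs to the wrapped protocol.
    return metadata, elements[2:]


elements = [
    b"meta",
    b'{"author": "alice", "encoding":"utf8"}',
    b"data",
    b"txt",
    b"I am a fish",
]
metadata, inner = unwrap_meta(elements)
print(metadata)
print(inner)
```

The returned `inner` list starts with a protocol identifier, so it can be handed to whichever handler is registered for that identifier.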
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Transformation protocols | ||
|
||
Transformation protocols are intended to modify a data element to produce another valid protocol. The general pattern is that a protocol is encapsulated within the transformation protocol and, once the transformation handler has completed, the output should be another valid protocol. Two examples of this might be `'gzip'` and `'encrypt'`. | ||
|
||
# Pattern | ||
|
||
Transformation protocols should be able to specify their own parameter patterns. The number of parameters implicitly specifies the index of the first data element they operate on, that is, the one immediately following the last parameter. For example, if the fictitious protocol `'decompress'` took one parameter `<algorithm>`, that parameter would be the 2nd data element in the `OP_RETURN` and the data to be transformed would begin at the 3rd element. For optional parameters, positional integrity can be maintained by using the null data element for any optional parameter you don't wish to specify. A parameter may be another transformation protocol id, so that transformations may be nested (e.g. `'tar' 'gzip'`). | ||
|
||
Taking `'gzip'` as an example there are two modes of operation possible. | ||
|
||
1/ If the transformation protocol and its parameters are followed by only a single data element, it is expected to apply a transformation producing an array of bytes that should then be parsed for a sequence of `PUSHDATA` data elements. The output of the transformation protocol is this newly parsed sequence of data elements. | ||
|
||
2/ If the transformation protocol and its parameters are followed by more than one data element, the data elements are expected to be a sequence of pairs. Each pair is a boolean flag, `OP_0` (`0x00`) or `OP_1` (`0x01`), followed by another data element. The boolean flag indicates whether to apply the transformation to that data element. The output of the transformation should be a sequence of optionally transformed data elements, excluding the boolean flags. | ||
|
||
In both cases the resulting sequence should be treated as a protocol itself, with the first data element expected to be a protocol identifier which indicates which handler to pass the data to. | ||
|
||
## Example | ||
|
||
### Transformation of a single data element | ||
|
||
##### Gzip a single data element: | ||
|
||
`'gzip' <file_protocol_gz>` | ||
|
||
Which should be returned from the gzip handler as: | ||
|
||
`'file' <filename> <data>` | ||
|
||
Which is then passed to the `file` protocol handler. | ||
|
||
### Transformation of multiple data elements | ||
|
||
##### Compressed FILE data but uncompressed FILENAME: | ||
|
||
`'gzip' 0 'file' 0 <filename> 1 <data_gz>` | ||
|
||
This should be passed to the gzip handler which should output: | ||
|
||
`'file' <filename> <data>` | ||
|
||
Which is then passed to the `file` protocol handler. | ||
|
||
### Nested transformations | ||
|
||
Using the fictitious transformation protocol `'tar'` which concatenates multiple `'file'` elements into a single output: | ||
|
||
`'gzip' 'tar' <file_protocol_tar_gz>` | ||
|
||
First the `'gzip'` transformation handler will decompress the data element yielding: | ||
|
||
`'tar' <file_protocol_tar>` | ||
|
||
The `'tar'` handler then transforms the output of `'gzip'` into a list of embedded file protocol elements giving: | ||
|
||
`'file' <filename1> <data1>` `'file' <filename2> <data2>` | ||
|
||
|
||
|
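The dispatch pattern described in this file, where a transformation handler emits a new element sequence that is routed again by its first element, might be sketched as below. All names are hypothetical. The `'rev'` transformation (byte reversal) is a made-up stand-in so the example stays dependency free, and modelling the flags as single-byte elements `0x00` and `0x01` is an assumption about how they appear after parsing.

```python
from typing import Callable, Dict, List

Elements = List[bytes]


def apply_flagged(transform: Callable[[bytes], bytes], elements: Elements) -> Elements:
    """Mode 2: consume (flag, element) pairs, transforming only elements flagged 0x01."""
    if len(elements) % 2 != 0:
        raise ValueError("expected a sequence of (flag, element) pairs")
    out = []
    for flag, element in zip(elements[0::2], elements[1::2]):
        if flag == b"\x01":
            out.append(transform(element))
        elif flag == b"\x00":
            out.append(element)
        else:
            raise ValueError("flag must be 0x00 or 0x01")
    return out  # flags are dropped from the output


def run_pipeline(
    elements: Elements,
    transformations: Dict[bytes, Callable[[Elements], Elements]],
    terminals: Dict[bytes, Callable[[Elements], object]],
    max_depth: int = 8,
) -> object:
    """Route elements by their first element until a non-transformation handler is reached."""
    for _ in range(max_depth):
        if not elements:
            raise ValueError("no protocol identifier")
        protocol_id, rest = elements[0], elements[1:]
        if protocol_id in transformations:
            elements = transformations[protocol_id](rest)  # output is re-dispatched
        elif protocol_id in terminals:
            return terminals[protocol_id](rest)
        else:
            raise ValueError(f"no handler for {protocol_id!r}")
    raise ValueError("too many nested transformations")


# Hypothetical handlers, for illustration only.
transformations = {b"rev": lambda rest: apply_flagged(lambda b: b[::-1], rest)}
terminals = {b"file": lambda rest: (rest[0].decode("utf-8"), rest[1])}

result = run_pipeline(
    [b"rev", b"\x00", b"file", b"\x00", b"hello.txt", b"\x01", b"\nolleh"],
    transformations,
    terminals,
)
print(result)
```

The depth limit is a defensive choice of this sketch, not something the proposal requires.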
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# GZIP protocol identifier | ||
|
||
Protocol identifier: `0x677a6970` | ||
|
||
GZIP is a transformation protocol intended to encapsulate another protocol. | ||
|
||
## Specification | ||
|
||
The protocol identifier is a `PUSHDATA` data element `0x04 0x677a6970`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `gzip`. | ||
|
||
The `gzip` protocol identifier is intended to transform another `OP_RETURN` based protocol by adding two simple rules: | ||
|
||
If the `gzip` identifier is present and only one `PUSHDATA` element follows, that element contains gzipped data. It should be decompressed, then the resulting array of bytes should be parsed for `PUSHDATA` elements and treated as another protocol, with the first element being a protocol identifier. | ||
|
||
If the `gzip` identifier is present and more than one `PUSHDATA` element follows, it should be assumed that each data element is prefixed with a single-byte flag indicating whether the following data element is gzipped. `OP_0` (`0x00`) indicates it is not, `OP_1` (`0x01`) indicates that it is. The `gzip` handler should decompress any compressed elements and strip out the flag elements, returning a decompressed series of `PUSHDATA` elements. This sequence should then be interpreted as a protocol, with the first of these data elements as the protocol identifier. | ||
|
||
This protocol is stream friendly and can be implemented as a pipeline of stream handlers: `gzip_handler -> sub_handler` | ||
|
||
### Examples | ||
|
||
##### Gzip a single data element: | ||
|
||
`'gzip' <file_protocol_gz>` | ||
|
||
Which should be returned from the gzip handler as: | ||
|
||
`'file' <filename> <data>` | ||
|
||
Which is then passed to the `file` protocol handler. | ||
|
||
##### Compressed FILE data but uncompressed FILENAME: | ||
|
||
`'gzip' 0 'file' 0 <filename> 1 <data_gz>` | ||
|
||
This should be passed to the gzip handler which should output: | ||
|
||
`'file' <filename> <data>` | ||
|
||
Which is then passed to the `file` protocol handler. | ||
|
||
##### Compressed FILE and FILENAME: | ||
|
||
`'gzip' 0 'file' 1 <filename_gz> 1 <data_gz>` | ||
|
||
This should be passed to the gzip handler which should output: | ||
|
||
`'file' <filename> <data>` | ||
|
||
Which is then passed to the `file` protocol handler. | ||
|
||
##### Compressed DATA but uncompressed TYPE: | ||
|
||
`'gzip' 0 'data' 0 <type> 1 <data_gz>` | ||
|
||
This should be passed to the gzip handler which should output: | ||
|
||
`'data' <type> <data>` | ||
|
||
Which is then passed to the `data` protocol handler. | ||
|
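The two rules could be implemented along these lines using Python's standard `gzip` module. This is a sketch under assumptions, not the proposal's reference implementation: `parse_pushdata` and `gzip_handler` are hypothetical names, flags are modelled as single-byte elements `0x00` and `0x01`, and the parser accepts direct pushes plus the standard `OP_PUSHDATA1/2/4` opcodes, which the spec text does not spell out.

```python
import gzip
from typing import List


def parse_pushdata(raw: bytes) -> List[bytes]:
    """Split bytes into data elements (direct pushes and OP_PUSHDATA1/2/4)."""
    elements = []
    pos = 0
    while pos < len(raw):
        opcode = raw[pos]
        pos += 1
        if 1 <= opcode <= 75:
            length = opcode  # direct push: the opcode is the length
        elif opcode in (0x4C, 0x4D, 0x4E):
            width = {0x4C: 1, 0x4D: 2, 0x4E: 4}[opcode]
            if pos + width > len(raw):
                raise ValueError("truncated length field")
            length = int.from_bytes(raw[pos : pos + width], "little")
            pos += width
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02x}")
        if pos + length > len(raw):
            raise ValueError("truncated data element")
        elements.append(raw[pos : pos + length])
        pos += length
    return elements


def gzip_handler(following: List[bytes]) -> List[bytes]:
    """Apply the gzip rules to the elements that follow the 'gzip' identifier."""
    if not following:
        raise ValueError("nothing follows the gzip identifier")
    if len(following) == 1:
        # Rule 1: one element holding a gzipped, serialised protocol.
        return parse_pushdata(gzip.decompress(following[0]))
    if len(following) % 2 != 0:
        raise ValueError("expected (flag, element) pairs")
    out = []
    # Rule 2: (flag, element) pairs; flags are stripped from the output.
    for flag, element in zip(following[0::2], following[1::2]):
        if flag not in (b"\x00", b"\x01"):
            raise ValueError("flag must be 0x00 or 0x01")
        out.append(gzip.decompress(element) if flag == b"\x01" else element)
    return out


# 'file' 'hello.txt' 'hello\n' serialised with 1-byte length prefixes.
inner = bytes.fromhex("0466696c65" "0968656c6c6f2e747874" "0668656c6c6f0a")
single = gzip_handler([gzip.compress(inner)])
pairs = gzip_handler(
    [b"\x00", b"file", b"\x00", b"hello.txt", b"\x01", gzip.compress(b"hello\n")]
)
print(single)
print(pairs)
```

Both calls yield the same `'file' <filename> <data>` sequence, which would then be passed to the `file` protocol handler.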
Typos: "metadate", "a a". Should it be mentioned that the data is UTF-8 encoded?