File data meta gzip generic #8

6 changes: 5 additions & 1 deletion protocol_ids.csv
@@ -4,6 +4,10 @@ Prefix,DisplayName,Authors,BitcoinAddress,SpecificationUrl,TxidRedirectUrl
0x00584350,Counterparty Cash,Counterparty Cash Association (CCA),17YhxKgKunLjvi7HB1ENGmeLwPFiaxb74V,https://counterparty.cash,
0x00746C6B,Keyport,Keyport,,https://keyport.io,
0x02446365,SatoshiDICE,Jon Bestman,1DiceuELb5GZktc3CMEv868DMtU3B5957x,https://satoshidice.com,
0x64617461,data,Steve Shadders,,,
0x66696c65,file,Ryan X. Charles,,,
0x677a6970,gzip,Steve Shadders,,,
0x6d657461,meta,Steve Shadders,,,
0x6d6f6e6579627574746f6e2e636f6d,Money Button,Ryan X. Charles,,https://www.moneybutton.com,
0xac1eed88,Miner ID,Steve Shadders,,,
0xff000000,extended protocol id mask,,,,
0x6d6f6e6579627574746f6e2e636f6d,Money Button,Ryan X. Charles,,https://www.moneybutton.com
Empty file.
31 changes: 31 additions & 0 deletions protocols/1.1-file.md
@@ -0,0 +1,31 @@
Thanks to Ryan X Charles for the [original spec](https://github.com/bitcoin-sv-specs/op_return/blob/5f051997061ea45873e51a04b494387ad40df05e/protocols/03-file.md)

# Files in OP_RETURN

Protocol identifier: `0x66696c65`

## Specification

The protocol identifier is a `PUSHDATA` data element `0x04 0x66696c65`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `file`.

This should be followed by two further `PUSHDATA` data elements: `<filename> <data>`

### Filename

Contains any data element. However, it is recommended that the filename be a UTF-8 character string. Furthermore, it is recommended that it be divided into name and extension like normal files: `<name>.<extension>`. Operating systems already deal with filenames and parse meaning from them, so existing operating system code can be reused for this.


### Data
Any length of data in any format. However, it is recommended that producers format the file correctly according to the extension. For instance, if your filename is `mydocument.pdf` then the file data should be a valid PDF file.

### Example

An example of a file is as follows:

```
OP_RETURN 0x04 0x66696c65 0x09 0x68656c6c6f2e747874 0x06 0x68656c6c6f0a
```

This is the file hello.txt consisting of the word "hello" followed by a newline.
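As a rough illustration, the example above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the spec: the helper names `push` and `encode_file` are hypothetical, and it assumes every element is between 1 and 75 bytes long so that a single length byte is enough. Larger elements would need the `OP_PUSHDATA1`/`OP_PUSHDATA2`/`OP_PUSHDATA4` forms, which are not covered here.

```python
def push(data: bytes) -> bytes:
    """Encode one data element as a 1-byte length prefix plus the data.

    Assumption: only direct pushes of 1..75 bytes are handled in this sketch.
    """
    if not 1 <= len(data) <= 75:
        raise ValueError("this sketch only handles elements of 1 to 75 bytes")
    return bytes([len(data)]) + data


def encode_file(filename: str, data: bytes) -> bytes:
    """Build the bytes that follow OP_RETURN: 'file' <filename> <data>."""
    return push(b"file") + push(filename.encode("utf-8")) + push(data)


payload = encode_file("hello.txt", b"hello\n")
print(payload.hex())
# prints 0466696c650968656c6c6f2e7478740668656c6c6f0a
```

The printed hex matches the example once the `0x` prefixes and spaces are removed.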


31 changes: 31 additions & 0 deletions protocols/1.2-data.md
@@ -0,0 +1,31 @@
# data in OP_RETURN

Protocol identifier: `0x64617461`

## Specification

The protocol identifier is a `PUSHDATA` data element `0x04 0x64617461`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `data`.

This should be followed by two further `PUSHDATA` data elements: `<type> <data>`

### Type

It is recommended that the `type` field contain a UTF-8 character string describing the data type. A list of well-recognised types may be the subject of a later specification. However, allowed values are not currently part of this specification.

Example types might include: `bin`, `json`, `jpeg`, `xml`, `html`, `pdf`, `txt`. IANA content types may be supported as well.


### Data
Any length of data in any format. However, it is recommended that the data comply with the format specified in the `type` field.

### Example

An example of a data item is as follows:

```
OP_RETURN 0x04 0x64617461 0x04 0x6a736f6e 0x0f 0x7b226b6579223a2276616c7565227d
```

This is data of type `json` with a data value of `{"key":"value"}`.
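Going the other way, a consumer could split the bytes after `OP_RETURN` back into elements and interpret the payload by its `type`. The sketch below is illustrative only: `parse_pushes` is a hypothetical helper that handles direct pushes of 1 to 75 bytes, and decoding `json` payloads with Python's `json` module is an assumption about how a handler might use the `type` field, not something this specification requires.

```python
import json
from typing import List


def parse_pushes(raw: bytes) -> List[bytes]:
    """Split raw bytes into data elements (direct pushes of 1..75 bytes only)."""
    elements, i = [], 0
    while i < len(raw):
        n = raw[i]
        if not 1 <= n <= 75 or i + 1 + n > len(raw):
            raise ValueError("unsupported or truncated push")
        elements.append(raw[i + 1:i + 1 + n])
        i += 1 + n
    return elements


# The example above with the 0x prefixes and spaces removed.
raw = bytes.fromhex("0464617461046a736f6e0f7b226b6579223a2276616c7565227d")
protocol, type_, data = parse_pushes(raw)

# Only 'json' is decoded in this sketch; other types are left as raw bytes.
value = json.loads(data) if type_ == b"json" else data
print(protocol, type_, value)
```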


5 changes: 5 additions & 0 deletions protocols/2.0-markup-protocols.md
@@ -0,0 +1,5 @@
# Markup protocols

Markup protocols are a family of protocols intended to encapsulate other protocols whilst marking them up with additional data. The general pattern is demonstrated by the `'meta'` protocol which wraps any other protocol with JSON key value pairs as metadata.


26 changes: 26 additions & 0 deletions protocols/2.1-meta.md
@@ -0,0 +1,26 @@
# meta in OP_RETURN

Protocol identifier: `0x6d657461`

## Description

The `meta` protocol is a mechanism for adding metadata to any other protocol by encapsulating it.

## Specification

The protocol identifier is a `PUSHDATA` data element `0x04 0x6d657461`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `meta`.

This should be followed by a further `PUSHDATA` data element: `<metadata>`, which is a JSON-encoded set of key-value pairs. The key-value pairs remain unspecified.

Typos: "metadate", "a a". Should be mentioned the data is UTF-8 encoded?


This should be followed by another protocol id and its subsequent data elements in their entirety.

### Example

We may, for example, wish to add authorship and encoding data to a text data item. Using the `data` protocol we could do it like this:

`'meta' '{"author": "alice", "encoding":"utf8"}' 'data' 'txt' 'I am a fish'`

Which would encode to:
`0x04 0x6d657461 0x26 0x7b22617574686f72223a2022616c696365222c2022656e636f64696e67223a2275746638227d 0x04 0x64617461 0x03 0x747874 0x0b 0x4920616d20612066697368`
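A handler for `meta` might separate the metadata from the wrapped protocol along these lines. This is a minimal sketch under stated assumptions: the elements are taken to be already parsed into a list of byte strings, `unwrap_meta` is a hypothetical name, and the JSON is assumed to be UTF-8 encoded, which the specification does not currently state.

```python
import json
from typing import Any, Dict, List, Tuple


def unwrap_meta(elements: List[bytes]) -> Tuple[Dict[str, Any], List[bytes]]:
    """Return (metadata, inner protocol elements) for a 'meta' wrapped sequence."""
    if len(elements) < 3 or elements[0] != b"meta":
        raise ValueError("expected 'meta' <metadata> <protocol id> ...")
    # Assumption: the metadata element is UTF-8 encoded JSON.
    metadata = json.loads(elements[1].decode("utf-8"))
    return metadata, elements[2:]


elements = [
    b"meta",
    b'{"author": "alice", "encoding":"utf8"}',
    b"data",
    b"txt",
    b"I am a fish",
]
metadata, inner = unwrap_meta(elements)
print(metadata["author"], inner[0])
```

The returned `inner` list starts with the wrapped protocol id (`data` here), so it can be handed to that protocol's handler unchanged.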


58 changes: 58 additions & 0 deletions protocols/3.0-transformation-protocols.md
@@ -0,0 +1,58 @@
# Transformation protocols

Transformation protocols are intended to modify a data element to produce another valid protocol. The general pattern is that a protocol is encapsulated within the transformation protocol, and once the transformation handler has completed, the output should be another valid protocol. Two examples of this might be `'gzip'` and `'encrypt'`.

# Pattern

Transformation protocols should be able to specify their own parameter patterns. The number of parameters implicitly specifies the index of the first data element they operate on, that is, the one immediately following the last parameter. For example, if the fictitious protocol `'decompress'` took one parameter `<algorithm>`, that parameter is the 2nd data element in the `OP_RETURN` and the data to be transformed begins at the 3rd element. For optional parameters, positional integrity can be maintained by using the null data element for any optional parameter you don't wish to specify. A parameter may be another transformation protocol id, so that transformations may be nested (e.g. `'tar' 'gzip'`).

Taking `'gzip'` as an example there are two modes of operation possible.

1/ If the transformation protocol and its parameters are followed by only a single data element, it is expected to apply a transformation producing an array of bytes that should then be parsed for a sequence of `PUSHDATA` data elements. The output of the transformation protocol is this newly parsed sequence of data elements.

2/ If the transformation protocol and its parameters are followed by more than one data element, it is expected that the data elements will be a sequence of pairs. Each pair is a boolean flag `OP_0` (`0x00`) or `OP_1` (`0x01`), followed by another data element. The boolean flag indicates whether to apply the transformation to that data element. The output of the transformation should be the sequence of optionally transformed data elements, excluding the boolean flags.

In both cases the resulting sequence should be treated as a protocol itself, with the first data element expected to be a protocol identifier which indicates which handler to pass the data to.
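The two modes can be sketched generically, with the transformation passed in as a function. Everything named here is hypothetical: `run_transformation` and `parse_pushes` are illustrative helpers, the toy `reverse` transformation merely stands in for something like decompression, and the flags are assumed to arrive as the single-byte values `0x00` and `0x01` after parsing. Only direct pushes of 1 to 75 bytes are handled.

```python
from typing import Callable, List


def parse_pushes(raw: bytes) -> List[bytes]:
    """Split raw bytes into data elements (direct pushes of 1..75 bytes only)."""
    elements, i = [], 0
    while i < len(raw):
        n = raw[i]
        if not 1 <= n <= 75 or i + 1 + n > len(raw):
            raise ValueError("unsupported or truncated push")
        elements.append(raw[i + 1:i + 1 + n])
        i += 1 + n
    return elements


def run_transformation(following: List[bytes],
                       transform: Callable[[bytes], bytes]) -> List[bytes]:
    """`following` holds the elements after the protocol id and its parameters."""
    if len(following) == 1:
        # Mode 1: transform the single element, then parse the result.
        return parse_pushes(transform(following[0]))
    if not following or len(following) % 2:
        raise ValueError("mode 2 expects (flag, element) pairs")
    out = []
    for flag, element in zip(following[0::2], following[1::2]):
        if flag not in (b"\x00", b"\x01"):
            raise ValueError("flag must be 0x00 or 0x01")
        # Mode 2: transform only the flagged elements and drop the flags.
        out.append(transform(element) if flag == b"\x01" else element)
    return out


def reverse(data: bytes) -> bytes:
    """Toy stand-in for a real transformation such as decompression."""
    return data[::-1]


serialised = bytes.fromhex("0466696c650968656c6c6f2e7478740668656c6c6f0a")
mode1 = run_transformation([reverse(serialised)], reverse)
mode2 = run_transformation(
    [b"\x00", b"file", b"\x00", b"hello.txt", b"\x01", b"\nolleh"], reverse)
print(mode1 == mode2)
```

Both calls yield the same `'file' <filename> <data>` sequence, whose first element then selects the next handler.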

## Example

### Transformation of a single data element

##### Gzip a single data element:

`'gzip' <file_protocol_gz>`

Which should be returned from the gzip handler as:

`'file' <filename> <data>`

Which is then passed to the `file` protocol handler.

### Transformation of multiple data elements

##### Compressed FILE data but uncompressed FILENAME:

`'gzip' 0 'file' 0 <filename> 1 <data_gz>`

This should be passed to the gzip handler which should output:

`'file' <filename> <data>`

Which is then passed to the `file` protocol handler.

### Nested transformations

Using the fictitious transformation protocol `'tar'` which concatenates multiple `'file'` elements into a single output:

`'gzip' 'tar' <file_protocol_tar_gz>`

First the `'gzip'` transformation handler will decompress the data element yielding:

`'tar' <file_protocol_tar>`

The `'tar'` handler then transforms the output of `'gzip'` into a list of embedded file protocol elements giving:

`'file' <filename1> <data1>` `'file' <filename2> <data2>`



60 changes: 60 additions & 0 deletions protocols/3.1-gzip.md
@@ -0,0 +1,60 @@
# GZIP protocol identifier

Protocol identifier: `0x677a6970`

GZIP is a transformation protocol intended to encapsulate another protocol.

## Specification

The protocol identifier is a `PUSHDATA` data element `0x04 0x677a6970`: a 1-byte length prefix followed by the 4-byte identifier, which is the ASCII string `gzip`.

The `gzip` protocol identifier is intended to transform another `OP_RETURN` based protocol by adding two simple rules:

If the `gzip` identifier is present and only one `PUSHDATA` element follows, that element contains gzipped data. It should be ungzipped, then the resulting array of bytes should be parsed for `PUSHDATA` elements and treated as another protocol, with the first element being a protocol identifier.

If the `gzip` identifier is present and more than one `PUSHDATA` element follows, then it should be assumed that each data element is prefixed with a single-byte flag indicating whether the following data element is gzipped. `OP_0` (`0x00`) indicates it is not, `OP_1` (`0x01`) indicates that it is. The `gzip` handler should decompress any compressed elements and strip out the flag elements, returning a decompressed series of `PUSHDATA` elements. This sequence should then be interpreted as a protocol, with the first of these data elements as the protocol identifier.

This protocol is stream friendly and can be implemented as a pipeline of stream handlers: `gzip_handler -> sub_handler`
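The `gzip_handler -> sub_handler` pipeline could look roughly like the following for the flagged-pair form, using Python's standard `gzip` module. This is a sketch under assumptions rather than a reference implementation: `gzip_wrap`, `gzip_handler`, `file_handler` and the `HANDLERS` table are hypothetical names, elements are assumed to be already parsed into byte strings, and flags are represented as the single bytes `0x00` and `0x01`. The single-element form is not covered.

```python
import gzip
from typing import List, Tuple


def gzip_wrap(elements: List[bytes], compress: List[bool]) -> List[bytes]:
    """Producer side: prefix each element with a flag, gzipping where requested."""
    out = [b"gzip"]
    for element, flagged in zip(elements, compress):
        out.append(b"\x01" if flagged else b"\x00")
        out.append(gzip.compress(element) if flagged else element)
    return out


def gzip_handler(elements: List[bytes]) -> List[bytes]:
    """Consumer side: decompress flagged elements and strip the flags."""
    body = elements[1:]
    if elements[0] != b"gzip" or len(body) < 2 or len(body) % 2:
        raise ValueError("this sketch covers the flagged-pair form only")
    return [gzip.decompress(e) if f == b"\x01" else e
            for f, e in zip(body[0::2], body[1::2])]


def file_handler(elements: List[bytes]) -> Tuple[str, bytes]:
    """Hypothetical sub-handler for 'file' <filename> <data>."""
    _, filename, data = elements
    return filename.decode("utf-8"), data


HANDLERS = {b"file": file_handler}

# 'gzip' 0 'file' 0 <filename> 1 <data_gz>
wrapped = gzip_wrap([b"file", b"hello.txt", b"hello\n"], [False, False, True])
inner = gzip_handler(wrapped)
result = HANDLERS[inner[0]](inner)  # dispatch on the first element
print(result)
```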

### Examples

##### Gzip a single data element:

`'gzip' <file_protocol_gz>`

Which should be returned from the gzip handler as:

`'file' <filename> <data>`

Which is then passed to the `file` protocol handler.

##### Compressed FILE data but uncompressed FILENAME:

`'gzip' 0 'file' 0 <filename> 1 <data_gz>`

This should be passed to the gzip handler which should output:

`'file' <filename> <data>`

Which is then passed to the `file` protocol handler.

##### Compressed FILE and FILENAME:

`'gzip' 0 'file' 1 <filename_gz> 1 <data_gz>`

This should be passed to the gzip handler which should output:

`'file' <filename> <data>`

Which is then passed to the `file` protocol handler.

##### Compressed DATA but uncompressed TYPE:

`'gzip' 0 'data' 0 <type> 1 <data_gz>`

This should be passed to the gzip handler which should output:

`'data' <type> <data>`

Which is then passed to the `data` protocol handler.