Skip to content

Latest commit

 

History

History
205 lines (162 loc) · 6.81 KB

bip-0276.mediawiki

File metadata and controls

205 lines (162 loc) · 6.81 KB

  BIP: 276
  Layer: Applications
  Title: Scheme for encoding typed bitcoin data
  Author: Roger Taylor <[email protected]>
  Status: Draft
  Type: Standards Track
  Created: 2019-10-11
  License: PD

Table of Contents

Abstract

This BIP proposes a scheme for encoding typed bitcoin related data in a user friendly way.

Motivation

In the past, due to the constraints placed on the Bitcoin protocol, the ways it could be used were artificially limited and in most cases it was enough to share addresses, or to copy and paste them where needed. Because addresses were relatively short, and it was possible that a user might choose to type them in manually, base58 was a useful choice.

With the restoration of the protocol, and the wider variety of possibilities like payment destinations that are scripts which lack the ability to be summarised by an address, it becomes necessary to define a way to copy and paste data of arbitrarily length. No real person will be typing addresses, or scripts by hand. We do not need to care about the shorter length, or the danger of characters being transcribed. Instead we should aim to give the user a general context of what they are pasting, by using a textual prefix. And we should aim to give developers some easy insight into the data by not attempting to pack the data, and providing a format that can to some degree be read visually, by encoding all data following the prefix in hexadecimal.

We intentionally retain what base58 offers, including the relevant network, a version for the prefixed data, and a checksum using the same algorithm. However, we aim to do so in a less opaque manner.

Specification

The scheme follows this form:

 <prefix>:<version hex><network hex><data hex><checksum hex>

The version relates to this encoding structure. It has no bearing on the format of the embedded data, any type of data encoded in this structure that has it's own subformat is responsible for taking care of it's own versioning.

Where the currently supported versions are:

Version Description
1 The initial version.

Where the currently supported prefixes are:

Prefix Description
bitcoin-script A complete script as intended to be included in a transaction output.
bitcoin-template ... TODO: some standardised specification for incomplete scripts

Where the encoded segments are:

Segment Structure Description
version hex unsigned byte[1] as hexadecimal Provides the ability to update the structure of the data that follows it.
network hex unsigned byte[1] as hexadecimal Provides the ability to specify that the data is only valid for use on the given network.
data hex unsigned byte[] as hexadecimal The payload data.
checksum hex unsigned byte[4] as hexadecimal Provides the ability to detect whether the complete encoding is fully intact.

Where the currently supported network byte values are:

Network Value
not applicable 0
mainnet 1
testnet 2

The data bytes are of arbitrary length, and are followed by four checksum bytes. The checksum is created by taking the first four bytes of the double SHA256 hash of the text preceding it. Note that it is not the checksum of the data segment, but the checksum of the rest of the encoded string, and is intended to ensure the entire encoded text is intact.

The checksum is calculated over the prefix and all other segments:

 <prefix>:<version hex><network hex><data hex>

Then the hexadecimal encoded checksum is appended to that text:

 <prefix>:<version hex><network hex><data hex><checksum hex>

Type: bitcoin-script

The data segment for version 1 encodings, is solely the bytes that make up a script.

Adding types

When looking to add additional types, or to extend the data embedded in a type, it is worth considering whether it is better to wrap data you have encoded in this scheme in a higher level protocol.

One possible extension to the script type, is an expiration. If someone copies a payment destination and UI eventually receives it, it is useful to be able to warn the user that the given data is old and they might want to obtain a fresh version. However, it is more than likely a much better fit to include the encoded script with context in a higher level protocol.

Another possible extension is what the script is, and how it should be used. Is it spendable? Is it an output script? Is it an input script? These are also things that can defined based on the context the encoded script is supplied. An example of a higher level protocol might be one of the URL formats that includes an address. If an encoded script is included in lieu of an address, then it is implicitly a payment destination and given in the context of an amount and potentially other related data.

Examples

Python

from hashlib import sha256
from typing import Tuple

PREFIX_SCRIPT = "bitcoin-script"
PREFIX_TEMPLATE = "bitcoin-template"
CURRENT_VERSION = 1
NETWORK_MAINNET = 1

def _checksum(data: bytes) -> bytes:
    return sha256(sha256(data).digest()).digest()[0:4]

def bip276_encode(prefix: str, data: bytes, network:int=NETWORK_MAINNET,
        version:int=CURRENT_VERSION) -> str:
    assert version == CURRENT_VERSION
    payload_bytes = bytearray()
    payload_bytes.append(version)
    payload_bytes.append(network)
    payload_bytes.extend(data)
    payload_hex = payload_bytes.hex()
    result = prefix +":"+ payload_hex
    return result + _checksum(result.encode()).hex()

class ChecksumMismatchError(Exception): pass

def bip276_decode(text: str) -> Tuple[str, int, int, bytes]:
    text = text.strip()
    prefix, payload_hex = text.split(":", 1)
    payload_bytes = bytes.fromhex(payload_hex)
    checksum = payload_bytes[-4:]
    version = payload_bytes[0]
    assert version == CURRENT_VERSION
    network = payload_bytes[1]
    data = payload_bytes[2:-4]
    checksummed_bytes = text[:-8].encode()
    local_checksum = _checksum(checksummed_bytes)
    if checksum != local_checksum:
        raise ChecksumMismatchError(f"expected {checksum.hex()}, got {local_checksum.hex()}")
    return prefix, version, network, data

Expected output:

>>> s = bip276_encode(PREFIX_SCRIPT, b"fake script")
>>> s
'bitcoin-script:010166616b65207363726970746f0cd86a'
>>> bip276_decode(s)
('bitcoin-script', 1, 1, b'fake script')

Acknowledgements

Thanks go to Curtis Ellis for his assistance in developing this encoding.

Copyright

This document is placed in the public domain.