This repository has been archived by the owner on Jun 29, 2022. It is now read-only.
bitcoin: add bitcoin docs (WIP) #270
Since endianness is usually defined over a multibyte integer type, I am for real not sure which type of "little endian" is meant here (and casual googling doesn't help). If I see the following 128-bit payload on disk:
00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
What is the actual value?
1. 33 22 11 00 77 66 55 44 bb aa 99 88 ff ee dd cc
2. 77 66 55 44 33 22 11 00 ff ee dd cc bb aa 99 88
3. ff ee dd cc bb aa 99 88 77 66 55 44 33 22 11 00
Alternatively, if the on-disk structures are explicitly defined over integer types wider than 64 bits, this needs to be called out early, so folks like me get in the right mindset.
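For reference, the three candidate readings above can be reproduced with a small sketch. The helper name `reverse_in_words` is made up for illustration and is not taken from any spec or library; it simply reverses the payload in fixed-width chunks of 4, 8, and 16 bytes.

```python
def reverse_in_words(data: bytes, width: int) -> bytes:
    """Reverse the byte order inside each consecutive `width`-byte chunk.

    Hypothetical helper for illustration only.
    """
    if len(data) % width != 0:
        raise ValueError("payload length must be a multiple of the word width")
    return b"".join(data[i:i + width][::-1] for i in range(0, len(data), width))


# The 128-bit example payload from the comment above.
payload = bytes.fromhex("00112233445566778899aabbccddeeff")

as_u32_words = reverse_in_words(payload, 4)   # four little-endian 32-bit words
as_u64_words = reverse_in_words(payload, 8)   # two little-endian 64-bit words
as_one_value = reverse_in_words(payload, 16)  # one little-endian 128-bit value

for candidate in (as_u32_words, as_u64_words, as_one_value):
    print(candidate.hex())
```

The point of the sketch is that "little endian" only picks one of these once the width of the integer type is stated.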
3, as if you read it entirely as a single 32-byte unsigned integer in LE: the presented value is the whole byte string reversed. "Usually defined over a multibyte integer type" is what's being got at here, but the type is 32 bytes wide, not some repeating sub-pattern.
The "as if" makes me think this is leaning too heavily on the "uint256" thing. I'm tempted to remove that language entirely and say it's just a byte string that, by convention, gets byte-reversed and turned into hexadecimal when presented publicly.
In truth, I never touch this "uint256" thing myself in any of my code. I treat all of these things as byte arrays and then reverse+hexadecimal whenever I need to present the value. Otherwise they're only useful as byte arrays. So I guess that fact in itself suggests backing out of this concept. It's really just window dressing to make the zeros go at the start of block addresses.
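A minimal sketch of the reverse+hexadecimal convention, next to the "uint256" reading, to show they produce the same string. The function names and the 32-byte placeholder value are assumptions for illustration, not code from the PR.

```python
def display_hex(raw: bytes) -> str:
    """Byte-string view: reverse the bytes, then hex-encode."""
    return raw[::-1].hex()


def display_hex_via_int(raw: bytes) -> str:
    """"uint256" view: read as one little-endian integer, print zero-padded hex."""
    value = int.from_bytes(raw, "little")
    return format(value, "0{}x".format(2 * len(raw)))


# Placeholder 32-byte value, not a real block hash.
digest = bytes(range(32))

# Both views yield the same presentation string.
assert display_hex(digest) == display_hex_via_int(digest)
print(display_hex(digest))
```

Since the two agree, the integer framing adds nothing the plain byte-string description doesn't already cover.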
Ugh... didn't we kinda/sorta agree among ourselves that floats are there but not to be used, and discouraged at all costs? :)
OK, I need to work on the language around this more, but the lead-in to this particular snippet of Schema says "the Header may be represented". There's no binary involved in `difficulty`, it's purely something that's only present in the Data Model, just like `versionHex`. It's also not critical that an implementation of an IPLD Bitcoin codec include this since a consumer could derive it for themselves. I've done it so I can generate almost exactly matching JSON to compare against the Bitcoin Core RPC's output to validate my data.
There's more of this kind of awkwardness coming up, particularly when it comes to script decoding! When it comes to codecs like Bitcoin, what do we present at the Data Model layer? What does it mean to "decode" Bitcoin data? Do we only present exactly what we find in the bytes? Does that mean we leave out the implicit Links (I'm inserting a `tx` CID to link to transactions and leaving `merkleroot` as a byte array, so it's essentially duplicated but not navigable in the raw form)? There's a bunch of different points along the "decode" spectrum that can be chosen and I think I'll have to write that up below and leave it to implementers to decide how far along they go.
I went the whole way with Bitcoin. I can even fully decode scripts into their string form and present them as the Bitcoin Core RPC does, so my decode/encode validation goes both ways, compares against the authoritative output, and only needs to discount the few chain-contextual pieces of data (`chainwork` etc.). Even as I'm working on Zcash (again), I think there's parts of it where I'm probably going to stop short and shortcut my validation because there's definitely diminishing value as you get into more detail.
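To make "a consumer could derive it for themselves" concrete, here is a rough sketch of deriving both fields. It assumes the decoded Header exposes the standard Bitcoin compact `bits` value and the integer `version`; those inputs and all function names here are assumptions for illustration, not the codec's actual API.

```python
# Target that corresponds to difficulty 1 (compact form 0x1d00ffff).
MAX_TARGET = 0xFFFF << 208


def bits_to_target(bits: int) -> int:
    """Expand the compact `bits` encoding into the full integer target."""
    exponent = bits >> 24            # top byte: size of the target in bytes
    mantissa = bits & 0x007FFFFF     # low 23 bits; the sign bit is ignored here
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def difficulty(bits: int) -> float:
    """Difficulty relative to the difficulty-1 target (assumes a non-zero target)."""
    return MAX_TARGET / bits_to_target(bits)


def version_hex(version: int) -> str:
    """Version as 8 hex digits, using the unsigned 32-bit representation."""
    return format(version & 0xFFFFFFFF, "08x")


print(difficulty(0x1D00FFFF))
print(version_hex(536870912))
```

Neither value needs anything beyond what is already in the header bytes, which is why they can live purely in the Data Model.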
Technically, past ~2140, when everyone working on this is dead, this may no longer be true ;P
Do you mean that it's only true if people are transacting on Bitcoin and beyond ~2140 there may no longer be transactions? It's still going to be true as long as someone is mining Bitcoin because there's always a coinbase. There cannot exist a "bitcoin block graph" without at least one transaction!
I'm looking through Zcash right now and it's kind of sad how many coinbase-only transactions there are near the head. It makes it look like it now exists to be mined ...
This, together with the #int64 below, almost makes one want to say "ipld schema integers are of arbitrary precision", and leave it up to the codecs when to switch the wire-representation, and leave it to codecs when to use a language-internal `bigint` and when to use a native integer.
This has probably been discussed already, so feel free to ignore with no further discussion.
This goes together with the endianness discussion above: being an integer and a string at the same time can't be a thing.