Merge #1670: Introduce O(n) canonicalization algorithm
956d0a9 test(chain): Update test docs to stop referencing `get_chain_position` (志宇)
1508ae3 chore: Fix typos (志宇)
af92199 refactor(wallet): Reuse chain position instead of obtaining new one (Jiri Jakes)
caa0f13 docs(wallet): Explain `.take` usage (志宇)
1196405 refactor(chain): Reorganize `TxGraph::insert_anchor` logic for clarity (志宇)
4706315 chore(chain): Address `CanonicalIter` nitpicks (志宇)
68f7b77 test(chain): Add canonicalization test (志宇)
da0c43e refactor(chain)!: Rename `LastSeenIn` to `ObservedIn` (志宇)
d4102b4 perf(chain): add benchmarks for canonicalization logic (志宇)
e34024c feat(chain): Derive `Clone` on `IndexedTxGraph` (志宇)
e985445 docs: Add ADR for `O(n)` canonicalization algorithm (志宇)
4325e2c test(chain): Add transitive anchor tests (志宇)
8fbee12 feat(chain)!: rm `get_chain_position` and associated methods (志宇)
582d6b5 feat(chain)!: `O(n)` canonicalization algorithm (志宇)
f6192a6 feat(chain)!: Add `run_until_finished` methods (志宇)
0aa39f9 feat(chain)!: `TxGraph` contain anchors in one field (志宇)

Pull request description:

  Fixes #1665
  Replaces #1659

  ### Description

  Previously, getting the canonical history of transactions/UTXOs required calling `TxGraph::get_chain_position` on each transaction. This was highly inefficient and resulted in an `O(n^2)` algorithm. The situation is especially problematic when we have many unconfirmed conflicts.

  This PR introduces an `O(n)` algorithm to determine the canonical set of transactions in `TxGraph`. The algorithm's premise is as follows:

  1. If transaction `A` is determined to be canonical, all of `A`'s ancestors must also be canonical.
  2. If transaction `B` is determined to be NOT canonical, all of `B`'s descendants must also be NOT canonical.
  3. If a transaction is anchored in the best chain, it is canonical.
  4. If a transaction conflicts with a canonical transaction, it is NOT canonical.
  5. A transaction with a higher last-seen has precedence.
  6. Last-seen values are transitive. A transaction's collective last-seen value is the max of its own last-seen value and those of all of its descendants.
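
Premise 6 can be read as a small recursive definition. The sketch below only illustrates that definition on a toy graph. `Txid` as a `u32`, the `children` and `last_seen` maps, and the function name `collective_last_seen` are assumptions made for the example, not items from `bdk_chain`, and the PR does not necessarily compute the value this way.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `bitcoin::Txid`.
type Txid = u32;

/// Collective last-seen of `txid`: the max of its own last-seen value and the
/// collective last-seen values of all of its descendants (premise 6).
/// `memo` caches results so every transaction is evaluated once.
fn collective_last_seen(
    txid: Txid,
    children: &HashMap<Txid, Vec<Txid>>,
    last_seen: &HashMap<Txid, u64>,
    memo: &mut HashMap<Txid, Option<u64>>,
) -> Option<u64> {
    if let Some(cached) = memo.get(&txid) {
        return *cached;
    }
    // Start from the transaction's own last-seen value, if any.
    let mut best = last_seen.get(&txid).copied();
    // `Option<u64>` orders `None` below every `Some(_)`, so `max` works directly.
    for &child in children.get(&txid).into_iter().flatten() {
        best = best.max(collective_last_seen(child, children, last_seen, memo));
    }
    memo.insert(txid, best);
    best
}
```

With a chain `1 -> 2 -> 3` where only `1` (seen at 5) and `3` (seen at 9) have last-seen values, all three transactions end up with a collective last-seen of 9.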

  We maintain two mutually exclusive `txid` sets: `canonical` and `not_canonical`.

  Imagine a method `mark_canonical(A)` that is based on premises 1 and 2. This method will mark transaction `A` and all of its ancestors as canonical. For each transaction that is marked canonical, we can iterate over all of its conflicts and mark those as `not_canonical`. If a transaction already exists in `canonical` or `not_canonical`, we can break early, avoiding duplicate work.

  This algorithm iterates transactions in 3 runs.

  1. Iterate over all transactions with anchors in descending anchor-height order. For any transaction that has an anchor pointing to the best chain, we call `mark_canonical` on it. We iterate in descending-height order to reduce the number of anchors we need to check against the `ChainOracle` (premise 1). The purpose of this run is to populate `not_canonical` with all transactions that directly conflict with anchored transactions and populate `canonical` with all anchored transactions and ancestors of anchored transactions (transitive anchors).
  2. Iterate over all transactions with last-seen values, in descending last-seen order. We can call `mark_canonical` on all of these that do not already exist in `canonical` or `not_canonical`.
  3. Iterate over remaining transactions that contain anchors (but not in the best chain) and have no last-seen value. We treat these transactions in the same way as we do in run 2.
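
The three runs, together with `mark_canonical`, can be sketched on a toy model as follows. This is a minimal illustration under stated assumptions, not the code in `canonical_iter.rs`: `ToyTx`, `State`, `canonicalize`, the `u32` txids, and the `(height, in_best_chain)` anchor tuple are all hypothetical, and the `ChainOracle` check is replaced by a precomputed boolean.

```rust
use std::cmp::Reverse;
use std::collections::{HashMap, HashSet};

type Txid = u32; // hypothetical stand-in for `bitcoin::Txid`
type OutPoint = (Txid, u32); // (txid, vout)

struct ToyTx {
    txid: Txid,
    spends: Vec<OutPoint>,
    anchor: Option<(u32, bool)>, // (height, is the anchor block in the best chain?)
    last_seen: Option<u64>,
}

#[derive(Default)]
struct State<'a> {
    by_id: HashMap<Txid, &'a ToyTx>,
    spenders: HashMap<OutPoint, Vec<Txid>>, // every tx spending a given outpoint
    children: HashMap<Txid, Vec<Txid>>,     // direct descendants of a tx
    canonical: HashSet<Txid>,
    not_canonical: HashSet<Txid>,
}

impl<'a> State<'a> {
    fn classified(&self, id: Txid) -> bool {
        self.canonical.contains(&id) || self.not_canonical.contains(&id)
    }

    /// Mark `txid` and its ancestors canonical (premise 1); conflicts of each
    /// newly canonical tx become not canonical (premise 4).
    fn mark_canonical(&mut self, txid: Txid) {
        let mut stack = vec![txid];
        while let Some(id) = stack.pop() {
            let tx: &'a ToyTx = match self.by_id.get(&id) {
                Some(tx) if !self.classified(id) => *tx,
                _ => continue, // unknown or already classified: break early
            };
            self.canonical.insert(id);
            for op in &tx.spends {
                let rivals: Vec<Txid> =
                    self.spenders[op].iter().copied().filter(|r| *r != id).collect();
                for rival in rivals {
                    self.mark_not_canonical(rival);
                }
                stack.push(op.0); // continue with the parent
            }
        }
    }

    /// Mark `txid` and its descendants not canonical (premise 2).
    fn mark_not_canonical(&mut self, txid: Txid) {
        let mut stack = vec![txid];
        while let Some(id) = stack.pop() {
            if self.classified(id) {
                continue;
            }
            self.not_canonical.insert(id);
            if let Some(kids) = self.children.get(&id) {
                stack.extend(kids.iter().copied());
            }
        }
    }
}

fn canonicalize(txs: &[ToyTx]) -> HashSet<Txid> {
    let mut st = State::default();
    for tx in txs {
        st.by_id.insert(tx.txid, tx);
        for op in &tx.spends {
            st.spenders.entry(*op).or_default().push(tx.txid);
            st.children.entry(op.0).or_default().push(tx.txid);
        }
    }
    // Run 1: anchored txs, descending anchor height; best-chain anchors win.
    let mut anchored: Vec<&ToyTx> = txs.iter().filter(|t| t.anchor.is_some()).collect();
    anchored.sort_by_key(|t| Reverse(t.anchor.map(|(h, _)| h)));
    for tx in &anchored {
        if matches!(tx.anchor, Some((_, true))) {
            st.mark_canonical(tx.txid);
        }
    }
    // Run 2: txs with a last-seen value, descending (premise 5).
    let mut seen: Vec<&ToyTx> = txs.iter().filter(|t| t.last_seen.is_some()).collect();
    seen.sort_by_key(|t| Reverse(t.last_seen));
    for tx in &seen {
        st.mark_canonical(tx.txid);
    }
    // Run 3: anchored (not in the best chain) txs without a last-seen value.
    for tx in txs.iter().filter(|t| t.anchor.is_some() && t.last_seen.is_none()) {
        st.mark_canonical(tx.txid);
    }
    st.canonical
}
```

Because run 2 visits transactions in descending last-seen order and `mark_canonical` pulls in ancestors, an ancestor inherits the precedence of its most recently seen descendant without premise 6 being computed explicitly. Each transaction enters exactly one of the two sets once, which is what keeps the sketch linear apart from the initial sorts.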

  #### Benchmarks

  Thank you to @ValuedMammal for working on this.

  ```sh
  $ cargo bench -p bdk_chain --bench canonicalization
  ```

  Benchmark results (this PR):

  ```
  many_conflicting_unconfirmed::list_canonical_txs
                          time:   [709.46 us 710.36 us 711.35 us]
  many_conflicting_unconfirmed::filter_chain_txouts
                          time:   [712.59 us 713.23 us 713.90 us]
  many_conflicting_unconfirmed::filter_chain_unspents
                          time:   [709.95 us 711.16 us 712.45 us]
  many_chained_unconfirmed::list_canonical_txs
                          time:   [2.2604 ms 2.2641 ms 2.2680 ms]
  many_chained_unconfirmed::filter_chain_txouts
                          time:   [3.5763 ms 3.5869 ms 3.5979 ms]
  many_chained_unconfirmed::filter_chain_unspents
                          time:   [3.5540 ms 3.5596 ms 3.5652 ms]
  nested_conflicts_unconfirmed::list_canonical_txs
                          time:   [660.06 us 661.75 us 663.60 us]
  nested_conflicts_unconfirmed::filter_chain_txouts
                          time:   [650.15 us 651.36 us 652.71 us]
  nested_conflicts_unconfirmed::filter_chain_unspents
                          time:   [658.37 us 661.54 us 664.81 us]
  ```

  Benchmark results (master): https://github.com/evanlinjin/bdk/tree/fix/1665-master-bench

  ```
  many_conflicting_unconfirmed::list_canonical_txs
                          time:   [94.618 ms 94.966 ms 95.338 ms]
  many_conflicting_unconfirmed::filter_chain_txouts
                          time:   [159.31 ms 159.76 ms 160.22 ms]
  many_conflicting_unconfirmed::filter_chain_unspents
                          time:   [163.29 ms 163.61 ms 163.96 ms]

  # I gave up running the rest of the benchmarks since they were taking too long.
  ```
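
To put the two listings side by side, the ratio of the middle (point-estimate) values can be computed directly. The snippet below is only a convenience for that arithmetic. The numbers are copied from the `many_conflicting_unconfirmed` results above and converted to microseconds, and the names `RESULTS` and `speedups` are made up for this example.

```rust
/// (benchmark, this PR in us, master in us), middle values from the listings above.
const RESULTS: [(&str, f64, f64); 3] = [
    ("list_canonical_txs", 710.36, 94_966.0),
    ("filter_chain_txouts", 713.23, 159_760.0),
    ("filter_chain_unspents", 711.16, 163_610.0),
];

/// Speedup factor of this PR over master for each benchmark.
fn speedups() -> Vec<(&'static str, f64)> {
    RESULTS
        .iter()
        .map(|&(name, pr_us, master_us)| (name, master_us / pr_us))
        .collect()
}
```

For this scenario the factors come out at roughly 130x to 230x, which is the gap one would expect when an `O(n^2)` pass over about 2100 conflicting transactions is replaced by a linear one.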

  ### Notes to the reviewers

  * ***PLEASE MERGE #1733 BEFORE THIS PR!*** We had to change the signature of `ChainPosition` to account for transitive anchors and unconfirmed transactions with no `last-seen` value.

  * The canonicalization algorithm is contained in `/crates/chain/src/canonical_iter.rs`.

  * Since the algorithm requires traversing transactions ordered by anchor height, and then by last-seen value, we introduce two index fields in `TxGraph`: `txs_by_anchor` and `txs_by_last_seen`. Methods `insert_anchor` and `insert_seen_at` are changed to populate these index fields.

  * An ADR is added: `docs/adr/0003_canonicalization_algorithm.md`. This is based on the work in #1592.
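
As an illustration of the pre-indexing idea in the notes above, ordered sets keyed by `(height, txid)` and `(last_seen, txid)` make both traversals a plain reverse iteration. This is a sketch under assumptions: `ToyIndex`, its field types, and the two iterator helpers are hypothetical and only mirror the described behaviour of `insert_anchor` and `insert_seen_at`. The real fields in `TxGraph` may be shaped differently.

```rust
use std::collections::{BTreeSet, HashMap};

type Txid = u32; // hypothetical stand-in for `bitcoin::Txid`

#[derive(Default)]
struct ToyIndex {
    last_seen: HashMap<Txid, u64>,
    by_last_seen: BTreeSet<(u64, Txid)>,
    by_anchor_height: BTreeSet<(u32, Txid)>,
}

impl ToyIndex {
    /// Record an anchor at `height` and index it for ordered traversal.
    fn insert_anchor(&mut self, txid: Txid, height: u32) {
        self.by_anchor_height.insert((height, txid));
    }

    /// Keep only the highest last-seen value per tx, replacing the old index entry.
    fn insert_seen_at(&mut self, txid: Txid, seen_at: u64) {
        match self.last_seen.get(&txid).copied() {
            Some(old) if old >= seen_at => return,
            Some(old) => {
                self.by_last_seen.remove(&(old, txid));
            }
            None => {}
        }
        self.last_seen.insert(txid, seen_at);
        self.by_last_seen.insert((seen_at, txid));
    }

    /// Txids in descending anchor-height order (run 1 of the algorithm).
    fn anchored_desc(&self) -> impl Iterator<Item = Txid> + '_ {
        self.by_anchor_height.iter().rev().map(|&(_, txid)| txid)
    }

    /// Txids in descending last-seen order (run 2 of the algorithm).
    fn seen_desc(&self) -> impl Iterator<Item = Txid> + '_ {
        self.by_last_seen.iter().rev().map(|&(_, txid)| txid)
    }
}
```

Paying the ordering cost at insertion time means the canonicalization pass itself never has to sort, at the price of a little extra bookkeeping in the two insert methods.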

  ### Changelog notice

  * Added: Introduce an `O(n)` canonicalization algorithm. This logic is contained in `/crates/chain/src/canonical_iter.rs`.
  * Added: Indexing fields in `TxGraph`: `txs_by_anchor_height` and `txs_by_last_seen`. Pre-indexing allows us to construct the canonical history more efficiently.
  * Removed: `TxGraph` methods: `try_get_chain_position` and `get_chain_position`. This is superseded by the new canonicalization algorithm.

  ### Checklists

  #### All Submissions:

  * [x] I've signed all my commits
  * [x] I followed the [contribution guidelines](https://github.com/bitcoindevkit/bdk/blob/master/CONTRIBUTING.md)
  * [x] I ran `cargo fmt` and `cargo clippy` before committing

  #### New Features:

  * [x] I've added tests for the new feature
  * [x] I've added docs for the new feature

  #### Bugfixes:

  * [x] This pull request breaks the existing API
  * [x] I've added tests to reproduce the issue which are now passing
  * [x] I'm linking the issue being fixed by this PR

ACKs for top commit:
  ValuedMammal:
    ACK 956d0a9
  nymius:
    ACK 956d0a9
  oleonardolima:
    utACK 956d0a9
  jirijakes:
    ACK 956d0a9

Tree-SHA512: 44963224abf1aefb3510c59d0eb27e3a572cd16f46106fd92e8da2e6e12f0671dcc1cd5ffdc4cc80683bc9e89fa990eba044d9c64d9ce02abc29a08f4859b69e
evanlinjin committed Dec 11, 2024
2 parents ab08b8c + 956d0a9 commit 955593c
Showing 18 changed files with 1,242 additions and 606 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/cont_integration.yml
@@ -51,6 +51,8 @@ jobs:
cargo update -p tokio-util --precise "0.7.11"
cargo update -p indexmap --precise "2.5.0"
cargo update -p security-framework-sys --precise "2.11.1"
cargo update -p csv --precise "1.3.0"
cargo update -p unicode-width --precise "0.1.13"
- name: Build
run: cargo build --workspace --exclude 'example_*' ${{ matrix.features }}
- name: Test
2 changes: 2 additions & 0 deletions README.md
@@ -78,6 +78,8 @@ cargo update -p tokio --precise "1.38.1"
cargo update -p tokio-util --precise "0.7.11"
cargo update -p indexmap --precise "2.5.0"
cargo update -p security-framework-sys --precise "2.11.1"
cargo update -p csv --precise "1.3.0"
cargo update -p unicode-width --precise "0.1.13"
```

## License
1 change: 1 addition & 0 deletions crates/bitcoind_rpc/tests/test_emitter.rs
@@ -389,6 +389,7 @@ fn tx_can_become_unconfirmed_after_reorg() -> anyhow::Result<()> {
assert_eq!(
get_balance(&recv_chain, &recv_graph)?,
Balance {
trusted_pending: SEND_AMOUNT * reorg_count as u64,
confirmed: SEND_AMOUNT * (ADDITIONAL_COUNT - reorg_count) as u64,
..Balance::default()
},
6 changes: 5 additions & 1 deletion crates/chain/Cargo.toml
@@ -28,11 +28,15 @@ rusqlite = { version = "0.31.0", features = ["bundled"], optional = true }
rand = "0.8"
proptest = "1.2.0"
bdk_testenv = { path = "../testenv", default-features = false }

criterion = { version = "0.2" }

[features]
default = ["std", "miniscript"]
std = ["bitcoin/std", "miniscript?/std", "bdk_core/std"]
serde = ["dep:serde", "bitcoin/serde", "miniscript?/serde", "bdk_core/serde"]
hashbrown = ["bdk_core/hashbrown"]
rusqlite = ["std", "dep:rusqlite", "serde"]

[[bench]]
name = "canonicalization"
harness = false
250 changes: 250 additions & 0 deletions crates/chain/benches/canonicalization.rs
@@ -0,0 +1,250 @@
use bdk_chain::{keychain_txout::KeychainTxOutIndex, local_chain::LocalChain, IndexedTxGraph};
use bdk_core::{BlockId, CheckPoint};
use bdk_core::{ConfirmationBlockTime, TxUpdate};
use bdk_testenv::hash;
use bitcoin::{
    absolute, constants, hashes::Hash, key::Secp256k1, transaction, Amount, BlockHash, Network,
    OutPoint, ScriptBuf, Transaction, TxIn, TxOut,
};
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use miniscript::{Descriptor, DescriptorPublicKey};
use std::sync::Arc;

type Keychain = ();
type KeychainTxGraph = IndexedTxGraph<ConfirmationBlockTime, KeychainTxOutIndex<Keychain>>;

/// New tx guaranteed to have at least one output
fn new_tx(lt: u32) -> Transaction {
    Transaction {
        version: transaction::Version::TWO,
        lock_time: absolute::LockTime::from_consensus(lt),
        input: vec![],
        output: vec![TxOut::NULL],
    }
}

fn spk_at_index(txout_index: &KeychainTxOutIndex<Keychain>, index: u32) -> ScriptBuf {
    txout_index
        .get_descriptor(())
        .unwrap()
        .at_derivation_index(index)
        .unwrap()
        .script_pubkey()
}

fn genesis_block_id() -> BlockId {
    BlockId {
        height: 0,
        hash: constants::genesis_block(Network::Regtest).block_hash(),
    }
}

fn tip_block_id() -> BlockId {
    BlockId {
        height: 100,
        hash: BlockHash::all_zeros(),
    }
}

/// Add ancestor tx confirmed at `block_id` with `locktime` (used for uniqueness).
/// The transaction always pays 1 BTC to SPK 0.
fn add_ancestor_tx(graph: &mut KeychainTxGraph, block_id: BlockId, locktime: u32) -> OutPoint {
    let spk_0 = spk_at_index(&graph.index, 0);
    let tx = Transaction {
        input: vec![TxIn {
            previous_output: OutPoint::new(hash!("bogus"), locktime),
            ..Default::default()
        }],
        output: vec![TxOut {
            value: Amount::ONE_BTC,
            script_pubkey: spk_0,
        }],
        ..new_tx(locktime)
    };
    let txid = tx.compute_txid();
    let _ = graph.insert_tx(tx);
    let _ = graph.insert_anchor(
        txid,
        ConfirmationBlockTime {
            block_id,
            confirmation_time: 100,
        },
    );
    OutPoint { txid, vout: 0 }
}

fn setup<F: Fn(&mut KeychainTxGraph, &LocalChain)>(f: F) -> (KeychainTxGraph, LocalChain) {
    const DESC: &str = "tr([ab28dc00/86h/1h/0h]tpubDCdDtzAMZZrkwKBxwNcGCqe4FRydeD9rfMisoi7qLdraG79YohRfPW4YgdKQhpgASdvh612xXNY5xYzoqnyCgPbkpK4LSVcH5Xv4cK7johH/0/*)";
    let cp = CheckPoint::from_block_ids([genesis_block_id(), tip_block_id()])
        .expect("blocks must be chronological");
    let chain = LocalChain::from_tip(cp).unwrap();

    let (desc, _) =
        <Descriptor<DescriptorPublicKey>>::parse_descriptor(&Secp256k1::new(), DESC).unwrap();
    let mut index = KeychainTxOutIndex::new(10);
    index.insert_descriptor((), desc).unwrap();
    let mut tx_graph = KeychainTxGraph::new(index);

    f(&mut tx_graph, &chain);
    (tx_graph, chain)
}

fn run_list_canonical_txs(tx_graph: &KeychainTxGraph, chain: &LocalChain, exp_txs: usize) {
    let txs = tx_graph
        .graph()
        .list_canonical_txs(chain, chain.tip().block_id());
    assert_eq!(txs.count(), exp_txs);
}

fn run_filter_chain_txouts(tx_graph: &KeychainTxGraph, chain: &LocalChain, exp_txos: usize) {
    let utxos = tx_graph.graph().filter_chain_txouts(
        chain,
        chain.tip().block_id(),
        tx_graph.index.outpoints().clone(),
    );
    assert_eq!(utxos.count(), exp_txos);
}

fn run_filter_chain_unspents(tx_graph: &KeychainTxGraph, chain: &LocalChain, exp_utxos: usize) {
    let utxos = tx_graph.graph().filter_chain_unspents(
        chain,
        chain.tip().block_id(),
        tx_graph.index.outpoints().clone(),
    );
    assert_eq!(utxos.count(), exp_utxos);
}

pub fn many_conflicting_unconfirmed(c: &mut Criterion) {
    const CONFLICTING_TX_COUNT: u32 = 2100;
    let (tx_graph, chain) = black_box(setup(|tx_graph, _chain| {
        let previous_output = add_ancestor_tx(tx_graph, tip_block_id(), 0);
        // Create conflicting txs that spend from `previous_output`.
        let spk_1 = spk_at_index(&tx_graph.index, 1);
        for i in 1..=CONFLICTING_TX_COUNT {
            let tx = Transaction {
                input: vec![TxIn {
                    previous_output,
                    ..Default::default()
                }],
                output: vec![TxOut {
                    value: Amount::ONE_BTC - Amount::from_sat(i as u64 * 10),
                    script_pubkey: spk_1.clone(),
                }],
                ..new_tx(i)
            };
            let update = TxUpdate {
                txs: vec![Arc::new(tx)],
                ..Default::default()
            };
            let _ = tx_graph.apply_update_at(update, Some(i as u64));
        }
    }));
    c.bench_function("many_conflicting_unconfirmed::list_canonical_txs", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_list_canonical_txs(&tx_graph, &chain, 2))
    });
    c.bench_function("many_conflicting_unconfirmed::filter_chain_txouts", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_txouts(&tx_graph, &chain, 2))
    });
    c.bench_function("many_conflicting_unconfirmed::filter_chain_unspents", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_unspents(&tx_graph, &chain, 1))
    });
}

pub fn many_chained_unconfirmed(c: &mut Criterion) {
    const TX_CHAIN_COUNT: u32 = 2100;
    let (tx_graph, chain) = black_box(setup(|tx_graph, _chain| {
        let mut previous_output = add_ancestor_tx(tx_graph, tip_block_id(), 0);
        // Create a chain of unconfirmed txs where each subsequent tx spends the output of the
        // previous one.
        for i in 0..TX_CHAIN_COUNT {
            // Create tx.
            let tx = Transaction {
                input: vec![TxIn {
                    previous_output,
                    ..Default::default()
                }],
                ..new_tx(i)
            };
            let txid = tx.compute_txid();
            let update = TxUpdate {
                txs: vec![Arc::new(tx)],
                ..Default::default()
            };
            let _ = tx_graph.apply_update_at(update, Some(i as u64));
            // Store the next prevout.
            previous_output = OutPoint::new(txid, 0);
        }
    }));
    c.bench_function("many_chained_unconfirmed::list_canonical_txs", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_list_canonical_txs(&tx_graph, &chain, 2101))
    });
    c.bench_function("many_chained_unconfirmed::filter_chain_txouts", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_txouts(&tx_graph, &chain, 1))
    });
    c.bench_function("many_chained_unconfirmed::filter_chain_unspents", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_unspents(&tx_graph, &chain, 0))
    });
}

pub fn nested_conflicts(c: &mut Criterion) {
    const CONFLICTS_PER_OUTPUT: usize = 3;
    const GRAPH_DEPTH: usize = 7;
    let (tx_graph, chain) = black_box(setup(|tx_graph, _chain| {
        let mut prev_ops = core::iter::once(add_ancestor_tx(tx_graph, tip_block_id(), 0))
            .collect::<Vec<OutPoint>>();
        for depth in 1..GRAPH_DEPTH {
            for previous_output in core::mem::take(&mut prev_ops) {
                for conflict_i in 1..=CONFLICTS_PER_OUTPUT {
                    let mut last_seen = depth * conflict_i;
                    if last_seen % 2 == 0 {
                        last_seen /= 2;
                    }
                    let ((_, script_pubkey), _) = tx_graph.index.next_unused_spk(()).unwrap();
                    let value =
                        Amount::ONE_BTC - Amount::from_sat(depth as u64 * 200 - conflict_i as u64);
                    let tx = Transaction {
                        input: vec![TxIn {
                            previous_output,
                            ..Default::default()
                        }],
                        output: vec![TxOut {
                            value,
                            script_pubkey,
                        }],
                        ..new_tx(conflict_i as _)
                    };
                    let txid = tx.compute_txid();
                    prev_ops.push(OutPoint::new(txid, 0));
                    let _ = tx_graph.insert_seen_at(txid, last_seen as _);
                    let _ = tx_graph.insert_tx(tx);
                }
            }
        }
    }));
    c.bench_function("nested_conflicts_unconfirmed::list_canonical_txs", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_list_canonical_txs(&tx_graph, &chain, GRAPH_DEPTH))
    });
    c.bench_function("nested_conflicts_unconfirmed::filter_chain_txouts", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_txouts(&tx_graph, &chain, GRAPH_DEPTH))
    });
    c.bench_function("nested_conflicts_unconfirmed::filter_chain_unspents", {
        let (tx_graph, chain) = (tx_graph.clone(), chain.clone());
        move |b| b.iter(|| run_filter_chain_unspents(&tx_graph, &chain, 1))
    });
}

criterion_group!(
    benches,
    many_conflicting_unconfirmed,
    many_chained_unconfirmed,
    nested_conflicts,
);
criterion_main!(benches);