Peer storage feature #2943
base: main
Conversation
lightning/src/ln/channelmanager.rs
Outdated
if let Some(funding_txo) = chan.funding_txo {
	found_funded_chan = true;
	let peer_storage_update = ChannelMonitorUpdate {
		update_id: CLOSED_CHANNEL_UPDATE_ID,
Let's use a real update_id fetched (and incremented) from the channel.
Can't see any helper in ChannelContext to fetch_and_increment_update_id(); should I create one?
lightning/src/ln/channelmanager.rs
Outdated
let mut found_funded_chan = false;
for chan in &sorted_chan_info {
	if let Some(funding_txo) = chan.funding_txo {
Let's also require that the channel has confirmations (or is 0conf).
I am filtering on ChannelPhase::Funded; that'd cover this, I think?
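For context, a standalone sketch of the filtering being discussed; the enum below is a simplified stand-in for LDK's ChannelPhase, and treating `minimum_depth == 0` as 0conf is an assumption for illustration, not the real API:

	// Simplified stand-in for LDK's ChannelPhase; `minimum_depth == 0` models a 0conf channel.
	enum ChannelPhase {
		UnfundedOutbound,
		UnfundedInbound,
		Funded { confirmations: u32, minimum_depth: u32 },
	}

	fn should_back_up(phase: &ChannelPhase) -> bool {
		match phase {
			// Only funded channels that are either 0conf or confirmed to their required depth.
			ChannelPhase::Funded { confirmations, minimum_depth } =>
				*minimum_depth == 0 || confirmations >= minimum_depth,
			_ => false,
		}
	}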
Thanks for the review @TheBlueMatt, I was AFK last week. I will work on these changes this week.
Hey, sorry I missed your review request. I generally don't see them as GitHub doesn't generate email notifications for them.
lightning/src/ln/channelmanager.rs
Outdated
let mut peer_storage = VecWriter(Vec::new());
self.our_peer_storage.read().unwrap().write(&mut peer_storage).unwrap();
let mut encrypted_blob = vec![0; peer_storage.0.len() + 16];
self.inbound_payment_key.encrypt_our_peer_storage(&mut encrypted_blob, 0u64, b"", &peer_storage.0[..]);
We should define a new key for this; the inbound_payment_key may be shared across multiple nodes.
Okay, I defined a new method, get_peer_storage_key, on NodeSigner. It derives a key from node_secret through the KeysManager.
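For illustration, here is a standalone sketch of the kind of derivation described (HKDF over the node secret with a fixed info string). It uses the hkdf and sha2 crates, and the salt, info string, and function shape are assumptions rather than LDK's actual implementation:

	use hkdf::Hkdf;
	use sha2::Sha256;

	// Derive a dedicated 32-byte peer-storage encryption key from the node secret.
	fn get_peer_storage_key(node_secret: &[u8; 32]) -> [u8; 32] {
		let hk = Hkdf::<Sha256>::new(Some(b"peer_storage"), node_secret);
		let mut key = [0u8; 32];
		hk.expand(b"peer storage encryption key", &mut key)
			.expect("32 bytes is a valid HKDF-SHA256 output length");
		key
	}

Deriving a separate key this way keeps the backup encryption independent of the inbound_payment_key, which is the concern raised above.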
lightning/src/ln/channelmanager.rs
Outdated
pub fn get_encrypted_our_peer_storage(&self) -> Vec<u8> {
	let mut peer_storage = VecWriter(Vec::new());
	self.our_peer_storage.read().unwrap().write(&mut peer_storage).unwrap();
	let mut encrypted_blob = vec![0; peer_storage.0.len() + 16];
nit: We shouldn't need a new vec here; we should be able to encrypt in place.
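For illustration, a minimal sketch of what encrypting the serialized blob in place could look like, using the chacha20poly1305 crate; LDK ships its own ChaCha20-Poly1305 implementation, and the key/nonce handling below is a simplified assumption:

	use chacha20poly1305::aead::{AeadInPlace, KeyInit};
	use chacha20poly1305::{ChaCha20Poly1305, Key, Nonce};

	fn encrypt_peer_storage_in_place(key: &[u8; 32], mut blob: Vec<u8>) -> Vec<u8> {
		let cipher = ChaCha20Poly1305::new(Key::from_slice(key));
		// A real implementation must use a unique nonce per encryption.
		let nonce_bytes = [0u8; 12];
		let nonce = Nonce::from_slice(&nonce_bytes);
		// encrypt_in_place encrypts the Vec's contents and appends the 16-byte tag,
		// so no second buffer is allocated.
		cipher.encrypt_in_place(nonce, b"", &mut blob).expect("Vec<u8> can always grow");
		blob
	}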
lightning/src/ln/channelmanager.rs
Outdated
@@ -2095,6 +2160,7 @@ where
	entropy_source: ES,
	node_signer: NS,
	signer_provider: SP,
	our_peer_storage: FairRwLock<OurPeerStorage>,
I don't see where this is updated at runtime? Presumably we need to be able to update each channel when it receives a new RAA.
Now this gets updated every time we send a LatestCounterpartyCommitmentTXInfo.
		best_block,
		counterparty_node_id: Some(counterparty_node_id),
		initial_counterparty_commitment_info: None,
		balances_empty_height: None,
	})
}

pub(crate) fn new_stub(secp_ctx: Secp256k1<secp256k1::All>, stub_channel: &StubChannel, best_block: BestBlock, keys: Signer, funding_info_scriptbuf: ScriptBuf) -> ChannelMonitor<Signer> {
I kinda wonder if we shouldn't move StubChannel to the chain module and refactor ChannelMonitor methods to operate on either a full ChannelMonitor or a StubChannel? It's at least something to explore as we go to test this logic and see if it's simpler.
But we'd need the chain module to detect the funding being spent and to sweep funds; maybe create new methods inside the chain module to handle StubChannel? watch_stub?
I find myself wondering what the right UX here is - we generally assume that users will not lose ChannelMonitor data, no matter what, and this seems mostly targeted at addressing the case where a user somehow loses a single (or a few) ChannelMonitors for closed channels (if the channels are open, we'll still fail to load).

I think really this feature is useful in one of two cases -

(a) we can use it to detect that we're actually running with stale state by trusting any one of our peers to tell us that, rather than trusting the peer with whom we have a stale channel to tell us our channel is stale. This requires much less data (just the latest commitment numbers for each channel) and is a quick and cheap check in ChannelManager which we can respond to by simply panicking.

(b) Handling the case where we've lost all our (latest) data and we want to recover from just our cryptographic keys (and maybe some stale ChannelMonitors). In this case, we really don't want to start with a ChannelManager at all; we want to have some new RecoveryPeerConnector struct that handles peer connections and processes the backup data we're adding here.

I think both are relatively easy changes to this PR, though.
lightning/src/chain/chainmonitor.rs
Outdated
let entry = match stub_monitors.entry(funding_outpoint) {
	hash_map::Entry::Occupied(_) => {
		log_error!(logger, "Failed to add new channel data: channel monitor for given outpoint is already present");
		return Err(());
We should see if this stub monitor has any new information that the previous one did not, and update it, no?
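As a standalone illustration of that "update instead of reject" behaviour (the types are simplified stand-ins, and it assumes LDK's convention that a lower minimum-seen secret index means more counterparty revocation secrets are known):

	use std::collections::{hash_map, HashMap};

	type OutPoint = ([u8; 32], u16);

	struct StubMonitor {
		// Lowest counterparty commitment-secret index seen; lower means newer data.
		min_seen_secret: u64,
	}

	fn add_or_update_stub(stubs: &mut HashMap<OutPoint, StubMonitor>, outpoint: OutPoint, new_stub: StubMonitor) {
		match stubs.entry(outpoint) {
			hash_map::Entry::Occupied(mut entry) => {
				// Instead of rejecting the duplicate, keep whichever copy has seen more secrets.
				if new_stub.min_seen_secret < entry.get().min_seen_secret {
					entry.insert(new_stub);
				}
			},
			hash_map::Entry::Vacant(entry) => { entry.insert(new_stub); },
		}
	}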
Thanks for pointing this out. Fixed it.
Self::from_impl(ChannelMonitorImpl {
	latest_update_id: STUB_CHANNEL_UPDATE_IDENTIFIER,
	commitment_transaction_number_obscure_factor: stub_channel.obscure_factor,
	destination_script: ScriptBuf::new(),
We absolutely need a destination_script to know where to send money.
	funding_redeemscript: ScriptBuf::new(),
	channel_value_satoshis: stub_channel.channel_value_stoshis,
	their_cur_per_commitment_points: stub_channel.their_cur_per_commitment_points,
	on_holder_tx_csv: 1,
We need this to know when we can claim funds (and I think also build witnesses)?
lightning/src/ln/channelmanager.rs
Outdated
@@ -2232,7 +2237,8 @@ where
	entropy_source: ES,
	node_signer: NS,
	signer_provider: SP,
	our_peer_storage: FairRwLock<OurPeerStorage>,
	peer_storage: Mutex<HashMap<PublicKey, Vec<u8>>>,
Shouldn't this be in a PeerHolder?
We can shift it to PeerState, but I think handling it here is simpler?
Putting it here is another thing for us to have to clean up when we no longer have channels. Putting it with the rest of the peer state consolidates logic and simplifies things.
lightning/src/ln/channelmanager.rs
Outdated
@@ -7432,6 +7449,77 @@ where
	}
}

fn internal_peer_storage(&self, counterparty_node_id: &PublicKey, msg: &msgs::PeerStorageMessage) {
	let per_peer_state = self.per_peer_state.write().unwrap();
Why does this need a write lock?
My bad, thanks for pointing this out :)
lightning/src/ln/channelmanager.rs
Outdated
let mut res = vec![0; msg.data.len() - 16];

match our_peer_storage.decrypt_our_peer_storage(&mut res, msg.data.as_slice(), self.our_peerstorage_encryption_key) {
Hmm, there's gotta be a more efficient way to handle this message. We're expected to receive it every time we connect, and peers may well send it more often. The ChainMonitor can be smarter here, which may be the best way to go about it - pass the whole blob over the fence and let it decode them/check if it has corresponding ChannelMonitors for everything before deriving the new keys, etc.
Yes, that's better I think. So I'll write a function which processes the blob and only creates the keys if we need to panic and persist the channel.
Starting to think instead of doing this we should just compare the decrypted contents with the state of our live channels. Sure, it's not entirely robust, but it should be pretty good (and, notably, means at least our current channels are safe to update), but it also avoids changing the interface with ChainMonitor, which would be really nice.
lightning/src/ln/channelmanager.rs
Outdated
@@ -9978,6 +10108,16 @@ where
	let _ = handle_error!(self, self.internal_funding_signed(counterparty_node_id, msg), *counterparty_node_id);
}

fn handle_peer_storage(&self, counterparty_node_id: &PublicKey, msg: &msgs::PeerStorageMessage) {
	let _persistence_guard = PersistenceNotifierGuard::notify_on_drop(self);
Suggested change:
-	let _persistence_guard = PersistenceNotifierGuard::notify_on_drop(self);
+	// There's no reason to cause a full persistence just because a peer updated their storage, and it will
+	// never create new events, so just skip all notification.
+	let _persistence_guard = PersistenceNotifierGuard::optionally_notify(self, || NotifyOption::SkipPersistNoEvents);
lightning/src/ln/channelmanager.rs
Outdated
let pending_msg_events = &mut peer_state.pending_msg_events;
let peer_storage = self.peer_storage.lock().unwrap().get(counterparty_node_id).unwrap_or(&Vec::<u8>::new()).clone();

if peer_storage.len() > 0 {
I don't see why we should only send if the length is > 0; shouldn't we send if there's any value in the map at all?
Fixed this. Thanks :)
lightning/src/ln/channelmanager.rs
Outdated
(peer_storage.len() as u64).write(writer)?;
for (node_id, peer_data) in peer_storage.iter() {
	node_id.write(writer)?;
	peer_data.write(writer)?;
This changes the encoding of ChannelManager such that we cannot upgrade/downgrade. Instead, new data has to be written as TLVs.
lightning/src/ln/channelmanager.rs
Outdated
channel.context.get_commitment_txn_number_obscure_factor(),
new_hash_map(),
None,
channel.context.channel_transaction_parameters.channel_type_features.clone(),
Hum, why are we creating a stub channel for all of our existing channels on each startup? They should already exist as full ChannelMonitors, so this is redundant, right?
This is used to build our own latest backup which would be distributed to our peers.
lightning/src/ln/channelmanager.rs
Outdated
/// - [`ChannelMessageHandler`] to handle off-chain channel activity from peers
/// - [`MessageSendEventsProvider`] to similarly send such messages to peers
///
pub struct FundRecoverer<SP: Deref, L: Deref, M: Deref>
Let's put this in a new module/file.
lightning/src/chain/mod.rs
Outdated
@@ -305,6 +305,55 @@ pub trait Watch<ChannelSigner: EcdsaChannelSigner> {
	/// For details on asynchronous [`ChannelMonitor`] updating and returning
	/// [`MonitorEvent::Completed`] here, see [`ChannelMonitorUpdateStatus::InProgress`].
	fn release_pending_monitor_events(&self) -> Vec<(OutPoint, ChannelId, Vec<MonitorEvent>, Option<PublicKey>)>;

	/// Watches a dummy channel identified by `funding_txo` using `monitor`.
Now that we're using a different struct from ChannelManager, IMO we should use a different persistence trait.
lightning/src/ln/channelmanager.rs
Outdated
	SP::Target: SignerProvider,
	L::Target: Logger
{
	fn handle_open_channel(&self, _their_node_id: &PublicKey, _msg: &msgs::OpenChannel) {}
We should reject open channels (see how IgnoringMessageHandler deals with different messages and ~copy that).
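A standalone sketch of what rejecting an open_channel could look like in the recovery handler; the message and event types below are simplified stand-ins for LDK's, not its real API:

	struct ErrorMessage { channel_id: [u8; 32], data: String }
	enum MessageSendEvent { HandleError { node_id: [u8; 33], msg: ErrorMessage } }

	fn reject_open_channel(pending_events: &mut Vec<MessageSendEvent>, their_node_id: [u8; 33], temporary_channel_id: [u8; 32]) {
		// Queue an error back to the peer so it abandons the channel rather than waiting on us.
		pending_events.push(MessageSendEvent::HandleError {
			node_id: their_node_id,
			msg: ErrorMessage {
				channel_id: temporary_channel_id,
				data: "This node is running in recovery mode and does not accept new channels".to_owned(),
			},
		});
	}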
lightning/src/ln/channelmanager.rs
Outdated
@@ -8272,9 +8565,15 @@ where
	} else { false };
	let (htlcs_to_fail, monitor_update_opt) = try_chan_phase_entry!(self,
		chan.revoke_and_ack(&msg, &self.fee_estimator, &&logger, mon_update_blocked), chan_phase_entry);

	let mut our_peer_storage = self.our_peer_storage.write().unwrap();
	let _ = our_peer_storage.provide_secret(chan.context.channel_id(), chan.get_cur_counterparty_commitment_transaction_number() + 1, msg.per_commitment_secret);
This (and all the other update methods) is actually too early. We allow ChannelMonitorUpdates to be persisted asynchronously, which means we could (a) get a new ChannelMonitorUpdate, updating our peer state, (b) send that new peer state to our peers, (c) restart without finishing persisting the ChannelMonitorUpdate (or ChannelManager), then (d) on restart we're in a perfectly fine state (we haven't sent the new channel messages to our peer, only the new state) but we'll still think we're stale and panic.

Instead, we need to process changes to the channel(s) when we complete ChannelMonitorUpdates (removing them from the pending set in Channel via monitor_updating_restored in channelmanager.rs's handle_monitor_update_completion).
}

log_trace!(logger, "Received Peer Storage from {}", log_pubkey!(counterparty_node_id));
peer_state.peer_storage = msg.data.clone();
I don't think we actually should be willing to store a full 64KiB per peer. Sadly, LDK resilvers ChannelManager very often (something we're working on reducing, but we're a ways from being able to change it), and currently stores only ~1KiB per channel. If we allow all our peers to store 64KiB, we may ~50x our ChannelManager size, causing a very substantial increase in I/O. For now, we can just limit the max size we accept (maybe 1KiB?) and ignore larger messages, but in a followup we can look at some other I/O interface for storing these in a separate K-V entry.
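A minimal sketch of that cap; the constant, names, and the 1 KiB figure are just the placeholders floated above, not a decided interface:

	/// Placeholder limit; the review suggests something around 1 KiB.
	const MAX_PEER_STORAGE_BYTES: usize = 1024;

	fn accept_peer_storage(stored: &mut Vec<u8>, incoming: &[u8]) -> bool {
		if incoming.len() > MAX_PEER_STORAGE_BYTES {
			// Ignore oversized blobs so ChannelManager persistence can't balloon by up to
			// 64 KiB per peer.
			return false;
		}
		*stored = incoming.to_vec();
		true
	}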
/// This update ID is used inside [`ChannelMonitorImpl`] to recognise
/// that we're dealing with a [`StubChannelMonitor`]. Since we require some
/// exceptions while dealing with it.
pub const STUB_CHANNEL_UPDATE_IDENTIFIER: u64 = CLOSED_CHANNEL_UPDATE_ID - 1;
Let's set a flag separately. I'm working on removing CLOSED_CHANNEL_UPDATE_ID entirely.
/// This includes timestamp to compare between two given
/// [`OurPeerStorage`] and version defines the structure.
#[derive(Clone, PartialEq, Eq)]
pub struct OurPeerStorage {
Let's move this into its own file.
/// [`ChannelMonitorUpdateStep::LatestCounterpartyCommitmentTXInfo`]
pub(crate) fn update_state_from_monitor_update(&mut self, cid: ChannelId, monitor_update: ChannelMonitorUpdate) -> Result<(), ()> {
	for update in monitor_update.updates.iter() {
		match update {
Should this logic move into StubChannel instead?
	pub(crate) features: ChannelTypeFeatures,
}

impl StubChannel {
To better match LDK terminology, should this be a StubChannelMonitor rather than a StubChannel?
let persist_res = self
	.persister
	.persist_new_channel(stub_channel_monitor.get_funding_txo().0, &stub_channel_monitor);
This is gonna fail if we already have a monitor for this channel, but we should be trying to update that monitor. We may have a somewhat random mix of real (possibly stale) monitors and peer-storage (possibly stale) stubs, and we ideally need to merge all the data we have to be able to recover in the best way possible.
lightning/src/ln/fundrecoverer.rs
Outdated
	funding_txid_u32.wrapping_add(best_height.unwrap_or_default())
};

let partition_factor = if channel_count < 15 {
Don't think we need to bother with this optimization in the recovery stuff.
lightning/src/ln/fundrecoverer.rs
Outdated
let funding_outpoints = hash_set_from_iter(self.monitors.read().unwrap().keys().cloned());
let channel_count = funding_outpoints.len();
for funding_outpoint in funding_outpoints.iter() {
	let monitor_lock = self.monitors.read().unwrap();
It'd be nice if a lot of the code in this file were DRYd with the code elsewhere.
}

let mut monitors = self.monitors.write().unwrap();
let entry = match monitors.entry(stub_channel_monitor.get_funding_txo().0) {
Somehow we need to replay blockchain data after we add a new monitor here. The flow needs to be (a) load monitor(s) from peer state, (b) replay any chain parts we need on the new monitor (c) persist it.
};

let persist_res = self
	.persister
Once we persist a stub monitor here, the ChannelManager should fail to load. This should be pretty doable - just have the ChannelManager check for the stub flag on any ChannelMonitors in read and refuse to load if any are set.
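A hypothetical sketch of that load-time guard; `from_peer_storage` stands in for whatever flag ends up marking a stub monitor, and the types are simplified stand-ins rather than the real ChannelManager::read arguments:

	struct LoadedMonitor { from_peer_storage: bool, channel_id: [u8; 32] }

	fn refuse_to_load_with_stubs(monitors: &[LoadedMonitor]) -> Result<(), String> {
		for mon in monitors {
			if mon.from_peer_storage {
				// A stub rebuilt from peer storage means we lost the real monitor; the node
				// should go through the recovery flow instead of resuming normal operation.
				return Err(format!("Monitor for channel {:?} was rebuilt from peer storage", mon.channel_id));
			}
		}
		Ok(())
	}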
Okay, only one major question and a handful of smaller comments. I think this is structurally correct aside from the one major question, so it may be worth starting to clean this up a bit - new files need to be rustfmt'd (e.g. by running contrib/run-rustfmt.sh), all files should be tab-indented, git commits should follow a logical progression and not change things introduced in earlier commits, and they should have details about why we're doing what we're doing (see https://cbea.ms/git-commit/).
/// This update ID is used inside [`ChannelMonitorImpl`] to recognise
/// that we're dealing with a [`StubChannelMonitor`]. Since we require some
/// exceptions while dealing with it.
pub const STUB_CHANNEL_UPDATE_IDENTIFIER: u64 = core::u64::MAX - 1;
Why not just use CLOSED_CHANNEL_UPDATE_ID? We already have handling for treating that as "final", which it makes sense to reuse wholly.
@@ -4253,6 +4393,9 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
	if *idx == input.previous_output.vout {
		#[cfg(test)]
		{
			if self.latest_update_id == STUB_CHANNEL_UPDATE_IDENTIFIER {
This looks like it's working around a bug, but only in tests? Probably should delete it and fix the bug in prod too :)
lightning/src/events/mod.rs
Outdated
@@ -2298,6 +2298,13 @@ impl MaybeReadable for Event {
	}
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum RecoveryEvent {
This needs docs :)
lightning/src/events/mod.rs
Outdated
@@ -2298,6 +2298,13 @@ impl MaybeReadable for Event {
	}
}

#[derive(Clone, Debug, PartialEq, Eq)]
This can probably just move to fundrecoverer.rs - it's only used there, so there's no reason for it to be in the global events module.
lightning/src/events/mod.rs
Outdated
@@ -2630,6 +2654,10 @@ pub trait EventsProvider {
	fn process_pending_events<H: Deref>(&self, handler: H) where H::Target: EventHandler;
}

pub trait RecoverEventsProvider: EventsProvider {
This is only (ever going to be) implemented by one struct, so I'm not sure it's worth having a trait. We could have a trait if we use it to force the user to implement both RecoveryEvent and Event handling in one struct (to ensure they don't forget to do both), but then we'd have to un-implement EventsProvider on FundRecoverer. Not really sure it's worth it; users have to not forget to fetch events...
lightning/src/ln/channelmanager.rs
Outdated
Some(mutex) => mutex,
None => {
	log_debug!(logger, "Not able to find peer_state for the counterparty {}, channelId {}", log_pubkey!(ps_channel.counterparty_node_id), ps_channel.channel_id);
	panic!("Found a missing channel {} through peer storage", ps_channel.channel_id);
It's not a major error worth panicking over; it may just indicate that the peer gave us an old state of ours. Same goes for the copy below.
lightning/src/ln/fundrecoverer.rs
Outdated
use core::cell::RefCell;
use core::ops::Deref;

// Re-export this for use in the public API.
Why?
fn handle_channel_reestablish(
	&self, _their_node_id: PublicKey, _msg: &msgs::ChannelReestablish,
) {
This needs to respond with a ChannelReestablish with the dummy values from above to ask the peer to FC.
lightning/src/ln/fundrecoverer.rs
Outdated
);

let per_peer_state = self.per_peer_state.read().unwrap();
if let Some(peer_state_mutex) = per_peer_state.get(&counterparty_node_id) {
Shouldn't this go in the peer_connected handler?
lightning/src/ln/fundrecoverer.rs
Outdated
if holder.monitor.get_min_seen_secret()
	> stub_channel_monitor.get_min_seen_secret()
{
	holder.monitor.merge_commitment_secret(stub_channel_monitor);
Just call this on all monitors. It may be that the ChannelMonitor we do have is stale (but is not a stub) and we have more revocation secrets here than we used to.
Introduce the 'ProvidePeerBackupStorage' feature to enable nodes to distribute and store peer storage backups for channel partners. This functionality enhances resilience by ensuring critical peer data is backed up and can be retrieved if needed.
- Add 'ProvidePeerBackupStorage' to the 'InitContext' & 'NodeContext'.
- Set the feature bit for this feature inside 'provide_init_features()'.

Introduce message types and handlers to enable the exchange of peer storage data between nodes. PeerStorageMessage is used to send encrypted backups to peers; YourPeerStorageMessage is used to return a peer's stored data upon reconnection.
- Define two new message types: PeerStorageMessage and YourPeerStorageMessage.
- Implement handlers for these messages in ChannelMessageHandler.
- Add SendPeerStorageMessage and SendYourPeerStorageMessage to MessageSendEvent.

This commit introduces the handling and persistence of PeerStorage messages on a per-peer basis. The peer storage is stored within the PeerState to simplify management, ensuring we do not need to remove it when there are no active channels with the peer. Key changes include:
- Add PeerStorage to PeerState for persistent storage.
- Implement internal_peer_storage to manage PeerStorage and its updates.
- Add resend logic in peer_connected() to resend PeerStorage before sending the channel reestablish message upon reconnection.
- Update PeerState's write() and read() methods to support PeerStorage persistence.

Add a get_peer_storage_key method to derive a 32-byte encryption key for securing Peer Storage. This method utilizes HKDF with the node's secret key as input and a fixed info string to generate the encryption key.
- Add 'get_peer_storage_key' to NodeSigner.
- Implement 'get_peer_storage_key' for KeysManager & PhantomKeysManager.

Introduce the OurPeerStorage struct to manage serialized channel data for peer storage backups. This struct facilitates the distribution of peer storage to channel partners and includes versioning and timestamping for comparison between retrieved peer storage instances.
- Add the OurPeerStorage struct with fields for version, timestamp, and serialized channel data (ser_channels).
- Implement methods to encrypt and decrypt peer storage securely.
- Add functionality to update channel data within OurPeerStorage.

To enable the ChainMonitor to send peer storage to channel partners whenever a new block is added, a new message handler, `SendingOnlyMessageHandler`, has been introduced. This allows the `ChainMonitor` to handle the peer storage distribution. Key changes:
- Introduce the SendingOnlyMessageHandler trait to facilitate peer storage distribution.
- Add SendingOnlyMessageHandler into the MessageHandler.
- Implement SendingOnlyMessageHandler for ChainMonitor and IgnoringMessageHandler.
- Process SendingOnlyMessageHandler events inside process_events().

Every time a new block is added, we send PeerStorage to all of our channel partners.
- Add our_peer_storage and our_peerstorage_encryption_key to ChainMonitor.
- Write send_peer_storage() and send it to all channel partners whenever a new block is added.
- Write a utility function for ChannelMonitor::write() to give an option to write ChannelMonitor for peer storage.
- Append some channel information to the serialized ChannelMonitor so that we can build a ChannelSigner when we call handle_your_peer_storage().

Ensure ChannelManager properly handles cases where a stale or missing ChannelMonitor is detected.
- Write internal_your_peer_storage to verify if a ChannelMonitor is stale or missing.
- Update ChannelManager::read to panic if a stale ChannelMonitor is encountered during loading.
- Add STUB_CHANNEL_UPDATE_IDENTIFIER to identify stub channel monitors constructed from peer storage.
- Make some exceptions for StubChannelMonitors.

Introduce FundRecoverer, a utility to recover funds from PeerStorage while running the node in offline mode.
- Define FundRecoverer and implement MessageSendEventsProvider, chain::Confirm, ChannelMessageHandler, and events::EventsProvider.
- Write handlers for your_peer_storage, peer_connected, and channel_reestablish.
- Implement functions to process new blocks defined in chain::Confirm.
- Define RecoveryEvent, which is used to notify upstream to rescan the blockchain on the go.
- Write watch_dummy() to persist the StubChannelMonitor and load outputs to watch.
- Define ReadUtilOpt for defining parameters to be passed to ChannelMonitor::read_util in the case of stub ChannelMonitors and otherwise.
- Write a utility for ChannelMonitor::read() to read ChannelMonitors from PeerStorage.

Introduce test_peer_storage to verify fund recovery through PeerStorage. This test simulates a scenario where two nodes establish a channel, one node is formatted, and funds are recovered using FundRecoverer.
- Add test_peer_storage in functional_tests.rs to validate fund recovery via PeerStorage.
- Implement clear_watched_txn_and_output to reset watched_outputs and watched_txn for recovery testing.
- Add handlers for PeerStorage and YourPeerStorage in get_chan_reestablish_msgs!(), because we exchange these messages just before exchanging ChannelReestablish.

Introduce test_peer_storage_on_revoked_txn to validate fund recovery in the event of a revoked transaction. This test establishes a channel between two nodes, simulates one node attempting to cheat by broadcasting a revoked state on-chain, and demonstrates recovering funds using PeerStorage by generating and broadcasting a penalty transaction.
- Add test_peer_storage_on_revoked_txn to test the recovery of funds via a penalty transaction created using data from PeerStorage.
This would enable nodes to distribute small blobs among their peers, which could be retrieved to resume channel operation or force-close channels after data corruption.
Elaborated in detail here
It's far from complete right now, but please let me know if I am making any mistakes because I am very new to both Rust and LDK :)