Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

authority-discovery: Populate DHT records with public listen addresses #6298

Merged
merged 28 commits into from
Nov 5, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
e42d31d
cli: Revert warning
lexnv Oct 30, 2024
fd68062
authority-discovery: Obtain global addresses from listen addresses
lexnv Oct 30, 2024
36a4a1e
authority-discovery: Chain an deduplicate addresses
lexnv Oct 30, 2024
f1c7ed6
authority-discovery: Refactor `address_without_p2p` into a dedicated fn
lexnv Oct 30, 2024
5ca1f8b
authority-discovery: Cap max addresses to publish
lexnv Oct 30, 2024
0b091ef
authority-discovery: Bump max cached addresses to 16
lexnv Oct 30, 2024
6bbfce0
Add prdoc
lexnv Oct 30, 2024
58a8432
authority-discovery: Warn in authority discovery about non public
lexnv Nov 1, 2024
9c0d40d
tests: Adress testing
lexnv Nov 1, 2024
0bb4526
authority-disovery: Replace validator with authority
lexnv Nov 1, 2024
9b34c65
prdoc: Add more details
lexnv Nov 1, 2024
f29ebcb
Merge remote-tracking branch 'origin/master' into lexnv/warn-dhtfix
lexnv Nov 1, 2024
1ce39e8
Update substrate/client/authority-discovery/src/worker.rs
lexnv Nov 1, 2024
d95fd51
Merge remote-tracking branch 'origin/master' into lexnv/warn-dhtfix
lexnv Nov 1, 2024
e6f329d
Fix clippy
lexnv Nov 1, 2024
c1c7739
Merge branch 'master' into lexnv/warn-dhtfix
lexnv Nov 4, 2024
7fca9ee
Update substrate/client/authority-discovery/src/worker.rs
lexnv Nov 4, 2024
cedf8fa
Update substrate/client/authority-discovery/src/worker.rs
lexnv Nov 4, 2024
7f6efce
Update substrate/client/authority-discovery/src/worker.rs
lexnv Nov 4, 2024
094b5f5
authority-discovery: Fix build
lexnv Nov 4, 2024
53a07d4
authority-discovery: Add docs for warn_public_addresses
lexnv Nov 4, 2024
9fb02e0
authority-discovery: Add AddressType wrapper for better logs
lexnv Nov 4, 2024
9fed62e
authority-discovery: Log error for all types on mismatched peerId
lexnv Nov 4, 2024
0a4e74c
Merge branch 'master' into lexnv/warn-dhtfix
lexnv Nov 4, 2024
eab484c
prdoc: Try to remove sc-cli for semvercheck
lexnv Nov 4, 2024
49f6d2e
primitives: Revert panicHook import to fix CI for now
lexnv Nov 4, 2024
e408fee
Merge branch 'master' into lexnv/warn-dhtfix
lexnv Nov 4, 2024
31b3595
Revert "primitives: Revert panicHook import to fix CI for now"
lexnv Nov 4, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 25 additions & 0 deletions prdoc/pr_6298.prdoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
title: Populate authority DHT records with public listen addresses

doc:
- audience: [ Node Dev, Node Operator ]
description: |
This PR populates the authority DHT records with public listen addresses if any.
The change effectively ensures that addresses are added to the DHT record in the
following order:
1. Public addresses provided by CLI `--public-addresses`
2. Maximum of 4 public (global) listen addresses (if any)
3. Any external addresses discovered from the network (ie from `/identify` protocol)

While at it, this PR adds the following constraints on the number of addresses:
- Total number of addresses cached is bounded at 16 (increased from 10).
- A maximum number of 32 addresses are published to DHT records (previously unbounded).
- A maximum of 4 global listen addresses are utilized.

This PR replaces the following warning:
`WARNING: No public address specified, validator node may not be reachable.`
with a more descriptive one originated from the authority-discovery
mechanism itself: `No public addresses configured and no global listen addresses found`.

crates:
- name: sc-authority-discovery
bump: patch
176 changes: 125 additions & 51 deletions substrate/client/authority-discovery/src/worker.rs
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,13 @@ pub mod tests;
const LOG_TARGET: &str = "sub-authority-discovery";

/// Maximum number of addresses cached per authority. Additional addresses are discarded.
const MAX_ADDRESSES_PER_AUTHORITY: usize = 10;
const MAX_ADDRESSES_PER_AUTHORITY: usize = 16;

/// Maximum number of global listen addresses published by the node.
const MAX_GLOBAL_LISTEN_ADDRESSES: usize = 4;

/// Maximum number of addresses to publish in a single record.
const MAX_ADDRESSES_TO_PUBLISH: usize = 32;

/// Maximum number of in-flight DHT lookups at any given point in time.
const MAX_IN_FLIGHT_LOOKUPS: usize = 8;
Expand Down Expand Up @@ -174,6 +180,9 @@ pub struct Worker<Client, Block: BlockT, DhtEventStream> {

metrics: Option<Metrics>,

/// Flag to ensure the warning about missing public addresses is only printed once.
warn_public_addresses: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on what this does would be nice. I first thought this should be exposed, until I had read the code :D


role: Role,

phantom: PhantomData<Block>,
Expand Down Expand Up @@ -271,20 +280,7 @@ where
config
.public_addresses
.into_iter()
.map(|mut address| {
if let Some(multiaddr::Protocol::P2p(peer_id)) = address.iter().last() {
if peer_id != *local_peer_id.as_ref() {
error!(
target: LOG_TARGET,
"Discarding invalid local peer ID in public address {address}.",
);
}
// Always discard `/p2p/...` protocol for proper address comparison (local
// peer id will be added before publishing).
address.pop();
}
address
})
.map(|address| AddressType::PublicAddress(address).without_p2p(local_peer_id))
.collect()
};

Expand All @@ -309,6 +305,7 @@ where
addr_cache,
role,
metrics,
warn_public_addresses: false,
phantom: PhantomData,
last_known_records: HashMap::new(),
}
Expand Down Expand Up @@ -373,54 +370,92 @@ where
}
}

fn addresses_to_publish(&self) -> impl Iterator<Item = Multiaddr> {
fn addresses_to_publish(&mut self) -> impl Iterator<Item = Multiaddr> {
let local_peer_id = self.network.local_peer_id();
let publish_non_global_ips = self.publish_non_global_ips;

// Checks that the address is global.
let address_is_global = |address: &Multiaddr| {
address.iter().all(|protocol| match protocol {
// The `ip_network` library is used because its `is_global()` method is stable,
// while `is_global()` in the standard library currently isn't.
multiaddr::Protocol::Ip4(ip) => IpNetwork::from(ip).is_global(),
multiaddr::Protocol::Ip6(ip) => IpNetwork::from(ip).is_global(),
_ => true,
})
};

// These are the addresses the node is listening for incoming connections,
// as reported by installed protocols (tcp / websocket etc).
//
// We double check the address is global. In other words, we double check the node
// is not running behind a NAT.
// Note: we do this regardless of the `publish_non_global_ips` setting, since the
// node discovers many external addresses via the identify protocol.
let mut global_listen_addresses = self
.network
.listen_addresses()
.into_iter()
.filter_map(|address| {
address_is_global(&address)
.then(|| AddressType::GlobalListenAddress(address).without_p2p(local_peer_id))
})
.take(MAX_GLOBAL_LISTEN_ADDRESSES)
.peekable();

// Similar to listen addresses that takes into consideration `publish_non_global_ips`.
let mut external_addresses = self
.network
.external_addresses()
.into_iter()
.filter_map(|address| {
(publish_non_global_ips || address_is_global(&address))
.then(|| AddressType::ExternalAddress(address).without_p2p(local_peer_id))
})
.peekable();

let has_global_listen_addresses = global_listen_addresses.peek().is_some();
trace!(
target: LOG_TARGET,
"Node has public addresses: {}, global listen addresses: {}, external addresses: {}",
!self.public_addresses.is_empty(),
has_global_listen_addresses,
external_addresses.peek().is_some(),
);

let mut seen_addresses = HashSet::new();

let addresses = self
.public_addresses
.clone()
.into_iter()
.chain(self.network.external_addresses().into_iter().filter_map(|mut address| {
// Make sure the reported external address does not contain `/p2p/...` protocol.
if let Some(multiaddr::Protocol::P2p(peer_id)) = address.iter().last() {
if peer_id != *local_peer_id.as_ref() {
error!(
target: LOG_TARGET,
"Network returned external address '{address}' with peer id \
not matching the local peer id '{local_peer_id}'.",
);
debug_assert!(false);
}
address.pop();
}

if self.public_addresses.contains(&address) {
// Already added above.
None
} else {
Some(address)
}
}))
.filter(move |address| {
if publish_non_global_ips {
return true
}

address.iter().all(|protocol| match protocol {
// The `ip_network` library is used because its `is_global()` method is stable,
// while `is_global()` in the standard library currently isn't.
multiaddr::Protocol::Ip4(ip) if !IpNetwork::from(ip).is_global() => false,
multiaddr::Protocol::Ip6(ip) if !IpNetwork::from(ip).is_global() => false,
_ => true,
})
})
.chain(global_listen_addresses)
.chain(external_addresses)
// Deduplicate addresses.
.filter(|address| seen_addresses.insert(address.clone()))
.take(MAX_ADDRESSES_TO_PUBLISH)
.collect::<Vec<_>>();

if !addresses.is_empty() {
debug!(
target: LOG_TARGET,
"Publishing authority DHT record peer_id='{local_peer_id}' with addresses='{addresses:?}'",
);

if !self.warn_public_addresses &&
self.public_addresses.is_empty() &&
!has_global_listen_addresses
{
self.warn_public_addresses = true;

error!(
target: LOG_TARGET,
"No public addresses configured and no global listen addresses found. \
Authority DHT record may contain unreachable addresses. \
Consider setting `--public-addr` to the public IP address of this node. \
This will become a hard requirement in future versions for authorities."
);
}
}

// The address must include the local peer id.
Expand All @@ -437,7 +472,8 @@ where
let key_store = match &self.role {
Role::PublishAndDiscover(key_store) => key_store,
Role::Discover => return Ok(()),
};
}
.clone();

let addresses = serialize_addresses(self.addresses_to_publish());
if addresses.is_empty() {
Expand Down Expand Up @@ -946,6 +982,44 @@ where
}
}

/// Removes the `/p2p/..` from the address if it is present.
#[derive(Debug, Clone, PartialEq, Eq)]
enum AddressType {
/// The address is specified as a public address via the CLI.
PublicAddress(Multiaddr),
/// The address is a global listen address.
GlobalListenAddress(Multiaddr),
/// The address is discovered via the network (ie /identify protocol).
ExternalAddress(Multiaddr),
}

impl AddressType {
/// Removes the `/p2p/..` from the address if it is present.
///
/// In case the peer id in the address does not match the local peer id, an error is logged for
/// `ExternalAddress` and `GlobalListenAddress`.
fn without_p2p(self, local_peer_id: PeerId) -> Multiaddr {
// Get the address and the source str for logging.
let (mut address, source) = match self {
AddressType::PublicAddress(address) => (address, "public address"),
AddressType::GlobalListenAddress(address) => (address, "global listen address"),
AddressType::ExternalAddress(address) => (address, "external address"),
};

if let Some(multiaddr::Protocol::P2p(peer_id)) = address.iter().last() {
if peer_id != *local_peer_id.as_ref() {
error!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line for sure breaks this test:

alice: count of log lines containing "error" is 0 within 10 seconds

Does it need to be error? If so, I can adjust the condition in test to be 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it back to warning for now, sorry missed that 🙏

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is coming from the validator setup:

2024-11-05 20:07:15.454	
2024-11-05 18:07:15.454 ERROR tokio-runtime-worker sub-authority-discovery: No public addresses configured and no global listen addresses found. Authority DHT record may contain unreachable addresses. Consider setting `--public-addr` to the public IP address of this node. This will become a hard requirement in future versions for authorities.    

Will adjust the testing instead to accommodate this 🙏

target: LOG_TARGET,
"Network returned '{source}' '{address}' with peer id \
not matching the local peer id '{local_peer_id}'.",
);
}
address.pop();
}
address
}
}

/// NetworkProvider provides [`Worker`] with all necessary hooks into the
/// underlying Substrate networking. Using this trait abstraction instead of
/// `sc_network::NetworkService` directly is necessary to unit test [`Worker`].
Expand Down
4 changes: 2 additions & 2 deletions substrate/client/authority-discovery/src/worker/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1018,7 +1018,7 @@ fn addresses_to_publish_adds_p2p() {
));

let (_to_worker, from_service) = mpsc::channel(0);
let worker = Worker::new(
let mut worker = Worker::new(
from_service,
Arc::new(TestApi { authorities: vec![] }),
network.clone(),
Expand Down Expand Up @@ -1056,7 +1056,7 @@ fn addresses_to_publish_respects_existing_p2p_protocol() {
});

let (_to_worker, from_service) = mpsc::channel(0);
let worker = Worker::new(
let mut worker = Worker::new(
from_service,
Arc::new(TestApi { authorities: vec![] }),
network.clone(),
Expand Down
12 changes: 1 addition & 11 deletions substrate/client/cli/src/commands/run_cmd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -201,17 +201,7 @@ impl CliConfiguration for RunCmd {
}

fn network_params(&self) -> Option<&NetworkParams> {
let network_params = &self.network_params;
let is_authority = self.role(self.is_dev().ok()?).ok()?.is_authority();
if is_authority && network_params.public_addr.is_empty() {
eprintln!(
"WARNING: No public address specified, validator node may not be reachable.
Consider setting `--public-addr` to the public IP address of this node.
This will become a hard requirement in future versions."
);
Comment on lines -207 to -211
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I would keep this warning, because there is no guarantee we will have a global listen address nor that the external address will be discovered correctly (with libp2p address translation). Especially since there is a practice of "hiding" validators behind firewalls / NATs.

}

Some(network_params)
Some(&self.network_params)
}

fn keystore_params(&self) -> Option<&KeystoreParams> {
Expand Down
Loading