Commit

Permalink
Merge branch 'master' into public-keys-endpoint
courtneyeh committed Sep 26, 2023
2 parents 1219b2f + 20d9f55 commit 2574ef7
Showing 78 changed files with 10,435 additions and 14,551 deletions.
2 changes: 1 addition & 1 deletion docs/get-started/connect/testnet.md
Expand Up @@ -45,7 +45,7 @@ See the list of [Goerli faucets](https://github.com/eth-clients/goerli#meta-data

:::note

If you're unable to get ETH using the faucet, you can ask for help on the [EthStaker Discord](https://discord.io/ethstaker).
If you're unable to get ETH using the faucet, you can ask for help on the [EthStaker Discord](https://discord.gg/ethstaker).

:::

2 changes: 1 addition & 1 deletion docs/how-to/configure/tls.md
Expand Up @@ -77,5 +77,5 @@ In the command:
[Teku's password-protected PKCS12 or JKS keystore and password file]: ../../tutorials/configure-external-signer-tls.md#teku-keystore-and-password-file
[Web3Signer's password-protected PKCS12 or JKS truststore and password file]: ../../tutorials/configure-external-signer-tls.md#2-create-the-truststore-and-password-file
[Hyperledger Besu]: https://besu.hyperledger.org/stable/public-networks/get-started/install
[Slashing protection]: https://docs.web3signer.consensys.net/en/latest/Concepts/Slashing-Protection/
[Slashing protection]: https://docs.web3signer.consensys.net/en/latest/concepts/slashing-protection/
[configure your slashing protection database]: https://docs.web3signer.consensys.net/en/latest/HowTo/Configure-Slashing-Protection/
8 changes: 8 additions & 0 deletions docs/how-to/troubleshoot/_category_.json
@@ -0,0 +1,8 @@
{
"label": "Troubleshoot",
"position": 4,
"link": {
"type": "generated-index",
"slug": "/how-to/troubleshoot"
}
}
@@ -1,10 +1,9 @@
---
title: Troubleshoot
description: Solve common problems encountered with Teku.
sidebar_position: 14
---

# Troubleshoot
# General issues

## Out of memory error

Expand All @@ -16,7 +15,7 @@ To fix this, you can try [setting a maximum heap size].

If Teku fails to start with a `P2P Port 9000 (TCP/UDP) is already in use. Check for other processes using this port.` error, it means that Teku is trying to use a network port that is already in use.

For example, Teku and Lighthouse both use port 9000 by default for P2P traffic. You can change Teku's default port number with the [`--p2p-port`](../reference/cli/index.md#p2p-port) option.
For example, Teku and Lighthouse both use port 9000 by default for P2P traffic. You can change Teku's default port number with the [`--p2p-port`](../../reference/cli/index.md#p2p-port) option.

## Unable to lock a keystore file

Expand All @@ -27,7 +26,7 @@ Teku uses a file locking mechanism for the keystores to prevent two validator cl
To resolve this issue, try one of the following:

- Set the permissions of the directory holding the keystores so that it is writable by Teku.
- Set [`--validators-keystore-locking-enabled`](../reference/cli/index.md#validators-keystore-locking-enabled) to `false` to disable the locking functionality.
- Set [`--validators-keystore-locking-enabled`](../../reference/cli/index.md#validators-keystore-locking-enabled) to `false` to disable the locking functionality.

:::warning

Expand All @@ -44,7 +43,7 @@ Teku uses a file locking mechanism for the keystores to prevent two validator cl
To resolve this issue, try one of the following:

- Manually remove the lock files that are created alongside your keystore files, with `.lock` appended to the filename. Take care not to delete your keystores.
- Set [`--validators-keystore-locking-enabled`](../reference/cli/index.md#validators-keystore-locking-enabled) to `false` to disable the locking functionality.
- Set [`--validators-keystore-locking-enabled`](../../reference/cli/index.md#validators-keystore-locking-enabled) to `false` to disable the locking functionality.

:::warning

Expand Down Expand Up @@ -87,7 +86,7 @@ If all recent attestations are marked as missed, check the following:

Check the logs when Teku started for the line, `teku-status-log | Loaded N Validators: <validator_pubkey>[, <validator_pubkey>]`, where `N` is the number of expected validators. Each validator's truncated public key is also listed.

If the validator did not load, check for any errors loading the validator, and that the [`--validators-keys`](../reference/cli/index.md#validators-keys) option is correct.
If the validator did not load, check for any errors loading the validator, and that the [`--validators-keys`](../../reference/cli/index.md#validators-keys) option is correct.

- **Is the beacon node still syncing?**

Expand All @@ -101,7 +100,7 @@ If all recent attestations are marked as missed, check the following:

Each validator that you run prints the message, `teku-validator-log | Validator *** Published attestation Count: 1, Slot: 48539, Root: 5e1bf5..cee8` once each epoch. If you do not see this for your validator then check that it loaded correctly.

To see this message, ensure [`log-include-validator-duties-enabled`](../reference/cli/index.md#log-include-validator-duties-enabled) is `true`.
To see this message, ensure [`log-include-validator-duties-enabled`](../../reference/cli/index.md#log-include-validator-duties-enabled) is `true`.

- **Do you have peers?**

Expand Down Expand Up @@ -145,7 +144,7 @@ The shell does not see the tilde (~) in the command. To fix this, omit the equal

<!-- links -->

[Ensure your local network is configured correctly]: find-and-connect/improve-connectivity.md
[Ensure your local network is configured correctly]: ../find-and-connect/improve-connectivity.md
[EIP-2335]: https://eips.ethereum.org/EIPS/eip-2335
[slashed]: ../concepts/slashing-protection.md
[setting a maximum heap size]: ../get-started/manage-memory.md
[slashed]: ../../concepts/slashing-protection.md
[setting a maximum heap size]: ../../get-started/manage-memory.md
184 changes: 184 additions & 0 deletions docs/how-to/troubleshoot/network.md
@@ -0,0 +1,184 @@
---
description: Solve common networking problems encountered with Teku.
sidebar_position: 15
---

# Network issues

## Speed up sync time

Use [checkpoint sync](../../get-started/checkpoint-start.md) to sync Teku from a recent finalized checkpoint, bypassing
the need to sync from genesis and enabling a quick synchronization process within minutes. To do this, use the
[`--initial-state`](../../reference/cli/index.md#initial-state) CLI option which accepts a URL or file that provides a recent
finalized `BeaconState`. Any synchronized beacon node can provide this from the standard API, and you can view
[the list of public sources](https://eth-clients.github.io/checkpoint-sync-endpoints/).

The [`--initial-state`](../../reference/cli/index.md#initial-state) option is only used when you first create a database. To
restart an existing sync process with checkpoint sync, do the following:

- Stop the current Teku sync process
- Delete the `beacon` directory under your [data path](../../reference/cli/index.md#data-base-path-data-path)
- Start Teku with the [`--initial-state`](../../reference/cli/index.md#initial-state) option


Teku syncs within a few minutes and downloads historic blocks in the background, so it can
help any peers that are syncing from genesis. Teku can run validators and attest while historic blocks are being downloaded.
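
The restart steps above can be sketched as a small shell helper. This is a minimal sketch under assumptions: the data path `/var/lib/teku`, the `teku.service` unit name, and the `CHECKPOINT_STATE_URL` variable are hypothetical placeholders. Only the `beacon` directory name and the `--initial-state` and `--data-path` options come from the documentation.

```bash
# Remove only the beacon database under a Teku data path,
# leaving any other directories under that path untouched.
reset_beacon_db() {
  data_path="$1"
  if [ -z "$data_path" ] || [ ! -d "$data_path" ]; then
    echo "data path '$data_path' does not exist" >&2
    return 1
  fi
  rm -rf -- "$data_path/beacon"
}

# Hypothetical usage (service name, path, and URL are assumptions):
#   sudo systemctl stop teku.service
#   reset_beacon_db /var/lib/teku
#   teku --data-path=/var/lib/teku --initial-state="$CHECKPOINT_STATE_URL"
```

The helper refuses to run against a missing path, so a typo in the argument cannot silently delete the wrong directory.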

## Locate the multiaddress and/or ENR of a Teku beacon node

Teku outputs its Ethereum Node Record (ENR) to the logs at startup. You can also get it, along with the node's multiaddresses (`p2p_addresses`), from the REST API:

```bash
curl "http://127.0.0.1:5051/eth/v1/node/identity" | jq
```

You can decode the ENR by using the [ENR Viewer website](https://enr-viewer.com/).

## Resolve peering issues

### Peer connection issues

By default, Teku attempts to get 100 peers. You can increase the number of peers to improve performance, but this does
lead to increased network traffic and a higher number of messages requiring validation.

Teku's attempt to connect with peers is influenced by two CLI options: [`--p2p-peer-lower-bound`](../../reference/cli/index.md#p2p-peer-lower-bound) (default is 64)
and [`--p2p-peer-upper-bound`](../../reference/cli/index.md#p2p-peer-upper-bound) (default is 100). If you notice a
decline in your beacon node's participation after reducing these parameters, consider increasing them to enhance performance.


### Firewall connection issues

To determine the number of inbound and outbound peers via the beacon node's REST API, you can send a request to the peers
endpoint. This gathers data and organizes it based on the direction, either inbound or outbound.

```bash
curl "http://127.0.0.1:5051/eth/v1/node/peers" | jq '.data | group_by(.direction)[] | {direction: .[0].direction, count: length}'
```

If only outbound peers are displayed, it indicates that peers cannot connect to your infrastructure from the outside.
Networks typically have a firewall at the entry point (router / modem / gateway) that blocks incoming data by default.

To resolve this, update the firewall to include a rule that allows access to the [`--p2p-port`](../../reference/cli/index.md#p2p-port) (9000 by default)
for both `UDP` and `TCP` traffic. Subsequently, forward this port (TCP and UDP) to the internal IP address of the machine running the
beacon node. Some operating systems also have local firewalls that should be updated to permit communication through this port.
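
For the local firewall part, the following is a minimal sketch that assumes a Linux host using `ufw`. Other firewalls need equivalent rules. The helper name `p2p_firewall_rules` is hypothetical. It only prints the commands so that you can review them before running them with `sudo`, and it does not configure port forwarding on your router or gateway.

```bash
# Print the ufw commands that open a P2P port for both TCP and UDP.
# The port defaults to 9000, Teku's default --p2p-port.
p2p_firewall_rules() {
  port="${1:-9000}"
  for proto in tcp udp; do
    echo "ufw allow ${port}/${proto}"
  done
}

# Example: show the rules for the default port.
p2p_firewall_rules
```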

:::info

See the [Prysm guide](https://docs.prylabs.network/docs/prysm-usage/p2p-host-ip/) for more information on this topic, substituting your `--p2p-port` (9000 by default) for the port numbers used there.

:::

### Advertised IP address issues

A possible reason for incoming peers being unable to connect could be an incorrect address specified using the
[`--p2p-advertised-ip`](../../reference/cli/index.md#p2p-advertised-ip) option. Teku auto-detects the address to use by
default, so most users won't need to use this option. If you're experiencing issues with incoming peers despite having
correct firewall and forwarding settings, this could potentially be the cause.


### Network gateway issues

A potential reason for incoming peers not being able to connect could be the use of a different port on your network
gateway (router or modem).
This usually happens because only one service can listen on a port. Therefore, if you're running multiple beacon nodes, you'll
need to open multiple ports on your gateway. The simplest solution is to use the same port on your gateway as specified
in your [`--p2p-port`](../../reference/cli/index.md#p2p-port) (9000 by default). However, if necessary, you can also
set the advertised port using the [`--p2p-advertised-port`](../../reference/cli/index.md#p2p-advertised-port) option.

## Resolve poor attestation performance

Troubleshooting poor attestation performance is complicated, and the solution requires you to identify the root cause.

[This video](https://www.symphonious.net/2020/09/08/exploring-eth2-attestation-inclusion/) is a little old, but the general picture is still relevant.

Common issues include:

* **The CPU is overloaded and Teku is lagging**. Monitor CPU stats, and watch the terminal for frequent `regenerating state`
messages, which are common when Teku is struggling. In this situation, enabling [`--p2p-subscribe-all-subnets-enabled`](../../reference/cli/index.md#p2p-subscribe-all-subnets-enabled) makes things worse by raising CPU usage. A common cause is a JVM without enough heap, which leads to
aggressive garbage collection. Set an environment variable such as `JAVA_OPTS=-Xmx5g`;
`5g` (five gigabytes of heap) is a good value, `4g` is acceptable, and anything much lower may lead to problems.

* **Time sync on your server is poor**. Ensure `ntpd` or `chrony` is configured correctly.

* **Low numbers of peers, or poor quality peers**. Refer to [Resolve peering issues](#resolve-peering-issues)
for more information.

* **Poor internet speed**. For example, one user on an ADSL link with only about 2.5 Mbps upstream saw missed
attestations. Typically, anything over 10 Mbps upstream is acceptable.
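
The heap and time sync checks in the list above can be sketched as follows. This is a hedged example that assumes a Linux host with either `chrony` or systemd's `timedatectl` installed. The helper name `check_time_sync` is hypothetical, and `-Xmx5g` is the value suggested in the list.

```bash
# Request a 5 GB JVM heap, keeping any options already present in JAVA_OPTS.
export JAVA_OPTS="${JAVA_OPTS:+$JAVA_OPTS }-Xmx5g"

# Report clock synchronization status with whichever tool is installed.
check_time_sync() {
  if command -v chronyc >/dev/null 2>&1; then
    chronyc tracking
  elif command -v timedatectl >/dev/null 2>&1; then
    timedatectl status
  else
    echo "no chronyc or timedatectl found; check your NTP setup manually" >&2
    return 1
  fi
}
```

Run `check_time_sync` on the machine hosting the beacon node and look for a reported offset that is small and a clock that is marked as synchronized.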


## Address missing attestations or non-inclusion issues

* No peers might have been present on the attestation subnet. Check for a log message when attempting to
publish without subscribed peers: `Failed to publish ... for slot ... due to missing peers on the required gossip topic`.
* Several factors could contribute, such as delayed blocks past your inclusion slot causing ripple effects. Thus, examining
epochs where your attestation was scheduled and checking for late block import warnings would be beneficial.
* Also, consider specific times of day and concurrent network activities. It's possible that message transmission could
be hindered by factors like bandwidth limitations.
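
To check how often the first cause occurs, you can count the matching lines in a saved log file. This is a minimal sketch: the helper name and the log file path are assumptions, and the search string is the fixed tail of the log message quoted above.

```bash
# Count log lines that report a publish attempt with no peers on the gossip topic.
# Prints 0 when there are no matches.
count_missing_peer_failures() {
  log_file="$1"
  grep -c "due to missing peers on the required gossip topic" "$log_file" || true
}

# Hypothetical usage (the path is an assumption):
#   count_missing_peer_failures /var/log/teku/teku.log
```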

## Invalid signer public key configuration

You may see log error messages similar to:

```text
Caused by: java.lang.IllegalArgumentException: Expected 48 bytes but received 58.
```

This error occurs when `validators-external-signer-public-keys` is set in the configuration file and the public keys are not quoted.
In YAML, unquoted `0x`-prefixed values are treated as numbers, so the parser converts them to a binary format that
Teku doesn't expect. Previous Teku versions used a YAML parser that didn't perform this conversion, so both quoted and
unquoted forms worked.

**Incorrect:**
```yaml
validators-external-signer-public-keys:
- 0x8f9335f7d6b19469d5c8880df50bf41c01f476411d5b69a8b121255347f1c0b8400ba31a63010b229080240589ad2423
- 0xb3f3faa8dfa1030714559b95cb0107e53c9ee9c6f2b4b11f29e60417dbc4462052ff2d2dbbe98d808e3093858a3acdcc
- 0xb2f1e6c00c6716d4cd5cb02b42678ff481e3ae1525cdfc33e4a1711eeb2878da10ebeacdcdc2ef2049410fc60fe5cfe5
- 0xb7d6cb9ce7397c33b89ec57de0de383c7c294687b8963f92cc60f59bb1de46c56623cd24c9cc1e407db92d1a79920887
- 0xaf3eab6962987321bdf81e7a10239b91316c643cca64babe81d68e9f9030a6a7b91681168df5a02a9ac3433b8332a712
```

**Correct:**
```yaml
validators-external-signer-public-keys:
- "0x8f9335f7d6b19469d5c8880df50bf41c01f476411d5b69a8b121255347f1c0b8400ba31a63010b229080240589ad2423"
- "0xb3f3faa8dfa1030714559b95cb0107e53c9ee9c6f2b4b11f29e60417dbc4462052ff2d2dbbe98d808e3093858a3acdcc"
- "0xb2f1e6c00c6716d4cd5cb02b42678ff481e3ae1525cdfc33e4a1711eeb2878da10ebeacdcdc2ef2049410fc60fe5cfe5"
- "0xb7d6cb9ce7397c33b89ec57de0de383c7c294687b8963f92cc60f59bb1de46c56623cd24c9cc1e407db92d1a79920887"
- "0xaf3eab6962987321bdf81e7a10239b91316c643cca64babe81d68e9f9030a6a7b91681168df5a02a9ac3433b8332a712"
```
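
To find unquoted keys in an existing configuration file, a quick check along these lines may help. This is a rough sketch: the helper name and the file path are assumptions, and the pattern flags any bare `0x` list item in the file, not only those under `validators-external-signer-public-keys`.

```bash
# Print, with line numbers, YAML list items that are bare 0x-prefixed hex values,
# that is, values missing their surrounding quotes.
find_unquoted_hex_keys() {
  config_file="$1"
  grep -nE '^[[:space:]]*-[[:space:]]+0x[0-9a-fA-F]+[[:space:]]*$' "$config_file" || true
}

# Hypothetical usage (the path is an assumption):
#   find_unquoted_hex_keys /etc/teku/config.yaml
```

No output means that no unquoted hex list items were found.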

## Teku crashes with SIGILL

A `SIGILL` crash usually means that Teku is using the optimized BLST library on a CPU that doesn't support it, instead of the
portable version. This can be caused by a CPU auto-detection error, in which case sharing the CPU details from `/proc/cpuinfo` on Linux or `/usr/sbin/sysctl -a` on macOS
helps us improve the detection. Alternatively, you might have explicitly forced the optimized version.

You can specifically request the portable version of BLST (overriding CPU detection) with the following:

```bash
JAVA_OPTS="-Dteku.portableBlst=true"
```

If you have already set `-Dteku.portableBlst=false`, change it to `true`.

## Force Teku to use the optimized BLST library

Check the Teku logs at startup for `Using optimized BLST library` if it was able to detect a compatible CPU, or
`Using portable BLST library` if it could not.

You can force Teku to use the optimized version by setting the environment variable `TEKU_OPTS="-Dteku.portableBlst=false"`.
If you're already setting `TEKU_OPTS` or `JAVA_OPTS`, append `-Dteku.portableBlst=false` to the existing variable. If
you use the optimized library on a CPU that doesn't support it, Teku will crash with a `SIGILL`, in which case you should
switch back to the portable version (`TEKU_OPTS="-Dteku.portableBlst=true"`).
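
The log check and the "append to the existing variable" step can be sketched as two small helpers. This is a hedged example: the helper names and the log file path are hypothetical, and `blst_mode` assumes that the log file covers a single startup.

```bash
# Report which BLST build Teku selected, based on the startup log messages.
blst_mode() {
  log_file="$1"
  if grep -q "Using optimized BLST library" "$log_file"; then
    echo "optimized"
  elif grep -q "Using portable BLST library" "$log_file"; then
    echo "portable"
  else
    echo "unknown"
  fi
}

# Append a JVM system property to TEKU_OPTS without discarding existing options.
append_teku_opt() {
  TEKU_OPTS="${TEKU_OPTS:+$TEKU_OPTS }$1"
  export TEKU_OPTS
}

# Hypothetical usage (the path is an assumption):
#   blst_mode /var/log/teku/teku.log
#   append_teku_opt "-Dteku.portableBlst=false"
```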

## Configure an archive node

Set [`--data-storage-mode`](../../reference/cli/index.md#data-storage-mode) to `archive` and provide an
[`--initial-state`](../../reference/cli/index.md#initial-state). You can also use
[`--reconstruct-historic-states`](../../reference/cli/index.md#reconstruct-historic-states) to rebuild
all the old states once blocks have been downloaded.

Building up the node takes a while, but once it completes you can access all state and block information back to
genesis.
2 changes: 1 addition & 1 deletion docs/reference/cli/index.md
Expand Up @@ -1911,7 +1911,7 @@ p2p-subscribe-all-subnets-enabled: true

Forces the beacon node to stay subscribed to all subnets regardless of the number of validators. The default is `false`.

When set to `false` and running a low number of validators, Teku subscribes and unsubscribes from subnets as needed for the running validators.
When set to `false`, Teku subscribes to two persistent subnets regardless of the number of validators. Teku also subscribes and unsubscribes from subnets as needed for the running validators.

This option is primarily for users running an external validator client and load balancing it across multiple beacon nodes. Without this flag, depending on how requests are load balanced, the beacon nodes may not have subscribed to the required subnets and be unable to produce aggregates.

2 changes: 1 addition & 1 deletion docs/tutorials/configure-external-signer-tls.md
Expand Up @@ -175,4 +175,4 @@ teku --network=goerli \
[Hyperledger Besu]: https://besu.hyperledger.org/development/public-networks/get-started/install
[Infura]: https://infura.io/
[ETH1 Goerli node]: https://besu.hyperledger.org/development/public-networks/get-started/start-node#run-a-node-on-goerli-testnet
[Web3Signer slashing protection]: https://docs.web3signer.consensys.net/en/latest/Concepts/Slashing-Protection/
[Web3Signer slashing protection]: https://docs.web3signer.consensys.net/en/latest/concepts/slashing-protection/
7 changes: 5 additions & 2 deletions docusaurus.config.js
Expand Up @@ -39,7 +39,7 @@ const config = {
routeBasePath: "/",
path: "./docs",
includeCurrentVersion: true,
lastVersion: "23.8.0",
lastVersion: "23.9.1",
versions: {
//defaults to the ./docs folder
// using 'development' instead of 'next' as path
Expand All @@ -48,8 +48,11 @@ const config = {
path: "development",
},
//the last stable release in the versioned_docs/version-stable
"23.9.1": {
label: "stable (23.9.1)",
},
"23.8.0": {
label: "stable (23.8.0)",
label: "23.8.0",
},
},
// @ts-ignore

0 comments on commit 2574ef7
