Skip to content

Commit

Permalink
clean up
Browse files Browse the repository at this point in the history
  • Loading branch information
insumity committed Sep 6, 2023
1 parent 9a673cc commit a2d180e
Showing 1 changed file with 73 additions and 58 deletions.
131 changes: 73 additions & 58 deletions docs/docs/adrs/adr-013-equivocation-slashing.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,16 +14,16 @@ Proposed
We present some approaches on how we can slash a validator on the provider chain for
an equivocation performed on the consumer chain. Currently, we can receive [evidence of equivocation](https://github.com/cosmos/interchain-security/pull/1232),
but we do not have functionality to slash the misbehaving validator on the provider chain.
In what follows, we first explain how slashing is performed in the case of a single chain, to show why slashing on
the provider chain for a consumer equivocation is a challenging problem.
In what follows, we first explain how slashing is performed on a single chain, to show why slashing on
the provider chain for a consumer equivocation is a challenging problem before proposing a potential solution.

### Single-chain slashing
Slashing is implemented across the [slashing](https://docs.cosmos.network/v0.50/modules/slashing)
and [staking](https://docs.cosmos.network/v0.50/modules/staking#msgcreatevalidator-1)
and [staking](https://docs.cosmos.network/v0.50/modules/staking)
modules.
The slashing module’s [keeper.go](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/slashing/keeper/keeper.go#L1) simply
calls
the staking module’s [Slash](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L37) with
the staking module’s [Slash](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L37) method, passing
among others, the `infractionHeight` (i.e., height at which the equivocation occurred), the validator’s `power`, and
the `slashFactor` (5% in case of equivocation in Cosmos Hub -- see `gaiad query slashing params`).

Expand All @@ -37,18 +37,18 @@ this undelegation is **not** slashed, otherwise the undelegation is [slashed](ht
The slashing of redelegations happens in a similar way, meaning that `Slash` goes through all redelegations and checks on whether
the redelegation started before or after the `infractionHeight`.

We believe that one of the "principles" behind using `infractionHeight` to decide on whether to slash a delegation or redelegation
We believe that one of the ideas behind using `infractionHeight` to decide on whether to slash an undelegation or redelegation
has to do with the fact that we want to slash only delegators whose voting power _contributed_ to the infraction.
However, this is a rather obscure idea because a delegator `D` could start unbonding at height `H` and then a validator
could perform an equivocation before `H` intentionally. In such a case `D` would still get slashed, even though `D`
could intentionally perform an equivocation in the past (before `H`). In such a case, delegator `D` would still get slashed, even though `D`
did not contribute any voting power per se for the equivocation. Furthermore, this principle is not respected in general
(see "Slashing delegations").
(see "Slashing delegations" below).

#### Slashing delegations
Besides undelegations and redelegations, we need to slash simple delegations on the validator.
This is performed by [deducting the appropriate amount of tokens](https://github.com/cosmos/cosmos-sdk/blob/5ca405ae067e7d8df98699f675e060f70a549976/x/staking/keeper/slash.go#L165)
from the validator. Note that this deduction is computed based on the voting `power` the misbehaving validator
had at the time of the infraction. As a result of the tokens deduction,
had at the height of the equivocation. As a result of the tokens deduction,
the [tokens per share](https://docs.cosmos.network/v0.46/modules/staking/01_state.html#delegator-shares)
reduce and hence later on, when delegators undelegate or redelegate, the delegators retrieve back less
tokens, effectively having their tokens slashed. This approach of slashing delegations does not utilize the
Expand All @@ -57,11 +57,12 @@ tokens, effectively having their tokens slashed. This approach of slashing deleg
2. a new delegator `D` delegates to `V` after height `Hi`
3. we receive the evidence of the equivocation by validator `V`
4. we slash the tokens of delegator `D`

In the above scenario, delegator `D` is slashed, even though `D`'s voting power did not contribute to the infraction.


#### Old evidence
In the single-chain case, we never receive old evidence (e.g., from 3 years ago). This is achieved through
In the single-chain case, we never act on old evidence (e.g., from 3 years ago). This is achieved through
[CometBFT](https://docs.cometbft.com/v0.37/spec/consensus/evidence) that filters old evidence based on the parameters
`MaxAgeNumBlocks` and `MaxAgeDuration` (see [here](https://github.com/cometbft/cometbft/blob/ae9826ed75ee411c0d809797ce209ee770c15c4f/evidence/pool.go#L266)).
In Cosmos Hub, the `MaxAgeNumBlocks` is set to 1000000 (i.e., ~70 days if we assume we need ~6 sec per block) and `MaxAgeDuration`
Expand All @@ -70,16 +71,18 @@ is set to 172800000000000 ns (i.e., 2 days). Because of this check, we can easil

### Slashing on the provider
We see that in the single-chain slashing case, we use the `infractionHeight` and the voting `power` to be able to slash.
In order to slash the provider chain for a consumer chain infraction we need to have the provider's `infractionHeight`
and voting `power`. However, we do **not** have those values. We only have the `infractionHeight` in the consumer chain,
but we do not know to what provider height it corresponds. Unless we have a way to find the corresponding `infractionHeight`
and `power` in the provider chain, we cannot slash as we slash in the single-chain case.

Additionally, we make the assumption that the consumer chain could be _malicious_ and hence we cannot trust any
height-to-height correspondence it might inform us about.

This message could be sent through a relayer ...
Note that when we receive evidence through a [MsgSubmitConsumerDoubleVoting](https://github.com/cosmos/interchain-security/pull/1232/files) message,
In order to slash the provider chain for a consumer chain equivocation we need to have the provider's `infractionHeight`
and voting `power`. However, we do **not** have those values. We only have the `infractionHeight` in the consumer chain
(that we can extract from the votes), but we do not know to what provider height this `infractionHeight` corresponds.
Unless we have a way to find the corresponding `infractionHeight`
and `power` in the provider chain, we cannot directly slash in the same way as we slash in the single-chain case.

Someone might think that the problem of figuring out the corresponding `infractionHeight` and `power` values in
the provider chain is easy because we could have the consumer chain send us this information. However, we
consider that the application on the consumer chain could be _malicious_ and hence we
cannot really trust anything that stems from the _application state_ of the consumer chain.

Note that when a relayer or a user sends evidence through a [MsgSubmitConsumerDoubleVoting](https://github.com/cosmos/interchain-security/pull/1232/files) message,
we get what is contained in the [DuplicateVoteEvidence](https://github.com/cometbft/cometbft/blob/ae9826ed75ee411c0d809797ce209ee770c15c4f/types/evidence.go#L36):
```protobuf
type DuplicateVoteEvidence struct {
Expand All @@ -92,25 +95,33 @@ type DuplicateVoteEvidence struct {
Timestamp time.Time
}
```
Note that the "abci specific information" is not useful because they are not signed and hence could be anything. Therefore,
we cannot use the `validatorPower` in any way in our slashing in the provider chain. We can get the `infractionHeight`
The "abci specific information" cannot be trusted because they are not signed. Therefore,
we cannot use the `ValidatorPower` in any way for slashing in the provider chain. We can get the `infractionHeight`
from the votes but this corresponds to the infraction height on the consumer chain. Furthermore, note that the
[Timestamp](https://github.com/cometbft/cometbft/blob/ae9826ed75ee411c0d809797ce209ee770c15c4f/types/vote.go#L55) in the votes
is just the [BFT time](https://github.com/cometbft/cometbft/blob/main/spec/consensus/bft-time.md), and it could be anything
since a misbehaving validator could have included any time in it.
is just the [BFT time](https://github.com/cometbft/cometbft/blob/main/spec/consensus/bft-time.md), and the time
could be anything since a misbehaving validator could have included any time in the votes.

Finally, note that in the single-chain case, we trust the underlying consensus engine (CometBFT) and hence when we receive
evidence from CometBFT we can act on it. In the case of provider and consumer chains however we cannot trust the evidence as is.



#### Clock drift between the provider and the consumer chain
(Still thinking on this but adding here as a first pointer to some ideas)
Conceptually, we have 2 different chains and there’s no guarantee that the clock drift betweeen them is bounded.
One could have BFT time of 2023 and the other from 2013. Having said this however, the chains communicate over IBC and
hence some timing constraints are imposed by IBC otherwise the chains could not communicate. IBC clients contain a
`maxClockDrift` parameter but this is [only used](https://github.com/tendermint/tendermint/blob/ded310093e0d771c9ed27f296921cb6b23d99f29/light/verifier.go#L253-L256)
to reject headers that come slightly from the future. For example, if we assume `maxClockDrift` is 10 seconds in the
light client of a chain `A` and chain `A` has latest block time `12:58:31` and then `A` gets a header for one of its
light clients that has time `12:58:42` that is 11 seconds later, then this client update would get rejected by chain A.
However, even with this constraint we still do not know the real time of the chain `B` it could be 5 years ahead

`trustingPeriod` only tells us from the provider side that we have received a message from the other chain in the

Conceptually, we have 2 different chains and there’s no guarantee that the clock drift between them is bounded.
One could have BFT time of 2023 on one chain and on the other chain a BFT time of 2013.
Nevertheless, the chains communicate over IBC and
hence some timing constraints are imposed by IBC otherwise the chains could not communicate (e.g., light clients could expire).
IBC clients contain a `maxClockDrift` parameter but this is [only used](https://github.com/tendermint/tendermint/blob/ded310093e0d771c9ed27f296921cb6b23d99f29/light/verifier.go#L253-L256)
to reject headers that come slightly from the future. For example, assume a chain `A` that has a light client `lc` that
tracks chain `B` and that has `maxClockDrift` of 10 seconds. If chain `A` has time `12:58:31` and
then `lc` receives a header with time `12:58:42` (that is 11 seconds from the future), then `lc` would reject his client update.
However, even with this `maxClockDrift` constraint we still do not know the time at the chain `B`, e.g., `B`'s BFT time could be 5 years ahead
compared to chain's `A` BFT time.

(TODO: add more on the `trustingPeriod`) only tells us from the provider side that we have received a message from the other chain in the
last `trustingPeriod` (2 weeks in Cosmos Hub) before ... so doesn't add much either ...

We currently investigate on whether we can bound the clock drift between the provider and the consumer chain.
Expand All @@ -123,14 +134,13 @@ In this approach, at the moment we receive evidence, at _evidence height_, we:
2. slash all delegations using as voting `power` the sum of the voting power of the misbehaving validator and the
power of all the ongoing undelegations and redelegations.

**Evidence expiration:** Additionally, in this approach and because of what we explain earlier, because we cannot infer
the actual time the evidence was created in comparison to the provider's time, we we do not consider _evidence expiration_
and hence we can receive evidence from a time in the past (e.g., 3 years ago). As mentioned the timestamp inside a vote could contain
anything.
**Evidence expiration:** Additionally, in this approach and because of what we explained earlier, because at the moment
we cannot infer the actual time the evidence was created (i.e., timestamp in evidence could contain anything), we do not consider
_evidence expiration_ and hence we can receive and act on evidence from a time in the past (e.g., 3 years ago).

### Implementation
The aggressive approach allows for multiple simplifications. Specifically, we can introduce the following snippet
in the [HandleConsumerDoubleVoting(](https://github.com/cosmos/interchain-security/pull/1232) method:
in the [HandleConsumerDoubleVoting](https://github.com/cosmos/interchain-security/pull/1232) method:
```go
undelegationsTotalAmount := sdk.NewInt(0)
for _, v := range k.stakingKeeper.GetUnbondingDelegationsFromValidator(ctx, validatorAddress) {
Expand All @@ -153,24 +163,29 @@ slashFraction := k.slashingKeeper.SlashFractionDoubleSign(ctx)
k.stakingKeeper.Slash(ctx, validatorConsAddress, infractionHeight, totalPower, slashFraction, DoubleSign)
```

**Infraction height:** We provide `infractionHeight` to the [Slash](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L37)
method to slash all ongoing undelegations and redelegations.
**Infraction height:** We provide a zero `infractionHeight` to the [Slash](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L37)
method in order to slash all ongoing undelegations and redelegations (see checks in [Slash](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L107),
[SlashUnbondingDelegation](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L236), and
[SlashRedelegation](https://github.com/cosmos/cosmos-sdk/blob/5621d9d80736d6025c0f73263947e543fe88793f/x/staking/keeper/slash.go#L282)).

**Power:** We pass the sum of the voting power of the misbehaving validator and the
power of all the ongoing undelegations and redelegations. This is a slightly more aggressive approach than just providing
the voting `power` at the evidence height. If we assume that the `slashFactor` is 5%, then the `power` we would pass
would be `validatorPower + 0.05 * totalPower(undelegations) + 0.05 * totalPower(redelegations)`. Hence when the `Slash`
the voting `power` at the evidence height. If we assume that the `slashFactor` is 5%, then the `power` we pass
is `validatorPower + totalPower(undelegations) + totalPower(redelegations)`. Hence, when the `Slash`
method slashes all the undelegations and redelegations it would end up with
`validatorPower + 0.05 * totalPower(undelegations) + 0.05 * totalPower(redelegations) - 0.05 * totalPower(undelegations) - 0.05 * totalPower(redelegations) = validatorPower`
`0.05 * validatorPower + 0.05 * totalPower(undelegations) + 0.05 * totalPower(redelegations) - 0.05 * totalPower(undelegations) - 0.05 * totalPower(redelegations) = 0.05 * validatorPower`
and hence it would slash 5% of the `validatorPower` at evidence height.

**Storing evidence:** As mentioned, we do not expire evidence in this approach. However, because we do not want to
**Storing evidence:** As mentioned, we do not expire evidence with this aggresive approach. However, because we do not want to
slash the same validator more than once for the same infraction, we have to store the evidence in some _cache_ in order
to check if we have already slashed based on received evidence. We could do the following:
to check if we have already slashed based on received evidence. We could do something similar to the following:

```go

key := hashEvidence(dve DuplicateVoteEvidence) // would only use the signed VoteA and VoteB to generate the hash from the evidence
// hashEvidence should only use the signed parts of the evidence to generate the hash
// therefore, it should only use VoteA and VoteB to avoid someone changing other parts DuplicateVoteEvidence
// and resubmitting
key := hashEvidence(dve DuplicateVoteEvidence)

if key not in cache {
store (key, validatorAddr) in cache
Expand All @@ -181,30 +196,30 @@ if key not in cache {
```
To prevent this cache from growing arbitrarily big, we can introduce some additional checks. For example, if the evidence
is for a validator that is not in the active validator set we can skip from storing it in the cache, etc.
Cleaning up of the cache could also be periodically performed by checking if the validator `validatorAddr` of a cache entry
is still in the active set, if not we remove that entry from the cache.
We can also periodically clean up the cache by checking if the validator `validatorAddr` of a cache entry
is still in the active set and if not we remove that entry from the cache.

### Positive
With the proposed approach we can quickly implement slashing functionality for consumer chains on the provider chain.
This approach does not need any changes in the `staking` module and therefore does not change in any way how slashing is performed
With the proposed approach we can quickly implement slashing functionality on the provider chain for consumer chain equivocations.
This approach does not need any changes in the staking module and therefore does not change in any way how slashing is performed
today for a single chain.

### Negative
We _definitely_ slash more when it comes to undelegations and redelegations because we slash all of them.
We _definitely_ slash more when it comes to undelegations and redelegations because we slash for all of them without
considering an `infractionHeight`.
We _potentially_ slash more than what we would have slashed if we knew the corresponding `votingPower` in the provider
chain.

### Neutral



## Can we expire the evidence?
This is orthogonal to the above solution and describes a way to check whether evidence has expired or not.
Lamport clocks.
This is _orthogonal_ to the above solution and describes a way to check whether evidence has expired or not.

(TODO: Add Josef's idea on using Lamport clocks to expire old evidence.)

## References

> Are there any relevant PR comments, issues that led up to this, or articles referenced for why we made the given design choice? If so link them here!

* {reference link}
## References
* [feat: add handler for consumer double voting #1232](https://github.com/cosmos/interchain-security/pull/1232#event-10206162750)
* [Cryptographic equivocation slashing design](https://forum.cosmos.network/t/cryptographic-equivocation-slashing-design/11400/1)
* [Update client may cause "new header has a time from the future" chain error #1445](https://github.com/informalsystems/hermes/issues/1445)

0 comments on commit a2d180e

Please sign in to comment.