# Auditing & Formalising Hydra Protocol ## Problem Statement We want to: * Have a common language between researchers and engineers, in order to ensure the safety guarantees stated in the paper are actually provided by the implementation, * Help with the auditing and certification process from third-party, * Validate changes to the protocol introduced by the implementation or by new feature w.r.t. to the specification, * Provide strong evidence to users Hydra is actually safe and can be entrusted with their funds. To this end, we are looking for a way to formalise the Hydra Head Protocol ## Current Situation ### Specification In the [original Head paper](), Various desired properties are stated (in section 6.1) within an "experiment" setting describing the behaviour of the system in the presence of an _adversary_ with some variations in the degree of corruption possible: * Consistency * (Conflict-Free) Liveness * Soundness * Completeness Those properties link all three specifications presented above and in essence state the _requirements_ the Head protocol needs to meet. Note that only the `Close`, `Contest` and `Final` transitions of the on-chain state are considered in the security "game". The (Simple) Hydra Head protocol is specified in pseudocode, defining both: 1. The Off-chain protocol producing consensus about the _Snapshots_ (eg. UTxO) when the Head is open, ![Off-Chain Protocol Example](./off-chain.png) 2. The _Client-to-Head Interaction_ protocol defining the effect external inputs have on the system, ![Client-to-Head Interaction Example](./chi.png) 3. The _On-Chain Validation_ "smart contracts" mixing both validation of inputs (eg. _redeemers_) and production of new state (eg. _datums_). ![On-Chain Validation Example](./ocv.png) The properties are then proven in subsection 6.4. ### Implementation The actual implementation of the Hydra Head protocol in this repository departs from the specification in the following: * The On-Chain validators are implemented in Plutus which requires validators to be _boolean functions_, * The on-chain state change is actually produced off-chain, by the entity crafting the transaction, * The implemented off-chain protocol is the so-called _Coordinated Head Protocol_ which lets the leader be entirely responsible of which ones among the _seen_ transactions shall be part of the produced _snapshot_, * This removes the need for confirming every single transaction hence removes the associated messages exchange in the protocol, * We implement an "optimisation" stated in a footnote on page 21 whereby all messages are broadcasted to all nodes and each node does the confirmation independently from all the individual messages, thus removing another message. The code has an [extensive test suite](https://github.com/input-output-hk/hydra-poc/wiki/Testing-Strategy), comprising unit tests, integration and end-to-end tests, and property tests. For the on-chain validators we have implemented specific property tests that introduce _Mutations_ to otherwise "healthy" transactions in order to test-drive and check each transaction's type behaviour. We are aware we are missing some intermediate property tests checking the state machine transition's logic, both for off-chain and on-chain protocol. ## Analysis We have been investigating various solutions, and discussed with other teams what promising avenues there could be to solve our problem. ### Pseudo-code in Documentation We have experimented with retro-engineering the actual implementation in the pseudo-code notation from the paper, in order to get a sense of how much effort this would imply and whether or not this would be useful: ![Embedded LaTeX](./haddock-latex.png) * The result looks identical to pseudo-code researchers use and is defined close enough to source code it is easier to maintain a high-fidelity transcription over time should the code, or the specification, changes. * The notation is however plain LaTeX and is not amenable to any kind of mechanised analysis, reasoning, or extraction ### Refinement * In a meeting with researchers, we discussed _refinement-based_ approaches where one _derives_ a concrete executable program from an abstract specification by successive refinements of models and proofs that the derived model maintain the invariants or properties of the model it is derived from. * This is how "first-generation" formal methods like [B](https://www.atelierb.eu/en/presentation-of-the-b-method/) or [Z](https://en.wikipedia.org/wiki/Z_notation) worked * [Coq](https://coq.inria.fr/) is the somewhat _de facto_ standard nowadays for this kind of approach as it provides way to extract Haskell code out from Coq-proven functions * Other options exist, among which some are already in use within IOG: [Agda](https://wiki.portal.chalmers.se/agda/pmwiki.php), [Isabelle/HOL](https://isabelle.in.tum.de/) or [TLA+](https://lamport.azurewebsites.net/tla/tla.html) although not all of them provide a way to compile the model down to executable code * We have a [Plutus metatheory](https://github.com/input-output-hk/plutus/tree/master/plutus-metatheory) in Agda so this could form the underlying semantics against which formalising actual contracts' behaviour and prove properties about them ### Quviq * Quviq develops the [quickcheck-dynamic](https://github.com/input-output-hk/plutus-apps/tree/main/quickcheck-dynamic) framework whose usage is detailed in the [Plutus documentation](https://plutus-apps.readthedocs.io/en/latest/plutus/tutorials/contract-testing.html). This framework leverages the `Emulator` infrastructure from `plutus-apps` and thus tailored to DApps that are based on the PAB and `Contract` monad which is not the case for Hydra (we talk [directly](https://github.com/input-output-hk/hydra-poc/blob/master/hydra-node/src/Hydra/Chain/Direct.hs#L471) to the chain) * Quviq has partnered with us to adapt the `Contract` monad to the way we create and observe transactions, so that we can use the quickcheck-dynamic framework to model Hydra protocol and check its properties ### Ledger We had a discussion with Jared and Tim from Ledger team, about the way they approached this problem * In the ledger, the specifications are defined in [LaTeX](https://github.com/input-output-hk/cardano-ledger/tree/master/eras/alonzo/formal-spec) then translated to Haskell as [predicates](https://github.com/input-output-hk/cardano-ledger/blob/master/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Rules/Utxow.hs#L231) and [transition rules](https://github.com/input-output-hk/cardano-ledger/blob/master/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Rules/Utxow.hs#L360). * Those rules are then exercised with [property tests](https://github.com/input-output-hk/cardano-ledger/blob/master/eras/alonzo/test-suite/test/Test/Cardano/Ledger/Alonzo/PropertyTests.hs#L66) that generate a complete _chain trace_ (eg. sequence of transactions) and ensure the invariants are preserved * This process has its shortcomings as the researchers have no direct way to relate specifications to actual code We discussed various other options to avoid the desynchronisation between formal specification and code problem: * [OTT](https://www.cl.cam.ac.uk/~pes20/ott/ott-jfp.pdf) is a formalism that allows one to write a single specification and then generate various artefacts from it, including LaTeX and executable code (OCaml for example) * Tim shared a Haskell-centric approach he practiced on some [example code](https://github.com/TimSheard/haskell-work/blob/master/IncrementalLambdaCalculusForMapTake2.hs#L36) whereby: 1. The specification is written as a Haskell _typeclass_ defining an interface, 2. The typeclass embeds `Property`-returning functions like: ```.haskell propZero:: (Show t,Eq t) => t -> Property propZero x = (applyDiff @t) x (zero @t) === x ``` 3. Those properties can be tested against actual implementation by instantiating types: ```.haskell testProperty "propZero Map" (propZero @(Map Int Int)) ``` ### Djed We had a discussion with Jean-Frédéric Étienne and James Chapman on how this problem has been approached in the _Djed algorithmic stablecoin_. * The Djed model defined in the ![paper](https://eprint.iacr.org/2021/1069.pdf) has been formalised in [Lustre](https://www-verimag.imag.fr/The-Lustre-Programming-Language-and) and its properties model-checked using [Kind 2](https://kind.cs.uiowa.edu/kind2_user_doc/index.html). The model-checking process uncovered issues and also led to formulation of more properties. * Those properties have been analysed and enumerated in a document, and then materialised as Jira tickets: ![Djed example property](./djed-property.png) * Then for each of those, a Quickcheck property is implemented ```.haskell {-| Check stablecoin test objective #18 (https://input-output.atlassian.net/browse/DS-124) (use after an unsuccessful buying reservecoin order with insufficient Ada for buying RC response) -} checkSTO18 :: (MonadError StableCoinTestError m, MonadWriter OrderTestLog m, MonadReader DeployedStablecoin m) => OrderSummary -- ^ Summary of the order that was executed -> SCBankState -- ^ Bank state before executing the order -> SCBankState -- ^ Bank state after executing the order -> m () checkSTO18 OrderSummary{boPeggedCurrencyExchangeRate, boOrderTxOuts} oldState _newState = do StableCoinParams{scBaseFee, scDefaultReserveCoinPrice} <- asks (sqsStableCoinParams . dsQueryState) let buyRCPrice = max scDefaultReserveCoinPrice equityPrice equity = R.fromInteger (scReserve oldState) P.- min (R.fromInteger (scReserve oldState)) (boPeggedCurrencyExchangeRate P.* R.fromInteger (scN_SC oldState)) equityPrice = if (scN_RC oldState > 0) then equity P.* R.recip (R.fromInteger (scN_RC oldState)) else P.zero adaFromOrderUtxo (SCOrder{scAction=BuyRC _ totalBuyingPrice}, _, _, _) = Just totalBuyingPrice adaFromOrderUtxo _ = Nothing case boOrderTxOuts of [adaFromOrderUtxo -> Just a] | R.fromInteger a P.* (R.recip $ buyRCPrice P.* (P.one P.+ scBaseFee)) < P.one -> pure () | otherwise -> throwError (PropertyFailed "STO18: Deposited Ada for buying order / ( BuyRCPrice * (1 + baseFee) ) < 1 ") _ -> throwError $ TestFailed "Expected one BuyRC order txout" logTestObjective STO18 ``` * JF insisted on: * The importance of careful "manual" analysis of the specification, the related properties _and_ the code, to ensure consistency and completeness, eg. relying on automation is not enough to ensure correctness of the code w.r.t. the specification, * The benefits of writing the properties, even without much formalism, using plain English and text. ## Synthesis The following picture tries to synthesize a formalisation process for Hydra Head, based on the above discussions. * Dashed arrows represent manual translation or inspection, solid arrows automatic use or derivation, * Rectangle boxes represent _artifacts_ whereas rounded boxes represent processes or tools ![Hydra Formalisation Process](./hydra-formalisation-overview.jpg) * In the short term, we should focus on specifying properties based on the requirements expressed in the paper, and defining suitable generators from a model of the environment * The first and most important step should be to state the expected properties of the Head protocol, both off- and on-chain * The generators should be expressed at the level of the whole behaviour of a Hydra Head, eg. generating complete (prefix) of expectedly valid execution traces, based on _Adversarial_ assumptions and expected usage. Note that this includes a suitable model of the chain * The properties should reflect as faithfully as possible the formal requirements stated in the paper. We could start with the haddock-based approach and explore possible "pretty-printing" of properties to be as close as possible to mathematical notations * In the long run, we should invest in building a formal model of the Hydra Head protocol from which we could refine (parts of) the protocol implementation and formally state and prove expected properties