Coding Standards

Foreword

This file contains agreed-upon coding standards and best practices, as well as proposals for changes or new standards. Also, the concept and topics are shamelessly stolen from the Adrestia Wiki (thanks guys!).

Proposals are prefixed with [PROPOSAL] until discussed and voted on by the Hydra team. To be accepted, a practice must be approved by a majority + 1 of the votes, with neutral votes counting as positive votes.

Each proposal should start with a section justifying the standard with rational arguments. When it makes sense, we should also provide good and bad examples to make the point clearer.

General

Use Red Bin to track and address "defects"

Use a Red Bin to quickly identify, analyse, track and resolve defects in the process.

When a defect is spotted, place a sticky in the Red Bin area of our Miro board. A "defect" is anything that gets in our way: inconsistencies in formatting, long build times, flaky tests, unclear names, poorly understood code or tools...
If there's already a similar issue, add a 👍 emoji to the sticky
Work on the defect as soon as possible
Optional: link to an entry in the Logbook explaining the issue and, if possible, analysing it: why it matters and what possible solutions there are

Why

Identifying and resolving defects quickly and relentlessly is a key practice to improve quality of both the process and the product

Git

Commit message format

Here are seven rules for great git commit messages:

Separate subject from body with a blank line
Limit the subject line to 50 characters (soft limit)
Capitalize the subject line
Do not end the subject line with a period
Use the imperative mood in the subject line and suffix with ticket number if applicable
Wrap the body at 72 characters (hard limit)
Use the body to explain what and why vs. how

Why

Git commit messages are our only source of why something was changed the way it was changed. So we had better make them readable, concise and detailed (when required).
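As an illustration, the rules that can be checked mechanically could be sketched as a small checker. This is only a sketch: the Violation type and checkMessage function are hypothetical names, not part of our tooling, and the imperative mood, ticket suffix and "what and why" rules still need a human reviewer.

```haskell
import Data.Char (isUpper)

-- | Hypothetical names for the mechanical rules a message can break.
data Violation
  = SubjectTooLong -- more than 50 characters (soft limit)
  | SubjectNotCapitalized
  | SubjectEndsWithPeriod
  | MissingBlankLine -- a body follows the subject without a blank line
  | BodyLineTooLong -- more than 72 characters (hard limit)
  deriving (Eq, Show)

-- | Check a commit message against the rules that can be checked
-- mechanically. An empty message is reported as not capitalized.
checkMessage :: String -> [Violation]
checkMessage msg =
  case lines msg of
    [] -> [SubjectNotCapitalized]
    subject : rest ->
      concat
        [ [SubjectTooLong | length subject > 50]
        , [SubjectNotCapitalized | not (startsUpper subject)]
        , [SubjectEndsWithPeriod | endsWithPeriod subject]
        , [MissingBlankLine | blankLineMissing rest]
        , [BodyLineTooLong | any ((> 72) . length) rest]
        ]
 where
  startsUpper s = case s of
    c : _ -> isUpper c
    [] -> False
  endsWithPeriod s = not (null s) && last s == '.'
  blankLineMissing ls = case ls of
    l : _ -> not (null l)
    [] -> False

main :: IO ()
main = do
  -- A message following the rules yields no violations.
  print (checkMessage "Add snapshot decoding\n\nExplain what changed and why.")
  -- Lower-case subject, trailing period, no blank line before the body.
  print (checkMessage "fixed stuff.\nno blank line before the body")
```

A check like this could run in a commit hook, but the soft limit on the subject line probably deserves a warning rather than a rejection.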

Haskell

Use automatic code formatting

In general, we will use a code formatting tool and enforce its usage by CI or commit hooks. Refer to the appropriate configuration file in our repository. Any change to which tool and configuration is used shall be discussed and voted upon in the corresponding pull request. Things which are not covered by our tool of choice are also mentioned and voted on in here.

Why

Focus on content and logic when writing and reviewing code

Avoid discussions (and bikeshedding) about code format (e.g. indentation) on pull requests

Use only a single blank line between top-level definitions

ℹ️ Should be enforced by code formatting tool (e.g. Fourmolu).

A source code file should not contain multiple consecutive blank lines.

Use only a single blank line between the following top-level definitions:

function definitions
data type definitions
class definitions
instance definitions

Why

Consistency with other Haskell code.

Excessive vertical space increases the amount of unnecessary scrolling required to read a module.

Examples

-- BAD
newtype Foo = Foo Integer
    deriving (Eq, Show)



newtype Bar = Bar Integer
    deriving (Eq, Show)

-- GOOD
newtype Foo = Foo Integer
    deriving (Eq, Show)

newtype Bar = Bar Integer
    deriving (Eq, Show)

-- BAD
instance FromCBOR Block where
    fromCBOR = Block <$> decodeBlock



newtype BlockHeader = BlockHeader
    { getBlockHeader :: Primitive.BlockHeader
    } deriving Eq

-- GOOD
instance FromCBOR Block where
    fromCBOR = Block <$> decodeBlock

newtype BlockHeader = BlockHeader
    { getBlockHeader :: Primitive.BlockHeader
    } deriving Eq

Naming record fields

We do use short and concise names for fields in record data types.
We use DuplicateRecordFields when needed to avoid conflicts.
We use type annotations, getField or generic-lens to disambiguate record field access.

Why

Lets us use the most suitable name for a record field without restrictions.

This allows deriving instances for pretty-printing, serialization etc. with a minimal amount of boilerplate.

Avoids redundancy and the errors that come with it

Examples

-- GOOD
data Foo = Foo
  { name :: Integer
  , value :: Value
  }

-- BAD
data Foo = Foo
  { fooName :: Integer
  , fooValue :: Value
  }

-- BAD
data FooBarBaz = Foo
  { _fbbName :: Integer
  , _fbbValue :: Value
  }
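The disambiguation techniques mentioned above can be sketched as follows. Party and Head are hypothetical types chosen only because both want a field called name; the sketch uses getField from GHC.Records and a field pun, and leaves out generic-lens to stay within base.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (getField)

-- Hypothetical types that both want a field called 'name'.
data Party = Party {name :: String, stake :: Integer}

data Head = Head {name :: String, parties :: [Party]}

-- Disambiguate with 'getField' and a type-level field label.
partyName :: Party -> String
partyName = getField @"name"

-- Disambiguate by pattern matching on the constructor with a field pun.
headName :: Head -> String
headName Head{name} = name

main :: IO ()
main = do
  let alice = Party{name = "alice", stake = 10}
      demo = Head{name = "demo", parties = [alice]}
  putStrLn (partyName alice)
  putStrLn (headName demo)
```

Using name directly as a function would be ambiguous here, which is why one of the listed techniques is needed at each use site.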

Avoid boolean blindness

Do not pass Bool as arguments to functions if it can be avoided. Alternatives are (also see examples):

Make it two functions
Create a newtype around Bool (We do want to avoid that as well though)
Create a dedicated data type

Why

Passing Bool side-steps the type system and misses the opportunity to catch errors caused by mixing up multiple Bool arguments

A dedicated type or function helps document what's going on (we can still add documentation, of course)

Examples

-- WORST
client :: Bool -- ^ Adding documentation is not enough
       -> IO ()
client pipelined =
  -- Naming ^ the argument is not enough
  pure ()

-- BAD
type IsPipelined = Bool

client :: IsPipelined -> IO ()
client pipelined =
  -- Naming ^ the argument is not enough
  pure ()

-- BAD
newtype IsPipelined = IsPipelined Bool

client :: IsPipelined -> IO ()

-- GOOD
clientSimple :: IO ()

clientPipelined :: IO ()

-- GOOD
data IsPipelined = Simple | Pipelined

client :: IsPipelined -> IO ()
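To make the "mixing multiple Bool arguments" point concrete, here is a minimal sketch with two flags. Pipelining, Verbosity and describeClient are hypothetical names introduced for illustration only.

```haskell
-- Hypothetical dedicated types replacing two Bool arguments.
data Pipelining = Simple | Pipelined
  deriving (Eq, Show)

data Verbosity = Quiet | Verbose
  deriving (Eq, Show)

-- With 'Bool -> Bool -> String', swapping the arguments would still
-- type-check. Here, 'describeClient Verbose Pipelined' is rejected by the
-- compiler.
describeClient :: Pipelining -> Verbosity -> String
describeClient pipelining verbosity =
  mode <> " client, " <> logging
 where
  mode = case pipelining of
    Simple -> "simple"
    Pipelined -> "pipelined"
  logging = case verbosity of
    Quiet -> "quiet"
    Verbose -> "verbose"

main :: IO ()
main = putStrLn (describeClient Pipelined Quiet)
```

The call site also documents itself: describeClient Pipelined Quiet reads better than describeClient True False.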

Prefer ticked constructors in type signatures

For user-defined kinds, we want to use ticked constructors in type signatures.

Why

Clearly identifies that a data constructor is being used at the type level, i.e. that its data type has been promoted to a kind.

Examples

data FireForget msg where
  StIdle :: FireForget msg
  StDone :: FireForget msg

-- BAD
fireForgetClientPeer :: Peer (FireForget msg) AsClient StIdle m a
fireForgetClientPeer = ...

-- GOOD
fireForgetClientPeer :: Peer (FireForget msg) 'AsClient 'StIdle m a
fireForgetClientPeer = ...
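The example above depends on an external Peer type, so here is a self-contained sketch of the same convention. DoorState and Door are hypothetical types used only to show where the ticks go.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}

-- Hypothetical state type, promoted to a kind by DataKinds.
data DoorState = Opened | Closed

-- The index 's' ranges over the promoted constructors 'Opened and 'Closed.
data Door (s :: DoorState) where
  OpenedDoor :: Door 'Opened
  ClosedDoor :: Door 'Closed

-- The ticks make it obvious that 'Opened and 'Closed are types here,
-- not values.
close :: Door 'Opened -> Door 'Closed
close OpenedDoor = ClosedDoor

describe :: Door s -> String
describe door = case door of
  OpenedDoor -> "opened"
  ClosedDoor -> "closed"

main :: IO ()
main = putStrLn (describe (close OpenedDoor))
```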

Use camel-case for names, including acronyms

Use camel-case for all identifiers, whether functions or types, including acronyms.

Why

Not-quite-camel-case identifiers introduce irregularity in names, which might hamper interoperability (e.g. generating snake-case identifiers from camel-case)

Examples

-- BAD
data UTXO = ...

data UTxO = ...

-- GOOD
data Utxo = ...

[PROPOSAL] Error handling

Distinguish between errors which are common and errors which are exceptional: e.g. validating something is about as likely to fail as to succeed, whereas sending a message over the network is usually expected to work.

Here are some condensed rules we try to follow:

Total functions

Try to avoid errors in the first place "by design", e.g. using smaller types like Natural or NonEmpty lists (Source)
Avoid error and its derivatives at all costs! Add a huge comment if you still need to use it.

All (pure) functions must be total, so there's no use for error or panic except as a placeholder or Test Double to remind one that there are more tests to write. The only reason why one would use panic in production code is to satisfy the completeness checker of the compiler even though we know some case is impossible, but then we should think very hard about a better solution.
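A minimal sketch of avoiding errors "by design" with smaller types, using only base. The function names are hypothetical; takeN mirrors the signature shown in the examples of this section.

```haskell
import Data.List (genericTake)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NonEmpty
import Numeric.Natural (Natural)

-- Total: the argument type rules out the empty list, unlike 'head' on
-- plain lists.
firstElement :: NonEmpty a -> a
firstElement = NonEmpty.head

-- Total: a 'Natural' count cannot be negative, so there is no invalid
-- input to reject.
takeN :: Natural -> [a] -> [a]
takeN = genericTake

-- Callers turn a plain list into a 'NonEmpty' once, at the boundary.
firstOrDefault :: a -> [a] -> a
firstOrDefault def xs = maybe def firstElement (NonEmpty.nonEmpty xs)

main :: IO ()
main = do
  print (firstElement (1 :| [2, 3 :: Int]))
  print (takeN 2 "hydra")
  print (firstOrDefault 0 ([] :: [Int]))
```

The emptiness check happens once, where the data enters the system, and every function downstream can rely on the type.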

Domain Errors

Use Either e a for expected errors (or even Maybe if there is only one error case)
Use m (Either e a) for the same reason in monadic code
We want to avoid ExceptT e IO, and thus also ExceptT in MonadIO m situations:

ExceptT is bad because:

It's inefficient
It makes code more complicated
It does not add any form of safety, as the underlying IO actions can still throw exceptions anyway

Also, make sure e is a proper error type with relevant information, possibly even with an Exception MyError instance. Also provide ToJSON and FromJSON instances so that the error's information can be included in a properly formatted log trace.

And quoting Edward Yang:

canonicalizing errors that the libraries you are interoperating [with are exposing] is a good thing: it makes you think about what information you care about and how you want to present it to the user. You can always create a MyParsecError constructor which takes the parsec error verbatim, but for a really good user experience you should be considering each case individually.
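The domain error rules could look like the following sketch. Payment, ValidationError and both functions are hypothetical; the ToJSON and FromJSON instances are left out so that the block only depends on base.

```haskell
-- Hypothetical domain types, for illustration only.
data Payment = Payment {recipient :: String, amount :: Integer}
  deriving (Eq, Show)

-- A proper error type: each constructor carries the relevant information.
data ValidationError
  = EmptyRecipient
  | NonPositiveAmount Integer
  deriving (Eq, Show)

-- Expected failures are part of the return type.
validatePayment :: Payment -> Either ValidationError Payment
validatePayment payment
  | null (recipient payment) = Left EmptyRecipient
  | amount payment <= 0 = Left (NonPositiveAmount (amount payment))
  | otherwise = Right payment

-- The monadic variant returns 'm (Either e a)' instead of using 'ExceptT'.
submitPayment :: Payment -> IO (Either ValidationError ())
submitPayment payment =
  case validatePayment payment of
    Left err -> pure (Left err)
    Right valid -> do
      putStrLn ("submitting " <> show valid)
      pure (Right ())

main :: IO ()
main = do
  print (validatePayment (Payment "alice" 0))
  result <- submitPayment (Payment "bob" 5)
  print result
```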

Exceptions

Exceptional situations may be reported using MonadThrow m and Exception instances
Add documentation about what additional exceptions may be thrown (and when) by a function.
Use safe-exceptions to avoid confusion of synchronous and asynchronous exceptions.
Use bracket or ResourceT for resource management in presence of exceptions.
Make sure you do use throwIO and not throw if you are in the IO monad, since the former guarantees ordering; the latter, not necessarily.
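A minimal sketch of these rules, assuming plain IO and Control.Exception from base rather than safe-exceptions, to keep the block self-contained. ConnectionLost, sendMessage and withConnection are hypothetical names.

```haskell
import Control.Exception (Exception, bracket, throwIO, try)

-- Hypothetical exception type for an exceptional (not expected) failure.
newtype ConnectionLost = ConnectionLost String
  deriving (Show)

instance Exception ConnectionLost

-- | Throws 'ConnectionLost' when the peer name is empty.
sendMessage :: String -> String -> IO ()
sendMessage peer msg
  | null peer = throwIO (ConnectionLost "no peer to send to")
  | otherwise = putStrLn ("-> " <> peer <> ": " <> msg)

-- 'bracket' runs the release action even if the body throws.
withConnection :: String -> (String -> IO a) -> IO a
withConnection peer =
  bracket
    (putStrLn "open connection" >> pure peer)
    (\_ -> putStrLn "close connection")

main :: IO ()
main = do
  withConnection "bob" (\peer -> sendMessage peer "hello")
  result <- try (withConnection "" (\peer -> sendMessage peer "hello"))
  case result of
    Left (ConnectionLost reason) -> putStrLn ("caught: " <> reason)
    Right () -> putStrLn "sent"
```

In the second call the connection is still closed before the exception reaches try, which is the guarantee bracket is used for.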

Examples

-- BAD
takeN :: Integer -> [a] -> [a]

-- BAD
takeN :: Integer -> [a] -> Maybe [a]

-- GOOD
takeN :: Natural -> [a] -> [a]

-- GOOD
complexAlgorithm :: Input -> Either ValidationError Output

-- GOOD
complexAlgorithm :: MonadError ValidationError m => Input -> m Output

-- BAD
complexAlgorithm :: (MonadError ValidationError m, MonadIO m) => Input -> m Output

-- BAD
myFunction :: FilePath -> ExceptT MyException IO Handle

-- BAD
myFunction :: MonadIO m => FilePath -> ExceptT MyException m Handle

-- GOOD
-- | Does throw 'MyException', when ... (and any other type of 'Exception' really)
myFunction :: (MonadThrow m, MonadIO m) => String -> m Int

References

Michael Snoyman's Exception Best Practices
Matt Parsons's post Type Safety Back and Forth
Mark Karpov's article on exceptions
Edward Yang's 8 Ways to report errors revisited, which is based on Eric Kidd's 8 ways to report errors in Haskell
Another post by Snoyman on The Use of Void to express "impossible" states

Constraints grouping

When a type declaration includes more than one constraint, use tuple notation instead of double-arrow notation.

Parentheses are not needed when there is a single constraint.

Examples

-- BAD
processEffect ::
  MonadAsync m =>
  MonadTimer m =>
  MonadThrow m =>
  HydraNode tx m ->
  Tracer m (HydraNodeLog tx) ->
  Effect tx ->
  m ()

-- GOOD
processEffect ::
  (MonadAsync m, MonadTimer m, MonadThrow m) =>
  HydraNode tx m ->
  Tracer m (HydraNodeLog tx) ->
  Effect tx ->
  m ()

-- GOOD
processEffect ::
  ( MonadAsync m
  , MonadTimer m
  , MonadThrow m
  ) =>
  HydraNode tx m ->
  Tracer m (HydraNodeLog tx) ->
  Effect tx ->
  m ()
processEffect HydraNode{hn, oc, sendResponse, eq} tracer e = do
  ...

-- GOOD
createEventQueue :: MonadSTM m => m (EventQueue m e)

[PROPOSAL] Generic Instances

For all data types which are part of an interface to a node, provide generic instances of the following typeclasses:

Eq, Show from stock deriving,
Generic for data type declarations,
ToJSON, FromJSON, ToCBOR, FromCBOR,
Arbitrary using generic-random package.

Also provide tests for (de)serialisation using hspec-aeson-golden and quickcheck-classes.

Rationale

Eq and Show are needed for test assertions and are generally useful for pattern matching, guards, predicates...
Using any kind of automatic derivation (Generic, stock, anyclass...) prevents various kinds of runtime errors like missing cases
Having an Arbitrary instance for all base types does not prevent using specialised ones, but makes it easy to generate arbitrary test data, thus improving test coverage

[PROPOSAL] Partial fields on sum types

We do allow the use of record fields on sum types as they make the code clearer and more often allow the use of Generic instances, e.g. for JSON serialization.

Hence, we do not activate the -Wpartial-fields warning for our packages.

However, this requires some discipline when using these data types: we never use field accessors as functions, and instead pattern match on the field names or use techniques like record puns. Also, the DuplicateRecordFields language extension shall be used to allow re-use of field names.

Rationale

Field names can be self-documenting
Generic instances get better and are more often sufficient
We have never been bitten by the problem

Examples

-- BAD
data Effect tx
  = ClientEffect (ServerOutput tx)
  | NetworkEffect (Message tx)
  | OnChainEffect (OnChainTx tx)
  | Delay DiffTime (Event tx)
  deriving stock (Generic)
  deriving anyclass (ToJSON, FromJSON)

-- GOOD
data Effect tx
  = ClientEffect {serverOutput :: ServerOutput tx}
  | NetworkEffect {message :: Message tx}
  | OnChainEffect {onChainTx :: OnChainTx tx}
  | Delay {delay :: DiffTime, event :: Event tx}
  deriving stock (Generic)
  deriving anyclass (ToJSON, FromJSON)
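The discipline described above (no field accessors as functions, pattern matching with puns instead) can be sketched like this. The Effect type here is a simplified, hypothetical version with String payloads so that the block compiles on its own.

```haskell
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE NamedFieldPuns #-}

-- Simplified, hypothetical version of the 'Effect' type above.
data Effect
  = ClientEffect {serverOutput :: String}
  | NetworkEffect {message :: String}
  | Delay {delay :: Int, event :: String}

-- BAD (not compiled here): 'message effect' type-checks but throws at
-- runtime for any constructor other than 'NetworkEffect'.

-- GOOD: pattern match on the constructor and bind fields with puns.
describe :: Effect -> String
describe = \case
  ClientEffect{serverOutput} -> "client: " <> serverOutput
  NetworkEffect{message} -> "network: " <> message
  Delay{delay, event} -> "delay " <> show delay <> "s: " <> event

main :: IO ()
main = mapM_ (putStrLn . describe) [NetworkEffect "ReqTx", Delay 2 "retry"]
```

Because every field is bound inside a match on its own constructor, the partial accessor functions are never called.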

[PROPOSAL] Favor Lambda-case

When pattern-matching on the last argument of a function, use Lambda-case expressions instead of multiple equations.

Rationale

Especially when the number of cases is large, e.g. more than a handful, repeating a function's name and arguments obscures the intention of the code.
When not all arguments are used in all branches of the pattern, the compiler requires wildcards for unused names, which decreases the readability of the code.

Examples

-- BAD
monitor metricsMap (Node (ProcessingEvent _ (NetworkEvent (ReqTx _ tx)))) = do
  ...
monitor metricsMap (Node (ProcessedEffect _ (ClientEffect (SnapshotConfirmed snapshot)))) = do
  ...
monitor metricsMap (Node (ProcessedEvent _ _)) =
  ...
monitor _ _ = pure ()

-- GOOD
monitor metricsMap = \case
  (Node (ProcessingEvent _ (NetworkEvent (ReqTx _ tx)))) -> do
    ...
  (Node (ProcessedEffect _ (ClientEffect (SnapshotConfirmed snapshot)))) -> do
    ...
  (Node (ProcessedEvent _ _)) ->
    ...
  _ -> pure ()
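Since the example above elides its bodies, here is a small runnable sketch of the same shape. NodeLog and render are hypothetical names, not taken from the code base.

```haskell
{-# LANGUAGE LambdaCase #-}

-- Hypothetical log type, for illustration only.
data NodeLog
  = ProcessingEvent String
  | ProcessedEvent String
  | ProcessedEffect String
  | Other

-- The function name and the shared 'prefix' argument appear once; only the
-- last argument is matched.
render :: String -> NodeLog -> String
render prefix = \case
  ProcessingEvent e -> prefix <> "processing " <> e
  ProcessedEvent e -> prefix <> "processed " <> e
  ProcessedEffect e -> prefix <> "effect " <> e
  Other -> prefix <> "other"

main :: IO ()
main = putStrLn (render "[node] " (ProcessedEvent "ReqTx"))
```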
